idnits 2.17.1

draft-bagnulo-behave-nat64-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 19.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 1399.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1410.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1417.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1423.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (September 19, 2008) is 5691 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'I-D.ietf-mmusic-ice' is defined on line 1347, but no
     explicit reference was found in the text

  == Unused Reference: 'RFC3498' is defined on line 1353, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2671 (Obsoleted by RFC 6891)

  ** Obsolete normative reference: RFC 2765 (Obsoleted by RFC 6145)

  == Outdated reference: A later version (-12) exists of
     draft-ietf-behave-nat-icmp-08

  -- Obsolete informational reference (is this intentional?): RFC 2766
     (Obsoleted by RFC 4966)

     Summary: 3 errors (**), 0 flaws (~~), 4 warnings (==), 8 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

BEHAVE WG                                                     M. Bagnulo
Internet-Draft                                                      UC3M
Intended status: Standards Track                             P. Matthews
Expires: March 23, 2009                                     Unaffiliated
                                                          I. van Beijnum
                                                          IMDEA Networks
                                                      September 19, 2008

NAT64/DNS64: Network Address and Protocol Translation from IPv6 Clients
                            to IPv4 Servers
                     draft-bagnulo-behave-nat64-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on March 23, 2009.

Abstract

   NAT64 is a mechanism for translating IPv6 packets to IPv4 packets and
   vice-versa.  DNS64 is a mechanism for synthesizing AAAA records from
   A records.  These two mechanisms together enable client-server
   communication between an IPv6-only client and an IPv4-only server,
   without requiring any changes to either the IPv6 or the IPv4 node,
   for the class of applications that work through NATs.  They also
   enable peer-to-peer communication between an IPv4 and an IPv6 node,
   where the communication can be initiated by either end using
   existing, NAT-traversing, peer-to-peer communication techniques.
   This document specifies NAT64 and DNS64, and gives suggestions on how
   they should be deployed.

Table of Contents

   1.  Introduction
     1.1.  Features of NAT64
     1.2.  Overview
       1.2.1.  NAT64 solution elements
       1.2.2.  Walkthrough
       1.2.3.  Dual stack nodes
       1.2.4.  IPv6 nodes implementing DNSSEC
       1.2.5.  Filtering
   2.  Terminology
   3.  Normative Specification
     3.1.  Synthetic AAAA RRs
     3.2.  The EDNS SAS option
     3.3.  DNS64
     3.4.  NAT64
       3.4.1.  Determining the Incoming 5-tuple
       3.4.2.  Filtering and Updating Session Information
         3.4.2.1.  UDP Session Handling
         3.4.2.2.  TCP Session Handling
       3.4.3.  Computing the Outgoing 5-Tuple
       3.4.4.  Translating the Packet
       3.4.5.  Handling Hairpinning
     3.5.  FTP ALG
   4.  Application scenarios
     4.1.  Enterprise IPv6 only network
     4.2.  Reaching servers in private IPv4 space
   5.  Discussion
     5.1.  About the Prefix used to map the IPv4 address space
           into IPv6
   6.  Security Considerations
   7.  IANA Considerations
   8.  Changes from Previous Draft Versions
   9.  Contributors
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   This document specifies NAT64 and DNS64, two mechanisms for IPv6-IPv4
   transition and co-existence.
   Together, these two mechanisms allow an
   IPv6-only client to initiate communications to an IPv4-only server,
   and also allow peer-to-peer communication between IPv6-only and
   IPv4-only hosts.

   NAT64 is a mechanism for translating IPv6 packets to IPv4 packets.
   The translation is done by translating the packet headers according
   to SIIT [RFC2765], translating the IPv4 server address by adding or
   removing a /96 prefix, and translating the IPv6 client address by
   installing mappings in the normal NAT manner.

   DNS64 is a mechanism for synthesizing AAAA resource records (RRs)
   from A RRs.  The synthesis is done by adding a /96 prefix to the IPv4
   address to create an IPv6 address, where the /96 prefix is assigned
   to a NAT64 device.

   Together, these two mechanisms allow an IPv6-only client to initiate
   communications to an IPv4-only server.

   These mechanisms are expected to play a critical role in the
   IPv4-IPv6 transition and co-existence.  Due to IPv4 address
   depletion, it is likely that many IPv6-only clients will in future
   want to connect to IPv4-only servers.  The NAT64 and DNS64 mechanisms
   are easily deployable, since they require no changes to either the
   IPv6 client or the IPv4 server.  For basic functionality, the
   approach only requires the deployment of NAT64-enabled devices
   connecting an IPv6-only network to the IPv4-only Internet, along with
   the deployment of a few DNS64-enabled name servers in the IPv6-only
   network.  However, some advanced features require software updates to
   the IPv6-only hosts.

   The NAT64 and DNS64 mechanisms are related to the NAT-PT mechanism
   defined in [RFC2766], but significant differences exist.
   First,
   NAT64 does not define the NAT-PT mechanisms that allow IPv6-only
   servers to be contacted by IPv4-only clients; it only defines the
   mechanisms for IPv6 clients to contact IPv4 servers, and their
   potential reuse to support peer-to-peer communication through
   standard NAT traversal techniques.  Second, NAT64 includes a set of
   features that overcome many of the reasons the original NAT-PT
   specification was moved to historic status [RFC4966].

1.1.  Features of NAT64

   The features of NAT64 and DNS64 are:

   o  It enables IPv6-only nodes to initiate a client-server connection
      with an IPv4-only server, without needing any changes on either
      IPv4 or IPv6 nodes.  This works for the same class of applications
      that work through IPv4-to-IPv4 NATs.

   o  It supports peer-to-peer communication between IPv4 and IPv6
      nodes, including the ability for IPv4 nodes to initiate
      communication with IPv6 nodes using peer-to-peer techniques (i.e.,
      using a rendezvous server and ICE).  To this end, NAT64 is
      compliant with the recommendations for how NATs should handle UDP
      [RFC4787], TCP [I-D.ietf-behave-tcp], and ICMP
      [I-D.ietf-behave-nat-icmp].

   o  Compatible with ICE.

   o  Supports additional features with some changes on nodes.  These
      features include:

      *  Support for DNSSEC

      *  Some forms of IPsec support

      *  Increased ability to detect when there is a communication path
         that does not involve translating between IPv6 and IPv4.  This
         is achieved by marking synthetic DNS AAAA resource records
         whose use would result in translated connectivity, so that the
         sender can prefer non-synthetic records when possible.

1.2.  Overview

   This section provides a non-normative introduction to the mechanisms
   of NAT64 and DNS64.

   The NAT64 mechanism is implemented in a NAT64 box which has two
   interfaces, an IPv4 interface connected to the IPv4 network, and
   an IPv6 interface connected to the IPv6 network.  Packets generated
   in the IPv6 network for a receiver located in the IPv4 network will
   be routed within the IPv6 network towards the NAT64 box.  The NAT64
   box will translate them and forward them as IPv4 packets through the
   IPv4 network to the IPv4 receiver.  The reverse takes place for
   packets generated in the IPv4 network for an IPv6 receiver.  NAT64,
   however, is not symmetric.  In order to perform IPv6-IPv4
   translation, NAT64 requires state that binds an IPv6 address and
   port (hereafter called an IPv6 transport address) to an IPv4 address
   and port (hereafter called an IPv4 transport address).

   Such binding state is created when the first packet flowing from the
   IPv6 network to the IPv4 network is translated.  After the binding
   state has been created, packets flowing in either direction on that
   particular flow are translated.  The result is that NAT64 only
   supports communications initiated by the IPv6-only node towards an
   IPv4-only node.  Some additional mechanisms, like ICE, can be used in
   combination with NAT64 to provide support for communications
   initiated by the IPv4-only node to the IPv6-only node.  The
   specification of such mechanisms, however, is out of the scope of
   this document.

1.2.1.  NAT64 solution elements

   In this section we describe the different elements involved in the
   NAT64 approach.

   The main component of the proposed solution is the translator itself.
   The translator has two main parts: the address translation mechanism
   and the protocol translation mechanism.

   Protocol translation from IPv4 packet header to IPv6 packet header
   and vice-versa is performed according to SIIT [RFC2765].
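
   The binding state described in the overview (created by the first
   IPv6-to-IPv4 packet, then used in both directions) can be
   illustrated with a small sketch.  This is not part of the
   specification: the class name BindingTable, the sequential port
   allocation starting at 1024, and the example addresses are
   assumptions made for illustration, and binding timers are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

TransportAddr = Tuple[str, int]  # (IP address, port)


@dataclass
class BindingTable:
    """Toy binding state for one NAT64 IPv4 address (hypothetical helper)."""
    ipv4_addr: str
    next_port: int = 1024  # assumption: naive sequential port allocation
    v6_to_v4: Dict[TransportAddr, TransportAddr] = field(default_factory=dict)
    v4_to_v6: Dict[TransportAddr, TransportAddr] = field(default_factory=dict)

    def outbound(self, v6_src: TransportAddr) -> TransportAddr:
        """IPv6 -> IPv4: reuse the binding, or create it on the first packet."""
        if v6_src not in self.v6_to_v4:
            if self.next_port > 65535:
                raise RuntimeError("IPv4 port pool exhausted")
            v4 = (self.ipv4_addr, self.next_port)
            self.next_port += 1
            self.v6_to_v4[v6_src] = v4
            self.v4_to_v6[v4] = v6_src
        return self.v6_to_v4[v6_src]

    def inbound(self, v4_dst: TransportAddr) -> Optional[TransportAddr]:
        """IPv4 -> IPv6: translate only if a binding already exists."""
        return self.v4_to_v6.get(v4_dst)


table = BindingTable("192.0.2.1")
# No binding yet, so a packet arriving from the IPv4 side cannot be mapped.
assert table.inbound(("192.0.2.1", 1024)) is None
# The first IPv6 packet creates the binding; replies can then be mapped back.
mapped = table.outbound(("2001:db8::10", 40000))
assert table.inbound(mapped) == ("2001:db8::10", 40000)
```

   The asymmetry is visible in the two methods: only outbound() can
   create state, which is why communication has to be initiated from
   the IPv6 side.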

   Address translation maps IPv6 transport addresses to IPv4 transport
   addresses and vice-versa.  In order to create these mappings, the
   NAT64 box has two pools of addresses: an IPv6 address pool (to
   represent IPv4 addresses in the IPv6 network) and an IPv4 address
   pool (to represent IPv6 addresses in the IPv4 network).  Since there
   is enough IPv6 address space, it is possible to map every IPv4
   address into a different IPv6 address.

   NAT64 creates the required mappings by using as the IPv6 address pool
   a /96 IPv6 prefix (hereafter called Pref64::/96).  This allows each
   IPv4 address to be mapped into a different IPv6 address by simply
   concatenating the /96 prefix assigned as the IPv6 address pool of the
   NAT64 with the IPv4 address being mapped (i.e., an IPv4 address X is
   mapped into the IPv6 address Pref64:X).  The NAT64 prefix Pref64::/96
   is assigned by the administrator of the NAT64 box from the global
   unicast IPv6 address block assigned to the site.  It should be noted
   that the prefix used as the IPv6 address pool is assigned to a
   specific NAT64 box, and if there are multiple NAT64 boxes, each box
   is allocated a different prefix.  Assigning the same prefix to
   multiple boxes may lead to communication failures due to internal
   routing fluctuations.

   The IPv4 address pool, however, is a set of IPv4 addresses, normally
   a small prefix assigned by the local administrator to the NAT64's
   external (IPv4) interface.  Since IPv4 address space is a scarce
   resource, the IPv4 address pool is small and typically not sufficient
   to establish permanent one-to-one mappings with IPv6 addresses.  So,
   mappings using the IPv4 address pool will be created and released
   dynamically.
   Moreover, because of the IPv4 address scarcity, the
   usual practice for NAT64 is likely to be the mapping of IPv6
   transport addresses into IPv4 transport addresses, instead of IPv6
   addresses into IPv4 addresses directly, which enables a higher
   utilization of the limited IPv4 address pool.

   Because the IPv6-to-IPv4 address mapping is dynamic and the
   IPv4-to-IPv6 address mapping is static, it is far simpler to allow
   communication initiated from the IPv6 side toward an IPv4 node, whose
   address is permanently mapped into an IPv6 address, than
   communication initiated from an IPv4-only node toward an IPv6 node,
   in which case an IPv4 address needs to be associated with the IPv6
   node dynamically.  For this reason NAT64 supports only communications
   initiated from the IPv6 side.

   An IPv6 initiator can know or derive in advance the IPv6 address
   representing the IPv4 target and send packets to that address.  The
   packets are intercepted by the NAT64 device, which associates an IPv4
   transport address from its IPv4 pool with the IPv6 transport address
   of the initiator, creating binding state, so that reply packets can
   be translated and forwarded back to the initiator.  The binding state
   is kept while packets are flowing.  Once the flow stops, and based on
   a timer, the IPv4 transport address is returned to the IPv4 address
   pool so that it can be reused for other communications.

   To allow an IPv6 initiator to do the standard DNS lookup to learn the
   address of the responder, DNS64 is used to synthesize an AAAA record
   (pronounced "quad-A" and containing an IPv6 address) from the A
   record (containing the real IPv4 address of the responder).  DNS64
   receives the DNS queries generated by the IPv6 initiator.
   If there
   is no AAAA record available for the target node (which is the normal
   case when the target node is an IPv4-only node), DNS64 performs a
   query for the A record.  If an A record is discovered, DNS64 creates
   a synthetic AAAA RR by adding the Pref64::/96 of a NAT64 to the
   responder's IPv4 address (i.e., if the IPv4 node has IPv4 address X,
   then the synthetic AAAA RR will contain the IPv6 address formed as
   Pref64:X).  The synthetic AAAA RR is passed back to the IPv6
   initiator, which will initiate an IPv6 communication with the IPv6
   address associated with the IPv4 receiver.  The packet will be routed
   to the NAT64 device, which will create the IPv6 to IPv4 address
   mapping as described before.

   Having DNS synthesize AAAA records creates a number of problems, as
   described in [RFC4966]:

   o  The synthesized AAAA records may leak outside their intended
      scope;

   o  Dual-stack hosts may communicate with IPv4-only servers using IPv6
      which is then translated to IPv4, rather than using their IPv4
      connectivity;

   o  The IPv6-only hosts will be unable to use DNSSEC to verify the
      legitimacy of the synthetic AAAA records.

   In order to avoid these issues, responses containing synthesized
   addresses are tagged with an Extended DNS [RFC2671] option defined in
   this document, called the SAS option, so the AAAA records can be
   recognized as synthetic.  This allows caching nameservers, dual stack
   nodes and nodes implementing DNSSEC to ignore synthetic addresses and
   perform an additional request for the original address records.

1.2.2.  Walkthrough

   In this example, we consider an IPv6 node located in an IPv6-only
   site that initiates a communication to an IPv4 node located in the
   IPv4 Internet.
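
   The DNS64 part of this exchange follows the lookup order described
   in Section 1.2.1: real AAAA records are returned unchanged, and
   only when none exist are A records fetched and prefixed.  A minimal
   sketch of that decision is shown here.  The function name
   dns64_resolve, the lookup callback, the example zone data and the
   prefix 2001:db8:64::/96 are all assumptions made for illustration;
   the sketch ignores error handling, TTLs and the SAS option encoding.

```python
import ipaddress
from typing import Callable, List, Tuple

# Hypothetical resolver callback: (name, rrtype) -> list of address strings.
Lookup = Callable[[str, str], List[str]]


def dns64_resolve(name: str, lookup: Lookup,
                  pref64: str = "2001:db8:64::/96") -> Tuple[List[str], bool]:
    """Return (AAAA addresses, synthetic flag) for `name`."""
    real = lookup(name, "AAAA")
    if real:
        return real, False  # real AAAA records are passed through untouched
    net = ipaddress.IPv6Network(pref64)
    if net.prefixlen != 96:
        raise ValueError("Pref64 must be a /96 prefix")
    base = int(net.network_address)
    # Pref64:X, i.e. the prefix in the upper 96 bits and X in the lower 32.
    synth = [str(ipaddress.IPv6Address(base | int(ipaddress.IPv4Address(a))))
             for a in lookup(name, "A")]
    return synth, bool(synth)  # the flag is what the SAS option would signal


# Example zone data (assumed): h2 is IPv4-only, h3 has a real AAAA record.
zone = {("h2.example", "A"): ["192.0.2.33"],
        ("h3.example", "AAAA"): ["2001:db8::3"]}


def lookup(name: str, rrtype: str) -> List[str]:
    return zone.get((name, rrtype), [])


print(dns64_resolve("h2.example", lookup))
print(dns64_resolve("h3.example", lookup))
```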

   The notation used is the following: upper case letters are IPv4
   addresses; upper case letters with a prime (') are IPv6 addresses;
   lower case letters are ports; prefixes are indicated by "P::X", which
   is an IPv6 address built from an IPv4 address X by adding the prefix
   P; mappings are indicated as "(X,x) <--> (Y',y)".

   The scenario for this case is depicted in the following figure:

   +---------------------------------------+        +-----------+
   |IPv6 site        +-------------+       |        |           |
   |  +----+         | Name server |   +-------+    |   IPv4    |
   |  | H1 |         | with DNS64  |   | NAT64 |----| Internet  |
   |  +----+         +-------------+   +-------+    +-----------+
   |    |IP addr: Y'        |            | |              |IP addr: X
   |  ------------------------------------ |            +----+
   +---------------------------------------+            | H2 |
                                                        +----+

   The figure shows an IPv6 node H1 which has an IPv6 address Y' and an
   IPv4 node H2 with IPv4 address X.

   A NAT64 connects the IPv6 network to the IPv4 Internet.  This NAT64
   has a /96 prefix (called Pref64::/96) associated with its IPv6
   interface and an IPv4 address T assigned to its IPv4 interface.

   Also shown is a local name server with DNS64 functionality.  For the
   purpose of this example, we assume that the name server is a
   dual-stack node, so that H1 can contact it via IPv6, while it can
   contact IPv4-only name servers via IPv4.

   The local name server needs to know the /96 prefix assigned to the
   local NAT64 (Pref64::/96).  For the purpose of this example, we
   assume it learns this through manual configuration.

   For this example, assume the typical DNS situation where IPv6 hosts
   have only stub resolvers and the local name server does the recursive
   lookups.

   The steps by which H1 establishes communication with H2 are:

   1.  H1 does a DNS lookup for the IPv6 address of H2.  H1 does this by
       sending a DNS query for an AAAA record for H2 to the local name
       server.
       Assume the local name server is implementing DNS64
       functionality.

   2.  The local DNS server resolves the query, and discovers that there
       are no AAAA records for H2.

   3.  The name server queries for an A record for H2 and gets back an A
       record containing the IPv4 address X.  The name server then
       synthesizes an AAAA record.  The IPv6 address in the AAAA record
       contains the prefix assigned to the NAT64 in the first 96 bits
       and the IPv4 address X in the lower 32 bits.

   4.  The name server sends a response back to H1.  If H1 has
       indicated, in its query, that it supports EDNS0, then the name
       server will use the SAS option to indicate that the AAAA record
       is synthetic.

   5.  H1 receives the synthetic AAAA record and sends a packet towards
       H2.  The packet is sent from a source transport address of (Y',y)
       to a destination transport address of (Pref64:X,x), where y and x
       are ports chosen by H1.

   6.  The packet is routed to the IPv6 interface of the NAT64 (since
       Pref64::/96 has been associated with this interface).

   7.  The NAT64 receives the packet and performs the following actions:

       *  The NAT64 selects an unused port t on its IPv4 address T and
          creates the mapping entry (Y',y) <--> (T,t)

       *  The NAT64 translates the IPv6 header into an IPv4 header using
          SIIT.

       *  The NAT64 includes (T,t) as source transport address in the
          packet and (X,x) as destination transport address in the
          packet.  Note that X is extracted directly from the lower 32
          bits of the destination IPv6 address of the received IPv6
          packet that is being translated.

       The NAT64 sends the translated packet out its IPv4 interface and
       the packet arrives at H2.

   8.  H2 responds by sending a packet with destination transport
       address (T,t) and source transport address (X,x).

   9.  The packet is routed to the NAT64 box, which will look for an
       existing mapping containing (T,t).
       Since the mapping (Y',y) <-->
       (T,t) exists, the NAT64 performs the following operations:

       *  The NAT64 translates the IPv4 header into an IPv6 header using
          SIIT.

       *  The NAT64 includes (Pref64:X,x) as source transport address in
          the packet and (Y',y) as destination transport address in the
          packet.  Note that X is extracted directly from the source
          IPv4 address of the received IPv4 packet that is being
          translated.

       The translated packet is sent out the IPv6 interface to H1.

   The packet exchange between H1 and H2 continues and packets are
   translated in the different directions as previously described.

   It is important to note that the translation still works if the IPv6
   initiator H1 learns the IPv4 address through some scheme other than a
   DNS look-up.  This is because the DNS64 processing does NOT result in
   any state installed in the NAT64 box and because the mapping of the
   IPv4 address into an IPv6 address is the result of concatenating the
   prefix defined within the site for this purpose (called Pref64::/96
   in this document) to the original IPv4 address.

1.2.3.  Dual stack nodes

   Nodes that have both IPv6 and IPv4 connectivity and are configured
   with an address for a DNS64 as their resolving nameserver may receive
   responses containing synthetic AAAA resource records.  If the node
   prefers IPv6 over IPv4, using the addresses in the synthetic AAAA RRs
   means that the node will attempt to communicate through the NAT64
   mechanism first, and only fall back to native IPv4 connectivity if
   connecting through NAT64 fails (if the application tries the full set
   of destination addresses).  To avoid this, dual stack nodes can
   ignore all replies to DNS requests that contain the EDNS SAS option,
   and use the destination addresses found in the responses for A
   resource record requests instead.

1.2.4.  IPv6 nodes implementing DNSSEC

   Synthesizing resource records is incompatible with DNSSEC.  So, like
   dual stack nodes, IPv6 nodes implementing DNSSEC must not use address
   records that the EDNS SAS option marks as synthetic.  In this case,
   the node should perform the DNSSEC validation on the original A RR
   and then locally synthesize the AAAA RR.  This means that the DNS64
   functionality should be implemented in the local host for those hosts
   that want to be able to perform DNSSEC validation.  In order to do
   that, hosts implementing DNS64 functionality should be able to
   discover the Pref64::/96 prefix that is needed to synthesize AAAA
   RRs.  The means used to discover the prefix are out of the scope of
   this document.  So, for the purposes of DNSSEC, the synthetic
   response doesn't exist: an IPv6 node implementing DNSSEC has to
   request the original A resource records and perform the normal DNSSEC
   validation steps.  When this is done, an IPv6 address is synthesized
   locally from the validated IPv4 address and the translator's /96
   prefix.

1.2.5.  Filtering

   A NAT64 box may do filtering, which means that it only allows a
   packet in through an interface if the appropriate permission exists.
   A NAT64 may do no filtering, or it may filter on its IPv4 interface.
   Filtering on the IPv6 interface is not supported, as mappings are
   only created by packets traveling in the IPv6 --> IPv4 direction.

   If a NAT64 filters on its IPv4 interface, then an incoming packet is
   dropped unless a packet has been recently sent out the interface with
   a destination IP address equal to the source IP address of the
   incoming packet.

   NAT64 filtering is consistent with the recommendations of RFC 4787.

2.  Terminology

   This section provides a definitive reference for all the terms used
   in this document.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   The following terms are used in this document:

   DNS64:  A logical function that synthesizes AAAA records (containing
      IPv6 addresses) from A records (containing IPv4 addresses).

   Synthetic RR:  A DNS resource record (RR) that is not contained in
      any zone data file, but has been synthesized from other RRs.  An
      example is a synthetic AAAA record created from an A record.

   SAS Option:  An Extended DNS (EDNS) option used in DNS responses.
      Its primary purpose is to indicate that the set of AAAA RRs
      contained in a DNS response is synthetic.

   NAT64:  A device that translates IPv6 packets to IPv4 packets and
      vice-versa, with the provision that the communication must be
      initiated from the IPv6 side.  The translation involves not only
      the IP header, but also the transport header (TCP or UDP).

   Session:  A TCP or UDP session.  In other words, the bi-directional
      flow of packets between two ports on two different hosts.  In
      NAT64, typically one host is an IPv4 host, and the other one is an
      IPv6 host.

   5-Tuple:  The tuple (source IP address, source port, destination IP
      address, destination port, transport protocol).  A 5-tuple
      uniquely identifies a session.  When a session flows through a
      NAT64, each session has two different 5-tuples: one with IPv4
      addresses and one with IPv6 addresses.

   Session table:  A table of sessions kept by a NAT64.  Each NAT64 has
      two session tables, one for TCP and one for UDP.

   Transport Address:  The combination of an IPv6 or IPv4 address and a
      port.  Typically written as (IP address, port); e.g. (192.0.2.15,
      8001).

   Mapping:  A mapping between an IPv6 transport address and an IPv4
      transport address.
      Used to translate the addresses and ports of
      packets flowing between the IPv6 host and the IPv4 host.  In
      NAT64, the IPv4 transport address is always a transport address
      assigned to the NAT64 itself, while the IPv6 transport address
      belongs to some IPv6 host.

   BIB:  Binding Information Base.  A table of mappings kept by a NAT64.
      Each NAT64 has two BIBs, one for TCP and one for UDP.

   Endpoint-Independent Mapping:  In NAT64, using the same mapping for
      all sessions from an IPv6 host that have the same IPv6 transport
      address as their endpoint.  Endpoint-independent mapping is
      important for peer-to-peer communication.  See [RFC4787] for the
      definition of the different types of mappings in IPv4-to-IPv4
      NATs.

   Hairpinning:  Having a packet do a "U-turn" inside a NAT and come
      back out the same interface as it arrived on.  Hairpinning support
      is important for peer-to-peer applications, as there are cases
      when two different hosts on the same side of a NAT can only
      communicate using sessions that hairpin through the NAT.

   For a detailed understanding of this document, the reader should also
   be familiar with DNS terminology [RFC1035] and current NAT
   terminology [RFC4787].

3.  Normative Specification

3.1.  Synthetic AAAA RRs

   A synthetic RR is an RR that does not appear in the master zone
   file.

   The rules on the usage of synthetic AAAA RRs are:

      Synthetic AAAA RRs MAY be included in the answer section of a
      response.

      Synthetic AAAA RRs MUST NOT be included in sections other than the
      answer section.

      A synthetic AAAA RR MUST NOT be included if the responder knows of
      at least one non-synthetic RR of the same type and class.

      If a synthetic AAAA RR is included in the answer section, then all
      RRs included in the answer section MUST be synthetic.

      If a synthetic AAAA RR is _not_ explicitly marked as synthetic
      (using the SAS option), then its TTL MUST be 0.

      If a synthetic AAAA RR is explicitly marked as synthetic (using
      the SAS option), then its TTL SHOULD be 0.

   TBD: Can/should the AA bit be set in a response containing synthetic
   RRs?

   TBD: Do we always want synthetic RRs to have a TTL of 0?  Is it ever
   reasonable or desirable to cache them?

3.2.  The EDNS SAS option

   EDNS [RFC2671] defines a mechanism to add options to the DNS
   [RFC1035] protocol.  This section defines the SAS (Status of Answer
   Section) option that indicates the status (real or synthetic) of RRs
   in the answer section.

   The format of the SAS option is:

     0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   |                          OPTION-CODE                          |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   |                         OPTION-LENGTH                         |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   |                                                               |
   /                          OPTION-DATA                          /
   |                                                               |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

   The fields are defined as follows:

   o  OPTION-CODE: (to be allocated by IANA)

   o  OPTION-LENGTH: the size (in octets) of the OPTION-DATA part of the
      option

   o  OPTION-DATA: variable length field.  No values for this field are
      defined by this document.

   For any OPTION-DATA defined in the future, the maximum length of the
   OPTION-DATA field in the SAS option is 12 bytes, and any SAS option
   with an OPTION-LENGTH of more than 8 SHOULD be silently ignored.

   The rules on the usage of the SAS option are:

      A requestor that understands the SAS option SHOULD include the OPT
      RR in all queries.

      A responder can include the SAS option in a response only if the
      OPT RR appeared in the corresponding query.

      Any options not understood or not meaningful in the current
      context MUST be ignored.

      A responder MUST include the SAS option in the response if it
      knows that all the RRs in the answer section are synthetic.
   The presence of the OPT RR in a query indicates that the requestor understands the OPT extension.

3.3. DNS64

A DNS64 is a logical function that synthesizes AAAA records from A records. The DNS64 function may be implemented in a resolver, in a local recursive name server, or in some other device such as a NAT64.

The only configuration parameter required by the DNS64 is the /96 IPv6 prefix assigned to a NAT64. This prefix is used to map IPv4 addresses into IPv6 addresses, and is denoted Pref64::/96. The DNS64 learns this prefix through some means not specified here.

When the DNS64 receives a query for RRs of type AAAA and class IN, it first attempts to retrieve non-synthetic RRs of this type and class (where "non-synthetic RRs" means RRs not explicitly marked as synthetic). If this query results in one or more AAAA records or in an error condition, this result is returned to the client as per normal DNS semantics. If the query is successful, but doesn't return any answers, the DNS64 resolver executes a recursive A RR lookup for the name in question. If this query results in an empty result or in an error, this result is returned to the client. If the query results in one or more A RRs, the DNS64 synthesizes AAAA RRs based on the A RRs and the /96 prefix of the translator. The synthetic AAAA RRs get a TTL of 0 seconds. The DNS64 resolver then returns the synthesized AAAA records to the client. If the client included the EDNS0 OPT RR in the query, the DNS64 resolver MUST include an EDNS0 OPT RR that contains the SAS option. When synthesizing the answer to a query for ANY, the DNS64 MUST include the A records from which the AAAA records were synthesized.

To ensure endpoint-independent mapping behavior, a given IPv6 host must always use the same NAT64.
This, in turn, means that any synthetic AAAA records used by the host must always use the same prefix. To ensure this, if a DNS64 has multiple Pref64::/96 prefixes configured, it SHOULD ensure that the same prefix is used for all AAAA records returned to a given host across all queries. A reasonable exception would be when the DNS64 knows, through some unspecified means, that the NAT64 associated with a Pref64::/96 prefix is no longer functional.

Furthermore, it is highly desirable to synthesize the AAAA records as close as possible to the host that will use them. This helps ensure that a given host always uses the same NAT64.

The DNS64 MUST obey the rules for synthetic RRs (Section 3.1) and the SAS option (Section 3.2).

A synthetic AAAA record is created from an A record as follows:

o  The NAME field is set to the NAME field from the A record

o  The TYPE field is set to 28 (AAAA)

o  The CLASS field is set to 1 (IN)

o  The TTL field is set as described in Section 3.1

o  The RDLENGTH field is set to 16

o  The RDATA field is set to the IPv6 address whose upper 96 bits are Pref64::/96 and whose lower 32 bits are the IPv4 address from the RDATA field of the A record.

TBD: What does a DNS64 do when a query for an A record returns a CNAME record and an A record? The SAS option, as currently defined, flags ALL records in the answer section as synthetic. Does the DNS64 return just a CNAME record? Does it return just an AAAA record? Or does it return a real CNAME record and a synthetic AAAA record in the answer section -- something that the current rules do not allow.

3.4. NAT64

A NAT64 is a device with one IPv6 interface and one IPv4 interface. The IPv6 interface MUST have a unicast /96 IPv6 prefix assigned to it, denoted Pref64::/96. The IPv4 interface MUST have one or more unicast IPv4 addresses assigned to it.
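The Pref64::/96 mapping is shared by the DNS64 (which embeds an IPv4 address when it builds the RDATA of a synthetic AAAA RR, Section 3.3) and by the NAT64 (which later recovers that IPv4 address). The following is a minimal, non-normative sketch of that mapping. The prefix value 2001:db8:64::/96 and the names embed_ipv4, extract_ipv4 and synthesize_aaaa are assumptions made for illustration only; they are not defined by this document.

```python
import ipaddress

# Hypothetical example prefix (documentation space); a real deployment
# would configure its own /96.
PREF64 = ipaddress.IPv6Network("2001:db8:64::/96")


def embed_ipv4(prefix, ipv4):
    """Return the IPv6 address whose upper 96 bits are the prefix and
    whose lower 32 bits are the given IPv4 address."""
    if prefix.prefixlen != 96:
        raise ValueError("Pref64 must be a /96")
    return ipaddress.IPv6Address(int(prefix.network_address) | int(ipv4))


def extract_ipv4(prefix, ipv6):
    """Inverse mapping: return the embedded IPv4 address, or None if
    the IPv6 address is not covered by the prefix."""
    if ipv6 not in prefix:
        return None
    return ipaddress.IPv4Address(int(ipv6) & 0xFFFFFFFF)


def synthesize_aaaa(name, a_rdata, prefix=PREF64):
    """Build the fields of a synthetic AAAA RR from the NAME and RDATA
    of an A RR, following the field list in Section 3.3."""
    addr = embed_ipv4(prefix, ipaddress.IPv4Address(a_rdata))
    return {
        "name": name,         # copied from the A record
        "type": 28,           # AAAA
        "class": 1,           # IN
        "ttl": 0,             # see Section 3.1
        "rdlength": 16,
        "rdata": addr.packed  # 16 octets: Pref64 followed by the IPv4 address
    }
```

For example, synthesize_aaaa("www.example.com", "192.0.2.33") would carry the address 2001:db8:64::c000:221 in its RDATA, and extract_ipv4 applied to that address yields 192.0.2.33 again without any per-address state.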
A NAT64 uses the following dynamic data structures:

o  UDP BIB

o  UDP Session Table

o  TCP BIB

o  TCP Session Table

A NAT64 has two Binding Information Bases: one for TCP and one for UDP. Each BIB entry specifies a mapping between an IPv6 transport address and an IPv4 transport address:

   (X',x) <--> (T,t)

where X' is some IPv6 address, T is an IPv4 address, and x and t are ports. T will always be one of the IPv4 addresses assigned to the IPv4 interface of the NAT64. A given IPv6 or IPv4 transport address can appear in at most one entry in a BIB: for example, (2001:db8::17, 4) can appear in at most one TCP and at most one UDP BIB entry. TCP and UDP have separate BIBs because the port number spaces for TCP and UDP are distinct.

A NAT64 also has two session tables: one for TCP sessions and one for UDP sessions. Each entry keeps information on the state of the corresponding session: see Section 3.4.2. The NAT64 uses the session state information to determine when the session is completed, and also uses session information for ingress filtering. A session can be uniquely identified by either an incoming 5-tuple or an outgoing 5-tuple.

For each session, there is a corresponding BIB entry, uniquely specified by either the source IPv6 transport address (in the IPv6 --> IPv4 direction) or the destination IPv4 transport address (in the IPv4 --> IPv6 direction). However, a single BIB entry can have multiple corresponding sessions. When the last corresponding session is deleted, the BIB entry is deleted.

The processing of an incoming IP packet takes the following steps:

1.  Determining the incoming 5-tuple

2.  Filtering and updating session information

3.  Computing the outgoing 5-tuple

4.  Translating the packet

5.  Handling hairpinning

The details of these steps are specified in the following subsections.
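The relationship between a BIB and its sessions described above can be illustrated with a small, non-normative sketch. It assumes one possible representation (two dictionaries indexed by IPv6 and IPv4 transport address, plus a set of sessions per entry); the class name Bib and its method names are hypothetical, and a NAT64 would hold one such object for TCP and one for UDP.

```python
from dataclasses import dataclass, field


@dataclass
class Bib:
    """One Binding Information Base, holding (X',x) <--> (T,t) entries."""
    by_v6: dict = field(default_factory=dict)     # (X', x) -> (T, t)
    by_v4: dict = field(default_factory=dict)     # (T, t)  -> (X', x)
    sessions: dict = field(default_factory=dict)  # (X', x) -> set of 5-tuples

    def add(self, v6, v4):
        # A transport address may appear in at most one entry of a BIB.
        if v6 in self.by_v6 or v4 in self.by_v4:
            raise ValueError("transport address already present in this BIB")
        self.by_v6[v6] = v4
        self.by_v4[v4] = v6
        self.sessions[v6] = set()

    def add_session(self, v6, five_tuple):
        # Several sessions may share a single BIB entry.
        self.sessions[v6].add(five_tuple)

    def remove_session(self, v6, five_tuple):
        if v6 not in self.sessions:
            return
        self.sessions[v6].discard(five_tuple)
        # When the last corresponding session is deleted, the BIB entry
        # is deleted as well.
        if not self.sessions[v6]:
            v4 = self.by_v6.pop(v6)
            del self.by_v4[v4]
            del self.sessions[v6]
```

Indexing the entry in both directions reflects the fact that it is looked up by IPv6 source transport address in the IPv6 --> IPv4 direction and by IPv4 destination transport address in the IPv4 --> IPv6 direction.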
This breakdown of the NAT64 behavior into processing steps is done for ease of presentation. A NAT64 MAY perform the steps in a different order, or MAY perform different steps, as long as the externally visible outcome is the same.

TBD: Add support for ICMP Query packets. (ICMP Error packets are handled).

3.4.1. Determining the Incoming 5-tuple

This step associates an incoming 5-tuple (source IP address, source port, destination IP address, destination port, transport protocol) with every incoming IP packet for use in subsequent steps.

If the incoming IP packet contains a complete (un-fragmented) UDP or TCP protocol packet, then the 5-tuple is computed by extracting the appropriate fields from the packet.

If the incoming IP packet contains a complete (un-fragmented) ICMP message, then the 5-tuple is computed by extracting the appropriate fields from the IP packet embedded inside the ICMP message. However, the role of source and destination is swapped when doing this: the embedded source IP address becomes the destination IP address in the 5-tuple, the embedded source port becomes the destination port in the 5-tuple, etc. If it is not possible to determine the 5-tuple (perhaps because not enough of the embedded packet is reproduced inside the ICMP message), then the incoming IP packet is silently discarded.

   NOTE: The transport protocol is always one of TCP or UDP, even if the IP packet contains an ICMP message.

If the incoming IP packet contains a fragment, then more processing may be needed. This specification leaves open the exact details of how a NAT64 handles incoming IP packets containing fragments, and simply requires that a NAT64 handle fragments arriving out-of-order. A NAT64 MAY elect to queue the fragments as they arrive and translate all fragments at the same time.
Alternatively, a NAT64 MAY translate the fragments as they arrive, by storing information that allows it to compute the 5-tuple for fragments other than the first. In the latter case, the NAT64 will still need to handle the situation where subsequent fragments arrive before the first.

Implementors of NAT64 should be aware that there are a number of well-known attacks against IP fragmentation; see [RFC1858] and [RFC3128].

Assuming it otherwise has sufficient resources, a NAT64 MUST allow the fragments to arrive over a time interval of at least 10 seconds. A NAT64 MAY require that the UDP, TCP, or ICMP header be completely contained within the first fragment.

3.4.2. Filtering and Updating Session Information

This step updates the per-session information stored in the appropriate session table. This affects the lifetime of the session, which in turn affects the lifetime of the corresponding BIB entry. This step may also filter incoming packets, if desired.

The details of this step depend on the transport protocol (UDP or TCP).

3.4.2.1. UDP Session Handling

The state information stored for a UDP session is a timer that tracks the remaining lifetime of the UDP session. The NAT64 decrements this timer at regular intervals. When the timer expires, the UDP session is deleted.

The incoming packet is processed as follows:

1.  If the packet arrived on the IPv4 interface and the NAT64 filters on its IPv4 interface, then the NAT64 checks to see if the incoming packet is allowed according to the address-dependent filtering rule. To do this, it searches for a session table entry with a source IPv4 address equal to the source IPv4 address in the incoming 5-tuple. If such an entry is found (there may be more than one), packet processing continues. Otherwise, the packet is discarded.
    If the packet is discarded, then an ICMP message SHOULD be sent to the original sender of the packet, unless the discarded packet is itself an ICMP message. The ICMP message, if sent, has a type of 3 (Destination Unreachable) and a code of 13 (Communication Administratively Prohibited).

2.  The NAT64 searches for the session table entry corresponding to the incoming 5-tuple. If no such entry is found, a new entry is created.

3.  The NAT64 sets or resets the timer in the session table entry to the maximum session lifetime. By default, the maximum session lifetime is 5 minutes, but for specific destination ports in the Well-Known port range (0..1023), the NAT64 MAY use a smaller maximum lifetime.

3.4.2.2. TCP Session Handling

TBD: Describe the state machine required to track the state of the TCP session. This is a simplified version of the state machine used by the endpoints.

3.4.3. Computing the Outgoing 5-Tuple

This step computes the outgoing 5-tuple by translating the addresses and ports in the incoming 5-tuple. The transport protocol in the outgoing 5-tuple is always the same as that in the incoming 5-tuple.

In the text below, a reference to "the BIB" means either the TCP BIB or the UDP BIB as appropriate, as determined by the transport protocol in the 5-tuple.

   NOTE: Not all addresses are translated using the BIB. BIB entries are used to translate IPv6 source transport addresses to IPv4 source transport addresses, and IPv4 destination transport addresses to IPv6 destination transport addresses. They are NOT used to translate IPv6 destination transport addresses to IPv4 destination transport addresses, nor to translate IPv4 source transport addresses to IPv6 source transport addresses. The latter cases are handled by adding or removing the /96 prefix. This distinction is important; without it, hairpinning doesn't work correctly.
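The division of labour in the NOTE above (BIB lookups for the IPv6-side endpoint, prefix arithmetic for the IPv4-side endpoint) can be sketched as follows. This is a non-normative illustration under several assumptions: the prefix value, the class name Bib, the function names, and the allocate callback (which stands in for the IPv4 transport address allocation procedure) are all hypothetical. The sketch also checks the destination before creating a BIB entry, which is a choice of the sketch rather than an ordering taken from this document.

```python
import ipaddress

# Hypothetical example prefix standing in for Pref64::/96.
PREF64 = ipaddress.IPv6Network("2001:db8:64::/96")


class Bib:
    """Minimal BIB for one transport protocol, indexed in both directions."""
    def __init__(self):
        self.by_v6 = {}  # (X', x) -> (T, t)
        self.by_v4 = {}  # (T, t)  -> (X', x)


def translate_6to4(bib, src, dst, allocate):
    """IPv6 --> IPv4: src and dst are (IPv6Address, port) pairs.
    Returns (outgoing source, outgoing destination), or None to discard."""
    d_addr, d_port = dst
    if d_addr not in PREF64:
        return None  # destination is not Pref64 followed by an IPv4 address
    if src not in bib.by_v6:
        mapping = allocate(src)  # hypothetical allocator returning (T, t) or None
        if mapping is None:
            return None
        bib.by_v6[src] = mapping
        bib.by_v4[mapping] = src
    # Source comes from the BIB; destination by removing the /96 prefix.
    out_dst = (ipaddress.IPv4Address(int(d_addr) & 0xFFFFFFFF), d_port)
    return bib.by_v6[src], out_dst


def translate_4to6(bib, src, dst):
    """IPv4 --> IPv6: src and dst are (IPv4Address, port) pairs.
    Returns (outgoing source, outgoing destination), or None to discard."""
    if dst not in bib.by_v4:
        return None  # no binding exists for this IPv4 transport address
    s_addr, s_port = src
    # Source by adding the /96 prefix; destination comes from the BIB.
    out_src = (ipaddress.IPv6Address(int(PREF64.network_address) | int(s_addr)), s_port)
    return out_src, bib.by_v4[dst]
```

Because the IPv4-side endpoint is never looked up in the BIB, a packet whose translated destination is one of the NAT64's own IPv4 transport addresses can be fed back through translate_4to6 unchanged, which is what hairpinning relies on.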
When translating in the IPv6 --> IPv4 direction, let the incoming source and destination transport addresses in the 5-tuple be (S',s) and (D',d) respectively. The outgoing source transport address is computed as follows:

   If the BIB contains an entry (S',s) <--> (T,t), then the outgoing source transport address is (T,t).

   Otherwise, create a new BIB entry (S',s) <--> (T,t) as described below. The outgoing source transport address is (T,t).

The outgoing destination address is computed as follows:

   If D' is composed of the NAT64's prefix followed by an IPv4 address D, then the outgoing destination transport address is (D,d).

   Otherwise, discard the packet.

When translating in the IPv4 --> IPv6 direction, let the incoming source and destination transport addresses in the 5-tuple be (S,s) and (D,d) respectively. The outgoing source transport address is computed as follows:

   The outgoing source transport address is (Pref64::S,s).

The outgoing destination transport address is computed as follows:

   If the BIB contains an entry (X',x) <--> (D,d), then the outgoing destination transport address is (X',x).

   Otherwise, discard the packet.

If the rules specify that a new BIB entry is created for a source transport address of (S',s), then the NAT64 allocates an IPv4 transport address for this BIB entry as follows:

   If there exists some other BIB entry containing S' as the IPv6 address and mapping it to some IPv4 address T, then use T as the IPv4 address. Otherwise, use any IPv4 address assigned to the IPv4 interface.

   If the port s is in the Well-Known port range 0..1023, then allocate a port t from this same range. Otherwise, if the port s is in the range 1024..65535, then allocate a port t from this range. Furthermore, if port s is even, then t must be even, and if port s is odd, then t must be odd.
   In all cases, the allocated IPv4 transport address (T,t) MUST NOT be in use in another entry in the same BIB, but MAY be in use in the other BIB.

If it is not possible to allocate an appropriate IPv4 transport address or create a BIB entry for some reason, then the packet is discarded.

TBD: Do we delete the session entry if we cannot create a BIB entry?

If the rules specify that the packet is discarded, then the NAT64 SHOULD send an ICMP reply to the original sender, unless the packet being translated contains an ICMP message. The type should be 3 (Destination Unreachable) and the code should be 0 (Network Unreachable in IPv4, and No Route to Destination in IPv6).

3.4.4. Translating the Packet

This step translates the packet from IPv6 to IPv4 or vice versa.

The translation of the packet is as specified in section 3 and section 4 of SIIT [RFC2765], with the following modifications:

o  When translating an IP header (sections 3.1 and 4.1), the source and destination IP address fields are set to the source and destination IP addresses from the outgoing 5-tuple.

o  When the protocol following the IP header is TCP or UDP, then the source and destination ports are modified to the source and destination ports from the outgoing 5-tuple. In addition, the TCP or UDP checksum must also be updated to reflect the translated addresses and ports; note that the TCP and UDP checksum covers the pseudo-header which contains the source and destination IP addresses. An algorithm for efficiently updating these checksums is described in [RFC3022].

o  When the protocol following the IP header is ICMP (sections 3.4 and 4.4) the source and destination transport addresses in the embedded packet are set to the destination and source transport addresses from the outgoing 5-tuple (note the swap of source and destination).

3.4.5. Handling Hairpinning

This step handles hairpinning if necessary.

If the destination IP address is an address assigned to the NAT64 itself (i.e., is one of the IPv4 addresses assigned to the IPv4 interface, or is covered by the /96 prefix assigned to the IPv6 interface), then the packet is a hairpin packet. The outgoing 5-tuple becomes the incoming 5-tuple, and the packet is treated as if it was received on the outgoing interface. Processing of the packet continues at step 2.

TBD: Is there such a thing as a hairpin loop (likely not naturally, but perhaps through a specially crafted attack packet with a spoofed source address)? If so, need to drop packets that hairpin more than once.

3.5. FTP ALG

TBD: Describe the FTP ALG, a mechanism for translating the embedded IP addresses inside FTP commands, that enables FTP sessions to pass through NAT64.

4. Application scenarios

In this section, we describe how to apply NAT64/DNS64 to the suitable scenarios described in draft-arkko-townsley-coexistence.

4.1. Enterprise IPv6 only network

The Enterprise IPv6 only network has IPv6 hosts (those that are currently available) and, for various reasons including operational simplicity, wants to run those hosts in IPv6 only mode while still providing access to the IPv4 Internet. The scenario is depicted in the picture below.

                             +----+                  +-------------+
                             |    +------------------+IPv6 Internet+
                             |    |                  +-------------+
   IPv6 host-----------------+ GW |
                             |    |                  +-------------+
                             |    +------------------+IPv4 Internet+
                             +----+                  +-------------+

   |-------------------------public v6-----------------------------|
   |-------public v6---------|NAT|----------public v4--------------|

The proposed NAT64/DNS64 is perfectly suitable for this particular scenario.
The deployment of the NAT64/DNS64 would be as follows: The NAT64 function should be located in the GW device that connects the IPv6 site to the IPv4 Internet. The DNS64 functionality can be placed in different places. Probably the best trade-off between architectural cleanness and deployment simplicity would be to place it in the local recursive DNS server of the enterprise site. The option that is easiest to deploy would be to co-locate it with the NAT64 box. The cleanest option would be to include it in the local resolver of the IPv6 hosts, but this option seems the hardest to deploy because it implies changes to the hosts.

The proposed NAT64/DNS64 approach satisfies the requirements of this scenario, in particular because it doesn't require any changes to current IPv6 hosts in the site to obtain basic functionality.

4.2. Reaching servers in private IPv4 space

The scenario of servers using IPv4 private addresses and being reached from the IPv6 Internet covers the cases in which, for whatever reason, the servers cannot be upgraded to IPv6 and don't have public IPv4 addresses, and it would be useful to allow IPv6 nodes in the IPv6 Internet to reach those servers. This scenario is depicted in the figure below.

                                     +----+
   IPv6 Host(s)-------(Internet)-----+ GW +------Private IPv4 Servers
                                     +----+

   |---------public v6---------------|NAT|------private v4----------|

This scenario can again be perfectly served by the NAT64 approach. In this case the NAT64 functionality is placed in the GW device connecting the IPv6 Internet to the server's site. In this case, the DNS64 functionality is not needed. Since the server's site is running the NAT64 and the servers, it can publish in its own DNS server the AAAA RRs corresponding to the servers, i.e. AAAA RRs associating the FQDN of the server and the Pref64:ServerIPv4Addr.
In this case, there is no need to synthesize AAAA RRs because the site can configure them in the DNS itself.

Again, this scenario is satisfied by the NAT64 since it supports the required functionality without requiring changes in the IPv4 servers or in the IPv6 clients.

5. Discussion

5.1. About the Prefix used to map the IPv4 address space into IPv6

In the NAT64 approach, we need to represent the IPv4 addresses in the IPv6 Internet. Since there is enough address space in IPv6, we can easily embed the IPv4 address into an IPv6 address, so that the IPv4 address information can be extracted from the IPv6 address without requiring additional state. One way to do that is to use an IPv6 prefix Pref64::/96 and juxtapose the IPv4 address at the end (there are other ways of doing it, but we are not discussing the different formats here). In this document the Pref64::/96 prefix is extracted from the address block assigned to the site running the NAT64 box. However, one could envision the usage of other prefixes for that function. In particular, it would be possible to define a well-known prefix that can be used by the NAT64 devices to map IPv4 (public) addresses into IPv6 addresses, irrespective of the address space of the site where the NAT64 is located. In this section, we discuss the pros and cons of the different options.

The different options for Pref64::/96 are the following:

   Local: A locally assigned prefix out of the address block of the site running the NAT64 box

   Well-known: A well-known prefix that is reserved for this purpose.
   We have the following different options:

      IPv4 mapped prefix

      IPv4 compatible prefix

      A new prefix assigned by IANA for this purpose

The reasons why using a well-known prefix is attractive are the following: Having a global well-known prefix would make it possible to identify which addresses are "real" IPv6 addresses with native connectivity and which addresses are IPv6 addresses that represent an IPv4 address. From an architectural perspective, it seems the right thing to do to make this visible, since hosts and applications could react accordingly and avoid or prefer this type of connectivity if needed. From the DNS64 perspective, using the well-known prefix would imply that the same synthetic AAAA RR will be created throughout the IPv6 Internet, which would result in a consistent view of the RR irrespective of the location in the topology. From a more practical perspective, having a well-known prefix would make it possible to completely decouple the DNS64 from the NAT64, since the DNS64 would always use the well-known prefix to create the synthetic AAAA RR and there is no need to configure the same Pref64::/96 both in the DNS64 and the NAT64 that work together.

Among the different options available for the well-known prefix, the option of using a pre-existing prefix such as the IPv4-mapped or IPv4-compatible prefix has the advantage that it would potentially allow the default selection of native connectivity over translated connectivity for legacy hosts in communications involving dual-stack hosts. This is because the current RFC3484 default policy table includes entries for the IPv4-mapped prefix and the IPv4-compatible prefix, implying that native IPv6 prefixes will be preferred over these. However, current implementations do not use the IPv4-mapped prefix on the wire, defeating the purpose of supporting unmodified hosts.
The IPv4-compatible prefix is used by hosts on the wire, but has a higher priority than the IPv4-mapped prefix, which implies that current hosts would prefer translated connectivity over native IPv4 connectivity (represented by the IPv4-mapped prefix in the default policy table). So neither of the prefixes that are present in the default policy table would result in the legacy hosts preferring native connectivity over translated connectivity, and there doesn't seem to be a compelling reason to re-use either the IPv4-mapped or the IPv4-compatible prefix for this. So, we conclude that among the well-known prefix options, the preferred option would be to ask IANA to allocate a new prefix for this purpose.

However, there are several issues when considering using the well-known prefix option, namely:

   The well-known prefix is suitable only for mapping IPv4 public addresses into IPv6. IPv4 public addresses can be mapped using the same prefix because they are globally unique. However, the well-known prefix is not suitable for mapping IPv4 private addresses. This is so because we cannot rely on the uniqueness of the IPv4 address to achieve uniqueness of the IPv6 address, so we need to use a different IPv6 prefix to disambiguate the different private IPv4 address realms. As described above, there is a clear use case for mapping IPv4 private addresses, so there is a pressing need to support it. In order to do so we will need to use IPv6 local prefixes, at least for IPv4 private addresses. In that case, the architectural goal of distinguishing the "real" IPv6 addresses from the IPv6 addresses that represent IPv4 addresses can no longer be achieved in a general manner, making this option less attractive.
   The usage of a single well-known prefix to map IPv4 addresses irrespective of the NAT64 used may result in failure modes in sites that have more than one NAT64 device. The main problem is that intra-site routing fluctuations that cause packets of an ongoing communication to flow through a different NAT64 box than the one they were initially using (e.g. a change in an ECMP load balancer) would break ongoing communications. This is so because the different NAT64 boxes will use a different IPv4 address, so the IPv4 peer of the communication will receive packets coming from a different IPv4 address. This is avoided by using a local prefix, since each NAT64 box can have a different Pref64::/96 associated, so routing fluctuations would not result in using a different NAT64 box.

   The usage of a well-known prefix is also problematic in the case that different routing domains want to exchange routing information involving these routes. Consider the case of an IPv6 site that has multiple providers, where each of these providers provides access to the IPv4 Internet using the well-known prefix. Consider the hypothetical case that different parts of the IPv4 Internet are reachable through different IPv6 ISPs (yes, this means that in a futuristic scenario, the IPv4 Internet is partitioned). In order to reach the different parts through the different ISPs, more specific routes representing the different reachable IPv4 destinations need to be injected into the IPv6 sites. This basically means that such a configuration would import the IPv4 routing entropy into the IPv6 routing system. If different local prefixes are used, then each ISP only announces its own local prefix, and the burden of defining which IPv4 destination is reachable through which ISP is placed somewhere else (e.g. in the DNS64).

6. Security Considerations

Implications on end-to-end security, IPsec and TLS.

Any protocol that protects IP header information is essentially incompatible with NAT64. This implies that end-to-end IPsec verification will fail when AH is used (both transport and tunnel mode) and when ESP is used in transport mode. This is inherent to any network layer translation mechanism. End-to-end IPsec protection can be restored using UDP encapsulation as described in [RFC2765].

TBD: TLS implications

Implications on DNS security and DNSSEC.

NAT64 uses synthetic DNS RRs to enable IPv6 clients to initiate communications with IPv4 servers using the DNS. This essentially means that the DNS64 component generates synthetic AAAA RRs that are not contained in the master zone file. From a DNSSEC perspective, this means that straightforward DNSSEC verification of such RRs would fail. However, it is possible to restore DNSSEC functionality if the verification is performed right before the DNS64 processing, directly using the original A RR of the IPv4 server. So, in order to jointly use the NAT64 approach described in this specification and DNSSEC validation, the DNS64 functionality should be performed in the resolver of the IPv6 client. In this case, the IPv6 client would receive the original A RR with DNSSEC information and it would first perform the DNSSEC validation. If it is successful, it would then proceed to synthesize the AAAA RR according to the mechanism described in this document. It should be noted that the synthetic AAAA RR would stay within the IPv6 client and it would not leak outside, making further DNSSEC validations unnecessary.

Filtering.

NAT64 creates binding state using packets flowing from the IPv6 side to the IPv4 side.
So, NAT64 implements by definition, at least, endpoint independent filtering, meaning that in order to enable any packet to flow from the IPv4 side to the IPv6 side, there must have been a packet flowing from the IPv6 side to the IPv4 side that created the binding information to be used for packets in the other direction. Endpoint independent filtering means that once a binding is created, it can be used by any node on the IPv4 side to send packets to the IPv6 transport address that created the binding. As long as the IPv6 node does not open a hole in the NAT64, incoming communications are blocked; once the IPv6 node has sent the first packet, this packet opens the door for any node on the IPv4 side to send packets to that IPv6 transport address. It is possible to configure the NAT64 to implement a more stringent security policy, if endpoint independent filtering is considered not secure enough. In particular, if the security policy of the NAT64 requires it, it is possible to configure the NAT64 to perform address dependent filtering. This means that the binding state created can only be used to send packets from the IPv4 address to which the original packet that created the binding was sent. In other words, the door is open only for that IPv4 address to send packets to the IPv6 transport address.

Attacks to NAT64.

The NAT64 device itself is a potential victim of different types of attacks. In particular, the NAT64 can be a victim of DoS attacks. The NAT64 box has a limited number of resources that can be consumed by attackers creating a DoS attack. The NAT64 has a limited number of IPv4 addresses that it uses to create the bindings.
Even though the 1245 NAT64 performs address and port translation, it is possible for an 1246 attacker to consume all the IPv4 transport addresses by sending IPv6 1247 packets with different source IPv6 transport address. It should be 1248 noted that this attack can only be launched from the IPv6 side, since 1249 IPv4 packets are not used to create binding state. DoS attacks can 1250 also affect other limited resource available in the NAT64 such as 1251 memory or link capacity. For instance, if the NAT64 implements 1252 reassembly of fragmented packets, it is possible for an attacker to 1253 launch a DoS attack to the memory of the NAT64 device by sending 1254 fragments that the NAT64 will store for a given period. If the 1255 number of fragments if high enough, the memory of the NAT64 could be 1256 exhausted. NAT64 devices should implement proper protection against 1257 such attacks, for instance allocating a limited amount of memory for 1258 fragmented packet storage. 1260 7. IANA Considerations 1262 The IANA is requested to assign an EDNS Option Code value for the SAS 1263 option. 1265 TBD: Set up an IANA registry for SAS flags?? 1267 8. Changes from Previous Draft Versions 1269 Note to RFC Editor: Please remove this section prior to publication 1270 of this document as an RFC. 1272 [[This section lists the changes between the various versions of this 1273 draft.]] 1275 9. Contributors 1277 George Tsirtsis 1278 Qualcomm 1280 tsirtsis@googlemail.com 1282 10. Acknowledgements 1284 Dave Thaler, Dan Wing, Alberto Garcia-Martinez and Joao Damas 1285 reviewed the document and provided useful comments to improve it. 1287 The content of the draft was improved thanks to discussions with Fred 1288 Baker and Jari Arkko. 1290 Marcelo Bagnulo and Iljitsch van Beijnum are partly funded by 1291 Trilogy, a research project supported by the European Commission 1292 under its Seventh Framework Program. 1294 11. References 1296 11.1. 
Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC1035]  Mockapetris, P., "Domain names - implementation and
           specification", STD 13, RFC 1035, November 1987.

[RFC2671]  Vixie, P., "Extension Mechanisms for DNS (EDNS0)",
           RFC 2671, August 1999.

[RFC2765]  Nordmark, E., "Stateless IP/ICMP Translation Algorithm
           (SIIT)", RFC 2765, February 2000.

[RFC4787]  Audet, F. and C. Jennings, "Network Address Translation
           (NAT) Behavioral Requirements for Unicast UDP", BCP 127,
           RFC 4787, January 2007.

[I-D.ietf-behave-tcp]
           Guha, S., Biswas, K., Ford, B., Sivakumar, S., and P.
           Srisuresh, "NAT Behavioral Requirements for TCP",
           draft-ietf-behave-tcp-08 (work in progress),
           September 2008.

[I-D.ietf-behave-nat-icmp]
           Srisuresh, P., Ford, B., Sivakumar, S., and S. Guha, "NAT
           Behavioral Requirements for ICMP protocol",
           draft-ietf-behave-nat-icmp-08 (work in progress),
           June 2008.

11.2.  Informative References

[RFC2766]  Tsirtsis, G. and P. Srisuresh, "Network Address
           Translation - Protocol Translation (NAT-PT)", RFC 2766,
           February 2000.

[RFC1858]  Ziemba, G., Reed, D., and P. Traina, "Security
           Considerations for IP Fragment Filtering", RFC 1858,
           October 1995.

[RFC3128]  Miller, I., "Protection Against a Variant of the Tiny
           Fragment Attack (RFC 1858)", RFC 3128, June 2001.

[RFC3022]  Srisuresh, P. and K. Egevang, "Traditional IP Network
           Address Translator (Traditional NAT)", RFC 3022,
           January 2001.

[RFC4966]  Aoun, C. and E. Davies, "Reasons to Move the Network
           Address Translator - Protocol Translator (NAT-PT) to
           Historic Status", RFC 4966, July 2007.
[I-D.ietf-mmusic-ice]
           Rosenberg, J., "Interactive Connectivity Establishment
           (ICE): A Protocol for Network Address Translator (NAT)
           Traversal for Offer/Answer Protocols",
           draft-ietf-mmusic-ice-19 (work in progress), October 2007.

[RFC3498]  Kuhfeld, J., Johnson, J., and M. Thatcher, "Definitions of
           Managed Objects for Synchronous Optical Network (SONET)
           Linear Automatic Protection Switching (APS)
           Architectures", RFC 3498, March 2003.

Authors' Addresses

Marcelo Bagnulo
UC3M
Av. Universidad 30
Leganes, Madrid  28911
Spain

Phone: +34-91-6249500
Fax:
Email: marcelo@it.uc3m.es
URI:   http://www.it.uc3m.es/marcelo

Philip Matthews
Unaffiliated

Email: philip_matthews@magma.ca
URI:

Iljitsch van Beijnum
IMDEA Networks
Av. Universidad 30
Leganes, Madrid  28911
Spain

Phone: +34-91-6246245
Email: iljitsch@muada.com

Full Copyright Statement

Copyright (C) The IETF Trust (2008).

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.