idnits 2.17.1 

draft-bagnulo-behave-nat64-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 19.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 1399.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1410.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1417.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1423.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (September 19, 2008) is 5691 days in the past.  Is
     this intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'I-D.ietf-mmusic-ice' is defined on line 1347, but no
     explicit reference was found in the text

  == Unused Reference: 'RFC3498' is defined on line 1353, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2671 (Obsoleted by RFC 6891)

  ** Obsolete normative reference: RFC 2765 (Obsoleted by RFC 6145)

  == Outdated reference: A later version (-12) exists of
     draft-ietf-behave-nat-icmp-08

  -- Obsolete informational reference (is this intentional?): RFC 2766
     (Obsoleted by RFC 4966)


     Summary: 3 errors (**), 0 flaws (~~), 4 warnings (==), 8 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	BEHAVE WG                                                     M. Bagnulo
3	Internet-Draft                                                      UC3M
4	Intended status: Standards Track                             P. Matthews
5	Expires: March 23, 2009                                     Unaffiliated
6	                                                          I. van Beijnum
7	                                                          IMDEA Networks
8	                                                      September 19, 2008

10	NAT64/DNS64: Network Address and Protocol Translation from IPv6 Clients
11	                            to IPv4 Servers
12	                     draft-bagnulo-behave-nat64-01

14	Status of this Memo

16	   By submitting this Internet-Draft, each author represents that any
17	   applicable patent or other IPR claims of which he or she is aware
18	   have been or will be disclosed, and any of which he or she becomes
19	   aware will be disclosed, in accordance with Section 6 of BCP 79.

21	   Internet-Drafts are working documents of the Internet Engineering
22	   Task Force (IETF), its areas, and its working groups.  Note that
23	   other groups may also distribute working documents as Internet-
24	   Drafts.

26	   Internet-Drafts are draft documents valid for a maximum of six months
27	   and may be updated, replaced, or obsoleted by other documents at any
28	   time.  It is inappropriate to use Internet-Drafts as reference
29	   material or to cite them other than as "work in progress."

31	   The list of current Internet-Drafts can be accessed at
32	   http://www.ietf.org/ietf/1id-abstracts.txt.

34	   The list of Internet-Draft Shadow Directories can be accessed at
35	   http://www.ietf.org/shadow.html.

37	   This Internet-Draft will expire on March 23, 2009.

39	Abstract

41	   NAT64 is a mechanism for translating IPv6 packets to IPv4 packets and
42	   vice-versa.  DNS64 is a mechanism for synthesizing AAAA records from
43	   A records.  These two mechanisms together enable client-server
44	   communication between an IPv6-only client and an IPv4-only server,
45	   without requiring any changes to either the IPv6 or the IPv4 node,
46	   for the class of applications that work through NATs.  They also
47	   enable peer-to-peer communication between an IPv4 and an IPv6 node,
48	   where the communication can be initiated by either end using
49	   existing, NAT-traversing, peer-to-peer communication techniques.
50	   This document specifies NAT64 and DNS64, and gives suggestions on how
51	   they should be deployed.

53	Table of Contents

55	   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
56	     1.1.  Features of NAT64  . . . . . . . . . . . . . . . . . . . .  3
57	     1.2.  Overview . . . . . . . . . . . . . . . . . . . . . . . . .  4
58	       1.2.1.  NAT64 solution elements  . . . . . . . . . . . . . . .  5
59	       1.2.2.  Walkthrough  . . . . . . . . . . . . . . . . . . . . .  7
60	       1.2.3.  Dual stack nodes . . . . . . . . . . . . . . . . . . .  9
61	       1.2.4.  IPv6 nodes implementing DNSSEC . . . . . . . . . . . . 10
62	       1.2.5.  Filtering  . . . . . . . . . . . . . . . . . . . . . . 10
63	   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . . 10
64	   3.  Normative Specification  . . . . . . . . . . . . . . . . . . . 12
65	     3.1.  Synthetic AAAA RRs . . . . . . . . . . . . . . . . . . . . 12
66	     3.2.  The EDNS SAS option  . . . . . . . . . . . . . . . . . . . 13
67	     3.3.  DNS64  . . . . . . . . . . . . . . . . . . . . . . . . . . 14
68	     3.4.  NAT64  . . . . . . . . . . . . . . . . . . . . . . . . . . 15
69	       3.4.1.  Determining the Incoming 5-tuple . . . . . . . . . . . 17
70	       3.4.2.  Filtering and Updating Session Information . . . . . . 17
71	         3.4.2.1.  UDP Session Handling . . . . . . . . . . . . . . . 18
72	         3.4.2.2.  TCP Session Handling . . . . . . . . . . . . . . . 18
73	       3.4.3.  Computing the Outgoing 5-Tuple . . . . . . . . . . . . 18
74	       3.4.4.  Translating the Packet . . . . . . . . . . . . . . . . 20
75	       3.4.5.  Handling Hairpinning . . . . . . . . . . . . . . . . . 21
76	     3.5.  FTP ALG  . . . . . . . . . . . . . . . . . . . . . . . . . 21
77	   4.  Application scenarios  . . . . . . . . . . . . . . . . . . . . 21
78	     4.1.  Enterprise IPv6 only network . . . . . . . . . . . . . . . 21
79	     4.2.  Reaching servers in private IPv4 space . . . . . . . . . . 22
80	   5.  Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . 23
81	     5.1.  About the Prefix used to map the IPv4 address space
82	           into IPv6  . . . . . . . . . . . . . . . . . . . . . . . . 23
83	   6.  Security Considerations  . . . . . . . . . . . . . . . . . . . 25
84	   7.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 27
85	   8.  Changes from Previous Draft Versions . . . . . . . . . . . . . 27
86	   9.  Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 27
87	   10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 28
88	   11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 28
89	     11.1. Normative References . . . . . . . . . . . . . . . . . . . 28
90	     11.2. Informative References . . . . . . . . . . . . . . . . . . 29
91	   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 29
92	   Intellectual Property and Copyright Statements . . . . . . . . . . 31

94	1.  Introduction

96	   This document specifies NAT64 and DNS64, two mechanisms for IPv6-IPv4
97	   transition and co-existence.  Together, these two mechanisms allow
98	   an IPv6-only client to initiate communications to an IPv4-only
99	   server, and also allow peer-to-peer communication between IPv6-only
100	   and IPv4-only hosts.

102	   NAT64 is a mechanism for translating IPv6 packets to IPv4 packets.
103	   The translation is done by translating the packet headers according
104	   to SIIT [RFC2765], translating the IPv4 server address by adding or
105	   removing a /96 prefix, and translating the IPv6 client address by
106	   installing mappings in the normal NAT manner.

108	   DNS64 is a mechanism for synthesizing AAAA resource records (RR) from
109	   A RR.  The synthesis is done by adding a /96 prefix to the IPv4
110	   address to create an IPv6 address, where the /96 prefix is assigned
111	   to a NAT64 device.

113	   Together, these two mechanisms allow an IPv6-only client to initiate
114	   communications to an IPv4-only server.

116	   These mechanisms are expected to play a critical role in the IPv4-
117	   IPv6 transition and co-existence.  Due to IPv4 address depletion,
118	   it's likely that in the future, a lot of IPv6-only clients will want
119	   to connect to IPv4-only servers.  The NAT64 and DNS64 mechanisms are
120	   easily deployable, since they require no changes to either the IPv6
121	   client or the IPv4 server.  For basic functionality, the approach
122	   only requires the deployment of NAT64-enabled devices connecting an
123	   IPv6-only network to the IPv4-only Internet, along with the
124	   deployment of a few DNS64-enabled name servers in the IPv6-only
125	   network.  However, some advanced features require software updates to
126	   the IPv6-only hosts.

128	   The NAT64 and DNS64 mechanisms are related to the NAT-PT mechanism
129	   defined in [RFC2766], but significant differences exist.  First,
130	   NAT64 does not define the NAT-PT mechanisms that allow IPv4-only
131	   clients to contact IPv6-only servers; it only defines the mechanisms
132	   for IPv6 clients to contact IPv4 servers, and their potential reuse
133	   to support peer-to-peer communication through standard NAT traversal
134	   techniques.  Second, NAT64 includes a set of features that
135	   overcomes many of the reasons the original NAT-PT specification was
136	   moved to historic status [RFC4966].

138	1.1.  Features of NAT64

140	   The features of NAT64 and DNS64 are:

142	   o  It enables IPv6-only nodes to initiate a client-server connection
143	      with an IPv4-only server, without needing any changes on either
144	      IPv4 or IPv6 nodes.  This works for the same class of applications
145	      that work through IPv4-to-IPv4 NATs.

147	   o  It supports peer-to-peer communication between IPv4 and IPv6
148	      nodes, including the ability for IPv4 nodes to initiate
149	      communication with IPv6 nodes using peer-to-peer techniques (i.e.,
150	      using a rendezvous server and ICE).  To this end, NAT64 is
151	      compliant with the recommendations for how NATs should handle UDP
152	      [RFC4787], TCP [I-D.ietf-behave-tcp], and ICMP
153	      [I-D.ietf-behave-nat-icmp].

155	   o  Compatible with ICE.

157	   o  Supports additional features with some changes on nodes.  These
158	      features include:

160	      *  Support for DNSSEC

162	      *  Some forms of IPsec support

164	      *  Increased ability to detect when there is a communication path
165	         that does not involve translating between IPv6 and IPv4.  This
166	         is achieved by marking synthetic DNS AAAA resource records
167	         whose use would result in translated connectivity, so that
168	         the sender can prefer using non-synthetic records when it is
169	         possible.

171	1.2.  Overview

173	   This section provides a non-normative introduction to the mechanisms
174	   of NAT64 and DNS64.

176	   The NAT64 mechanism is implemented in a NAT64 box which has two
177	   interfaces, an IPv4 interface connected to the IPv4 network, and
178	   an IPv6 interface connected to the IPv6 network.  Packets generated
179	   in the IPv6 network for a receiver located in the IPv4 network will
180	   be routed within the IPv6 network towards the NAT64 box.  The NAT64
181	   box will translate them and forward them as IPv4 packets through the
182	   IPv4 network to the IPv4 receiver.  The reverse takes place for
183	   packets generated in the IPv4 network for an IPv6 receiver.  NAT64,
184	   however, is not symmetric.  In order to be able to perform IPv6 -
185	   IPv4 translation NAT64 requires state, binding an IPv6 address and
186	   port (hereafter called an IPv6 transport address) to an IPv4 address
187	   and port (hereafter called an IPv4 transport address).

189	   Such binding state is created when the first packet flowing from the
190	   IPv6 network to the IPv4 network is translated.  After the binding
191	   state has been created, packets flowing in either direction on that
192	   particular flow are translated.  The result is that NAT64 only
193	   supports communications initiated by the IPv6-only node towards an
194	   IPv4-only node.  Some additional mechanisms, like ICE, can be used in
195	   combination with NAT64 to provide support for communications
196	   initiated by the IPv4-only node to the IPv6-only node.  The
197	   specification of such mechanisms, however, is out of the scope of
198	   this document.

200	1.2.1.  NAT64 solution elements

202	   In this section we describe the different elements involved in the
203	   NAT64 approach.

205	   The main component of the proposed solution is the translator itself.
206	   The translator has essentially two main parts, the address
207	   translation mechanism and the protocol translation mechanism.

209	   Protocol translation from IPv4 packet header to IPv6 packet header
210	   and vice-versa is performed according to SIIT [RFC2765].

212	   Address translation maps IPv6 transport addresses to IPv4 transport
213	   addresses and vice-versa.  In order to create these mappings the
214	   NAT64 box has two pools of addresses i.e. an IPv6 address pool (to
215	   represent IPv4 addresses in the IPv6 network) and an IPv4 address
216	   pool (to represent IPv6 addresses in the IPv4 network).  Since there
217	   is enough IPv6 address space, it is possible to map every IPv4
218	   address into a different IPv6 address.

220	   NAT64 creates the required mappings by using as the IPv6 address pool
221	   a /96 IPv6 prefix (hereafter called Pref64::/96).  This allows each
222	   IPv4 address to be mapped into a different IPv6 address by simply
223	   concatenating the /96 prefix assigned as the IPv6 address pool of the
224	   NAT64, with the IPv4 address being mapped (i.e. an IPv4 address X is
225	   mapped into the IPv6 address Pref64:X).  The NAT64 prefix Pref64::/96
226	   is assigned by the administrator of the NAT64 box from the global
227	   unicast IPv6 address block assigned to the site.  It should be noted
228	   that the prefix used as the IPv6 address pool is assigned to a
229	   specific NAT64 box and if there are multiple NAT64 boxes, each box is
230	   allocated a different prefix.  Assigning the same prefix to multiple
231	   boxes may lead to communication failures due to internal routing
232	   fluctuations.
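
   The prefix concatenation described above can be sketched as follows.
   This is a minimal, non-normative illustration in Python.  The
   function names and the prefix 2001:db8:64::/96 (taken from the IPv6
   documentation range) are assumptions made for the example and are
   not defined by this document.

```python
import ipaddress


def synthesize(pref64, ipv4):
    """Map IPv4 address X to Pref64:X by filling the low 32 bits."""
    net = ipaddress.IPv6Network(pref64)
    if net.prefixlen != 96:
        raise ValueError("Pref64 must be a /96 prefix")
    x = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(net.network_address) | int(x))


def extract(pref64, ipv6):
    """Recover X from Pref64:X by taking the low 32 bits."""
    net = ipaddress.IPv6Network(pref64)
    addr = ipaddress.IPv6Address(ipv6)
    if addr not in net:
        raise ValueError("address is not inside Pref64")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


# Hypothetical prefix and IPv4 address, for illustration only.
mapped = synthesize("2001:db8:64::/96", "192.0.2.15")
original = extract("2001:db8:64::/96", mapped)
```

   Because the mapping is a pure function of the prefix and the IPv4
   address, no per-address state is needed in this direction.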

234	   The IPv4 address pool, however, is a set of IPv4 addresses, normally
235	   a small prefix assigned by the local administrator to the NAT64's
236	   external (IPv4) interface.  Since IPv4 address space is a scarce
237	   resource, the IPv4 address pool is small and typically not sufficient
238	   to establish permanent one-to-one mappings with IPv6 addresses.  So,
239	   mappings using the IPv4 address pool will be created and released
240	   dynamically.  Moreover, because of the IPv4 address scarcity, the
241	   usual practice for NAT64 is likely to be the mapping of IPv6
242	   transport addresses into IPv4 transport addresses, instead of IPv6
243	   addresses into IPv4 addresses directly, which enables a higher
244	   utilization of the limited IPv4 address pool.

246	   Because of the dynamic nature of the IPv6 to IPv4 address mapping and
247	   the static nature of the IPv4 to IPv6 address mapping, it is easy to
248	   understand that it is far simpler to allow communication initiated
249	   from the IPv6 side toward an IPv4 node, whose address is permanently
250	   mapped into an IPv6 address, than communication initiated from an
251	   IPv4-only node to an IPv6 node, in which case an IPv4 address needs
252	   to be associated with it dynamically.  For this reason NAT64 supports
253	   only communications initiated from the IPv6 side.

255	   An IPv6 initiator can know or derive in advance the IPv6 address
256	   representing the IPv4 target and send packets to that address.  The
257	   packets are intercepted by the NAT64 device, which associates an IPv4
258	   transport address of its IPv4 pool to the IPv6 transport address of
259	   the initiator, creating binding state, so that reply packets can be
260	   translated and forwarded back to the initiator.  The binding state is
261	   kept while packets are flowing.  Once the flow stops, and based on a
262	   timer, the IPv4 transport address is returned to the IPv4 address
263	   pool so that it can be reused for other communications.
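
   The dynamic creation and timer-based release of binding state can be
   illustrated with the sketch below.  It is only an illustration under
   stated assumptions: the class name, the single pool address, the port
   range and the idle timeout value are all hypothetical, and time is
   passed in explicitly so that the behaviour is easy to follow.

```python
class BindingTable:
    """Toy binding state: IPv6 transport address -> IPv4 transport address."""

    def __init__(self, pool_addr, ports, idle_timeout):
        self.pool_addr = pool_addr        # IPv4 address of the pool (assumed single)
        self.free_ports = list(ports)     # ports not currently bound
        self.idle_timeout = idle_timeout  # seconds of silence before release (assumed)
        self.by_v6 = {}                   # (v6 addr, port) -> (v4 addr, port)
        self.last_seen = {}               # (v6 addr, port) -> time of last packet

    def bind(self, v6_taddr, now):
        """Return the IPv4 transport address for v6_taddr, creating it if needed."""
        if v6_taddr not in self.by_v6:
            if not self.free_ports:
                raise RuntimeError("IPv4 address pool exhausted")
            self.by_v6[v6_taddr] = (self.pool_addr, self.free_ports.pop(0))
        self.last_seen[v6_taddr] = now    # every packet refreshes the timer
        return self.by_v6[v6_taddr]

    def expire(self, now):
        """Return idle IPv4 transport addresses to the pool."""
        for v6_taddr, seen in list(self.last_seen.items()):
            if now - seen >= self.idle_timeout:
                _, port = self.by_v6.pop(v6_taddr)
                del self.last_seen[v6_taddr]
                self.free_ports.append(port)


table = BindingTable("198.51.100.1", range(5000, 5002), idle_timeout=120)
first = table.bind(("2001:db8:1::10", 1234), now=0)
```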

265	   To allow an IPv6 initiator to do the standard DNS lookup to learn the
266	   address of the responder, DNS64 is used to synthesize an AAAA record
267	   (pronounced "quad-A" and containing an IPv6 address) from the A
268	   record (containing the real IPv4 address of the responder).  DNS64
269	   receives the DNS queries generated by the IPv6 initiator.  If there
270	   is no AAAA record available for the target node (which is the normal
271	   case when the target node is an IPv4-only node), DNS64 performs a
272	   query for the A record.  If an A record is discovered, DNS64 creates
273	   a synthetic AAAA RR by adding the Pref64::/96 of a NAT64 to the
274	   responder's IPv4 address (i.e. if the IPv4 node has IPv4 address X,
275	   then the synthetic AAAA RR will contain the IPv6 address formed as
276	   Pref64:X).  The synthetic AAAA RR is passed back to the IPv6
277	   initiator, which will initiate an IPv6 communication with the IPv6
278	   address associated to the IPv4 receiver.  The packet will be routed
279	   to the NAT64 device, which will create the IPv6 to IPv4 address
280	   mapping as described before.
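
   The DNS64 decision described above can be sketched as follows.  The
   sketch is non-normative: the "lookup" callable stands in for
   whatever resolver the name server uses and is a hypothetical
   interface, as are the host names and the example prefix.

```python
import ipaddress


def dns64_resolve(name, lookup, pref64):
    """Return (addresses, synthetic) for an AAAA query on name.

    lookup(name, rrtype) is an assumed helper that returns a list of
    address strings, empty when no record of that type exists.
    """
    aaaa = lookup(name, "AAAA")
    if aaaa:
        # A real AAAA record exists, so nothing is synthesized.
        return aaaa, False
    prefix = int(ipaddress.IPv6Network(pref64).network_address)
    synthetic = [
        str(ipaddress.IPv6Address(prefix | int(ipaddress.IPv4Address(a))))
        for a in lookup(name, "A")
    ]
    return synthetic, bool(synthetic)


# Hypothetical zone data used only to exercise the function.
ZONE = {
    ("h2.example", "A"): ["192.0.2.15"],
    ("dual.example", "A"): ["192.0.2.20"],
    ("dual.example", "AAAA"): ["2001:db8:2::20"],
}


def fake_lookup(name, rrtype):
    return ZONE.get((name, rrtype), [])


answer = dns64_resolve("h2.example", fake_lookup, "2001:db8:64::/96")
```

   The second element of the result tells the caller whether the answer
   would need to be marked as synthetic.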

282	   Having DNS synthesize AAAA records creates a number of problems, as
283	   described in [RFC4966]:

285	   o  The synthesized AAAA records may leak outside their intended
286	      scope;

288	   o  Dual-stack hosts may communicate with IPv4-only servers using IPv6
289	      which is then translated to IPv4, rather than using their IPv4
290	      connectivity;

292	   o  The IPv6-only hosts will be unable to use DNSSEC to verify the
293	      legitimacy of the synthetic AAAA records.

295	   In order to avoid these issues, responses containing synthesized
296	   addresses are tagged with an Extended DNS [RFC2671] option defined in
297	   this document, called the SAS option, so the AAAA records can be
298	   recognized as synthetic.  This allows caching nameservers, dual stack
299	   nodes and nodes implementing DNSSEC to ignore synthetic addresses and
300	   perform an additional request for the original address records.

302	1.2.2.  Walkthrough

304	   In this example, we consider an IPv6 node located in an IPv6-only
305	   site that initiates a communication to an IPv4 node located in the
306	   IPv4 Internet.

308	   The notation used is the following: upper case letters are IPv4
309	   addresses; upper case letters with a prime (') are IPv6 addresses;
310	   lower case letters are ports; prefixes are indicated by "P::X", which
311	   is an IPv6 address built from an IPv4 address X by adding the prefix
312	   P; mappings are indicated as "(X,x) <--> (Y',y)".

314	   The scenario for this case is depicted in the following figure:

316	      +---------------------------------------+       +-----------+
317	      |IPv6 site       +-------------+        |       |           |
318	      |  +----+        | Name server |   +-------+    |   IPv4    |
319	      |  | H1 |        | with DNS64  |   | NAT64 |----| Internet  |
320	      |  +----+        +-------------+   +-------+    +-----------+
321	      |    |IP addr: Y'     |              |  |            |IP addr: X
322	      |    ---------------------------------  |          +----+
323	      +---------------------------------------+          | H2 |
324	                                                         +----+

326	   The figure shows an IPv6 node H1 which has an IPv6 address Y' and an
327	   IPv4 node H2 with IPv4 address X.

329	   A NAT64 connects the IPv6 network to the IPv4 Internet.  This NAT64
330	   has a /96 prefix (called Pref64::/96) associated to its IPv6
331	   interface and an IPv4 address T assigned to its IPv4 interface.

333	   Also shown is a local name server with DNS64 functionality.  For the
334	   purpose of this example, we assume that the name server is a dual-
335	   stack node, so that H1 can contact it via IPv6, while it can contact
336	   IPv4-only name servers via IPv4.

338	   The local name server needs to know the /96 prefix assigned to the
339	   local NAT64 (Pref64::/96).  For the purpose of this example, we
340	   assume it learns this through manual configuration.

342	   For this example, assume the typical DNS situation where IPv6 hosts
343	   have only stub resolvers and the local name server does the recursive
344	   lookups.

346	   The steps by which H1 establishes communication with H2 are:

348	   1.  H1 does a DNS lookup for the IPv6 address of H2.  H1 does this by
349	       sending a DNS query for an AAAA record for H2 to the local name
350	       server.  Assume the local name server is implementing DNS64
351	       functionality.

353	   2.  The local DNS server resolves the query, and discovers that there
354	       are no AAAA records for H2.

356	   3.  The name server queries for an A record for H2 and gets back an A
357	       record containing the IPv4 address X. The name server then
358	       synthesizes an AAAA record.  The IPv6 address in the AAAA record
359	       contains the prefix assigned to the NAT64 in the first 96 bits
360	       and the IPv4 address X in the lower 32 bits.

362	   4.  The name server sends a response back to H1.  If H1 has
363	       indicated, in its query, that it supports EDNS0, then the
364	       name server will use the SAS option to indicate that the AAAA
365	       record is synthetic.

367	   5.  H1 receives the synthetic AAAA record and sends a packet towards
368	       H2.  The packet is sent from a source transport address of (Y',y)
369	       to a destination transport address of (Pref64:X,x), where y and x
370	       are ports chosen by H1.

372	   6.  The packet is routed to the IPv6 interface of the NAT64 (since
373	       Pref64::/96 has been associated to this interface).

375	   7.  The NAT64 receives the packet and performs the following actions:

377	       *  The NAT64 selects an unused port t on its IPv4 address T and
378	          creates the mapping entry (Y',y) <--> (T,t)

380	       *  The NAT64 translates the IPv6 header into an IPv4 header using
381	          SIIT.

383	       *  The NAT64 includes (T,t) as source transport address in the
384	          packet and (X,x) as destination transport address in the
385	          packet.  Note that X is extracted directly from the lower 32
386	          bits of the destination IPv6 address of the received IPv6
387	          packet that is being translated.

389	       The NAT64 sends the translated packet out its IPv4 interface and
390	       the packet arrives at H2.

392	   8.  H2 node responds by sending a packet with destination transport
393	       address (T,t) and source transport address (X,x).

395	   9.  The packet is routed to the NAT64 box, which will look for an
396	       existing mapping containing (T,t).  Since the mapping (Y',y) <-->
397	       (T,t) exists, the NAT64 performs the following operations:

399	       *  The NAT64 translates the IPv4 header into an IPv6 header using
400	          SIIT.

402	       *  The NAT64 includes (Pref64:X,x) as source transport address
403	          in the packet and (Y',y) as destination transport address in
404	          the packet.  Note that X is extracted directly from the source
405	          IPv4 address of the received IPv4 packet that is being
406	          translated.

408	       The translated packet is sent out the IPv6 interface to H1.

410	   The packet exchange between H1 and H2 continues and packets are
411	   translated in the different directions as previously described.
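
   The address and port handling in steps 7 and 9 can be sketched as
   below.  The sketch only rewrites transport addresses; the SIIT header
   translation is omitted.  The class name, the pool address
   198.51.100.1, the prefix 2001:db8:64::/96 and the starting port are
   assumptions chosen for illustration.  A reply is addressed to H1, so
   it carries (Pref64:X,x) as source and (Y',y) as destination.

```python
import ipaddress


class Nat64Sketch:
    def __init__(self, pref64, pool_addr, first_port):
        self.pref64 = ipaddress.IPv6Network(pref64)
        self.pool_addr = pool_addr  # T, the NAT64's IPv4 address
        self.next_port = first_port
        self.v6_to_v4 = {}          # (Y', y) -> (T, t)
        self.v4_to_v6 = {}          # (T, t)  -> (Y', y)

    def from_ipv6(self, src, sport, dst, dport):
        """Step 7: translate (Y',y) -> (Pref64:X,x) into (T,t) -> (X,x)."""
        dst_addr = ipaddress.IPv6Address(dst)
        if dst_addr not in self.pref64:
            raise ValueError("destination is not inside Pref64")
        key = (src, sport)
        if key not in self.v6_to_v4:
            mapped = (self.pool_addr, self.next_port)
            self.next_port += 1
            self.v6_to_v4[key] = mapped
            self.v4_to_v6[mapped] = key
        t_addr, t_port = self.v6_to_v4[key]
        x = ipaddress.IPv4Address(int(dst_addr) & 0xFFFFFFFF)
        return (t_addr, t_port, str(x), dport)

    def from_ipv4(self, src, sport, dst, dport):
        """Step 9: translate (X,x) -> (T,t) into (Pref64:X,x) -> (Y',y)."""
        key = self.v4_to_v6.get((dst, dport))
        if key is None:
            return None  # no mapping exists, so the packet cannot be translated
        prefix = int(self.pref64.network_address)
        pref_x = ipaddress.IPv6Address(prefix | int(ipaddress.IPv4Address(src)))
        return (str(pref_x), sport, key[0], key[1])


nat = Nat64Sketch("2001:db8:64::/96", "198.51.100.1", first_port=40000)
outbound = nat.from_ipv6("2001:db8:1::10", 5555, "2001:db8:64::c000:20f", 80)
reply = nat.from_ipv4("192.0.2.15", 80, "198.51.100.1", 40000)
```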

413	   It is important to note that the translation still works if the IPv6
414	   initiator H1 learns the IPv4 address through some scheme other than a
415	   DNS look-up.  This is because the DNS64 processing does NOT result in
416	   any state installed in the NAT64 box and because the mapping of the
417	   IPv4 address into an IPv6 address is the result of concatenating the
418	   prefix defined within the site for this purpose (called Pref64::/96
419	   in this document) to the original IPv4 address.

421	1.2.3.  Dual stack nodes

423	   Nodes that have both IPv6 and IPv4 connectivity and are configured
424	   with an address for a DNS64 as their resolving nameserver may receive
425	   responses containing synthetic AAAA resource records.  If the node
426	   prefers IPv6 over IPv4, using the addresses in the synthetic AAAA RRs
427	   means that the node will attempt to communicate through the NAT64
428	   mechanism first, and only fall back to native IPv4 connectivity if
429	   connecting through NAT64 fails (if the application tries the full set
430	   of destination addresses).  To avoid this, dual stack nodes can
431	   ignore all replies to DNS requests that contain the EDNS SAS option,
432	   and use the destination addresses found in the responses for A
433	   resource record requests instead.

435	1.2.4.  IPv6 nodes implementing DNSSEC

437	   Synthesizing resource records is incompatible with DNSSEC.  So like
438	   dual stack nodes, IPv6 nodes implementing DNSSEC must not use
439	   synthetic address records as indicated by the EDNS SAS option.  In
440	   this case, the node should perform the DNSSEC validation on the
441	   original A RR and then locally synthesize the AAAA RR.  This
442	   basically means that the DNS64 functionality should be implemented in
443	   the local host for those hosts that want to be able to perform DNSSEC
444	   validation.  In order to do that, hosts implementing DNS64
445	   functionality should be able to discover the Pref64::/96 prefix that
446	   is needed to synthesize AAAA RRs.  The means used to discover the
447	   prefix are out of the scope of this document.  So for the purposes of
448	   DNSSEC, the synthetic response doesn't exist: an IPv6 node
449	   implementing DNSSEC has to request the original A resource records
450	   and perform the normal DNSSEC validation steps.  When this is done,
451	   an IPv6 address is synthesized locally from the validated IPv4
452	   address and the translator's /96 prefix.

454	1.2.5.  Filtering

456	   A NAT64 box may do filtering, which means that it only allows a
457	   packet in through an interface if the appropriate permission exists.
458	   A NAT64 may do no filtering, or it may filter on its IPv4 interface.
459	   Filtering on the IPv6 interface is not supported, as mappings are
460	   only created by packets traveling in the IPv6 --> IPv4 direction.

462	   If a NAT64 filters on its IPv4 interface, then an incoming packet is
463	   dropped unless a packet has been recently sent out the interface with
464	   a destination IP address equal to the source IP address of the
465	   incoming packet.

467	   NAT64 filtering is consistent with the recommendations of RFC 4787.

469	2.  Terminology

471	   This section provides a definitive reference for all the terms used
472	   in this document.

474	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
475	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
476	   document are to be interpreted as described in RFC 2119 [RFC2119].

478	   The following terms are used in this document:

480	   DNS64:  A logical function that synthesizes AAAA records (containing
481	      IPv6 addresses) from A records (containing IPv4 addresses).

483	   Synthetic RR:  A DNS resource record (RR) that is not contained in
484	      any zone data file, but has been synthesized from other RRs.  An
485	      example is a synthetic AAAA record created from an A record.

487	   SAS Option:  An Extended DNS (EDNS) option used in DNS responses.
488	      Its primary purpose is to indicate that the set of AAAA RR
489	      contained in a DNS response are synthetic.

491	   NAT64:  A device that translates IPv6 packets to IPv4 packets and
492	      vice-versa, with the provision that the communication must be
493	      initiated from the IPv6 side.  The translation involves not only
494	      the IP header, but also the transport header (TCP or UDP).

496	   Session:  A TCP or UDP session.  In other words, the bi-directional
497	      flow of packets between two ports on two different hosts.  In
498	      NAT64, typically one host is an IPv4 host, and the other one is an
499	      IPv6 host.

501	   5-Tuple:  The tuple (source IP address, source port, destination IP
502	      address, destination port, transport protocol).  A 5-tuple
503	      uniquely identifies a session.  When a session flows through a
504	      NAT64, each session has two different 5-tuples: one with IPv4
505	      addresses and one with IPv6 addresses.

507	   Session table:  A table of sessions kept by a NAT64.  Each NAT64 has
508	      two session tables, one for TCP and one for UDP.

510	   Transport Address:  The combination of an IPv6 or IPv4 address and a
511	      port.  Typically written as (IP address, port); e.g. (192.0.2.15,
512	      8001).

514	   Mapping:  A mapping between an IPv6 transport address and an IPv4
515	      transport address.  Used to translate the addresses and ports of
516	      packets flowing between the IPv6 host and the IPv4 host.  In
517	      NAT64, the IPv4 transport address is always a transport address
518	      assigned to the NAT64 itself, while the IPv6 transport address
519	      belongs to some IPv6 host.

521	   BIB:  Binding Information Base.  A table of mappings kept by a NAT64.
522	      Each NAT64 has two BIBs, one for TCP and one for UDP.

524	   Endpoint-Independent Mapping:  In NAT64, using the same mapping for
525	      all sessions involving a given IPv6 transport address, whatever
526	      the IPv4 endpoint.  Endpoint-independent mapping is important for
527	      peer-to-peer communication.  See [RFC4787] for the definition of
528	      the different types of mappings in IPv4-to-IPv4 NATs.

530	   Hairpinning:  Having a packet do a "U-turn" inside a NAT and come
531	      back out the same interface as it arrived on.  Hairpinning support
532	      is important for peer-to-peer applications, as there are cases
533	      when two different hosts on the same side of a NAT can only
534	      communicate using sessions that hairpin through the NAT.

536	   For a detailed understanding of this document, the reader should
537	   also be familiar with DNS terminology [RFC1035] and current NAT
538	   terminology [RFC4787].

540	3.  Normative Specification

542	3.1.  Synthetic AAAA RRs

544	   A synthetic RR is an RR that does not appear in the master zone
545	   file.

547	   The rules on the usage of synthetic AAAA RRs are:

549	      Synthetic AAAA RRs MAY be included in the answer section of a
550	      response.

552	      Synthetic AAAA RRs MUST NOT be included in sections other than the
553	      answer section.

555	      A synthetic AAAA RR MUST NOT be included if the responder knows of
556	      at least one non-synthetic RR of the same type and class.

558	      If a synthetic AAAA RR is included in the answer section, then all
559	      RRs included in the answer section MUST be synthetic.

561	      If a synthetic AAAA RR is _not_ explicitly marked as synthetic
562	      (using the SAS option), then its TTL MUST be 0.

564	      If a synthetic AAAA RR is explicitly marked as synthetic (using
565	      the SAS option), then its TTL SHOULD be 0.
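
   The rules above can be sketched as a check on a candidate answer
   section.  This is a minimal illustration, not part of the
   specification: the RR class, its fields, and the function name are
   hypothetical, and the rule about sections other than the answer
   section is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RR:
    """Hypothetical, simplified resource record."""
    name: str
    rtype: str
    ttl: int
    synthetic: bool = False


def answer_section_ok(answer, sas_present, known_real_rr=False):
    """Check an answer section against the synthetic AAAA RR rules.

    known_real_rr: the responder knows a non-synthetic RR of the same
    type and class (in which case synthesis is not allowed).
    """
    synthetic = [rr for rr in answer if rr.synthetic]
    if not synthetic:
        return True  # nothing synthetic, the rules do not apply
    if len(synthetic) != len(answer):
        return False  # synthetic and real RRs must not be mixed
    if known_real_rr:
        return False  # a real RR exists, so do not synthesize
    if not sas_present and any(rr.ttl != 0 for rr in synthetic):
        return False  # unmarked synthetic RRs MUST have TTL 0
    # With the SAS option present, TTL 0 is only a SHOULD.
    return True
```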

567	   TBD: Can/should the AA bit be set in a response containing synthetic
568	   RRs?

570	   TBD: Do we always want synthetic RRs to have a TTL of 0?  Is it ever
571	   reasonable or desirable to cache them?

573	3.2.  The EDNS SAS option

575	   EDNS [RFC2671] defines a mechanism to add options to the DNS
576	   [RFC1035] protocol.  This section defines the SAS (Status of Answer
577	   Section) option that indicates the status (real or synthetic) of RRs
578	   in the answer section.

580	   The format of the SAS option is:
581	       0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
582	     +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
583	     |                          OPTION-CODE                          |
584	     +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
585	     |                         OPTION-LENGTH                         |
586	     +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
587	     |                                                               |
588	     /                          OPTION-DATA                          /
589	     |                                                               |
590	     +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

592	   The fields are defined as follows:

594	   o  OPTION-CODE: (to be allocated by IANA)

596	   o  OPTION-LENGTH: the size (in octets) of the OPTION-DATA part of the
597	      option

599	   o  OPTION-DATA: variable length field.  No values for this field are
600	      defined by this document.

602	   For any OPTION-DATA defined in the future, the maximum length of the
603	   OPTION-DATA field in the SAS option is 12 bytes, and any SAS option
604	   with an OPTION-LENGTH of more than 8 SHOULD be silently ignored.
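
   As an illustration of the wire format, the following sketch packs
   and unpacks the option as two 16-bit fields in network byte order
   followed by the data.  The option code has not been allocated, so
   SAS_OPTION_CODE is a placeholder chosen for this example only; the
   function names are likewise hypothetical, and the length limits
   discussed above are not enforced here.

```python
import struct

# Placeholder only: the real OPTION-CODE is to be allocated by IANA.
SAS_OPTION_CODE = 0xFDE9


def encode_sas_option(data=b""):
    """Return OPTION-CODE, OPTION-LENGTH and OPTION-DATA as bytes."""
    return struct.pack("!HH", SAS_OPTION_CODE, len(data)) + data


def decode_option(buf):
    """Parse one option and return (code, data)."""
    code, length = struct.unpack_from("!HH", buf, 0)
    data = buf[4:4 + length]
    if len(data) != length:
        raise ValueError("truncated option")
    return code, data
```

   Since this document defines no OPTION-DATA values, a SAS option
   built with encode_sas_option() carries an OPTION-LENGTH of zero.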

606	   The rules on the usage of the SAS option are:

608	      A requestor that understands the SAS option SHOULD include the OPT
609	      RR in all queries.

611	      A responder can include the SAS option in a response only if the
612	      OPT RR appeared in the corresponding query.

614	      Any options not understood or not meaningful in the current
615	      context MUST be ignored.

617	      A responder MUST include the SAS option in the response if it
618	      knows that all the RRs in the answer section are synthetic.

620	   The presence of the OPT RR in a query indicates that the requestor
621	   understands the OPT extension.

623	3.3.  DNS64

625	   A DNS64 is a logical function that synthesizes AAAA records from A
626	   records.  The DNS64 function may be implemented in a resolver, in a
627	   local recursive name server, or in some other device such as a NAT64.

629	   The only configuration parameter required by the DNS64 is the /96
630	   IPv6 prefix assigned to a NAT64.  This prefix is used to map IPv4
631	   addresses into IPv6 addresses, and is denoted Pref64::/96.  The DNS64
632	   learns this prefix through some means not specified here.

634	   When the DNS64 receives a query for RRs of type AAAA and class IN, it
635	   first attempts to retrieve non-synthetic RRs of this type and class
636	   (where "non-synthetic RRs" means RRs not explicitly marked as
637	   synthetic).  If this query results in one or more AAAA records or in
638	   an error condition, this result is returned to the client as per
639	   normal DNS semantics.  If the query is successful, but doesn't return
640	   any answers, the DNS64 resolver executes a recursive A RR lookup for
641	   the name in question.  If this query results in an empty result or in
642	   an error, this result is returned to the client.  If the query
643	   results in one or more A RRs, the DNS64 synthesizes AAAA RRs based on
644	   the A RRs and the /96 prefix of the translator.  The synthetic AAAA
645	   RRs get a TTL of 0 seconds.  The DNS64 resolver then returns the
646	   synthesized AAAA records to the client.  If the client included the
647	   EDNS0 OPT RR in the query, the DNS64 resolver MUST include an EDNS0
648	   OPT RR that contains the SAS option.  When synthesizing the answer to
649	   a query for ANY, the DNS64 MUST include the A records from which the
650	   AAAA records were synthesized.
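
   The decision flow described above might look roughly as follows.
   This is a sketch under stated assumptions: lookup() is a
   hypothetical resolver hook returning (ok, list of address strings),
   PREF64 is an example prefix taken from documentation address space,
   and TTL handling, the SAS option and ANY queries are left out.

```python
import ipaddress

# Example prefix only; a real DNS64 learns Pref64::/96 by other means.
PREF64 = ipaddress.IPv6Network("2001:db8:64::/96")


def dns64_answer(name, lookup):
    """Return (ok, addresses, synthetic) for an AAAA query."""
    ok, aaaa = lookup(name, "AAAA")
    if not ok or aaaa:
        # Error, or real AAAA records exist: pass the result through.
        return ok, aaaa, False
    ok, a = lookup(name, "A")
    if not ok or not a:
        # Error or empty result: pass that on to the client.
        return ok, [], False
    # Place each IPv4 address in the low 32 bits of the prefix.
    synth = [str(PREF64[int(ipaddress.IPv4Address(x))]) for x in a]
    return True, synth, True
```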

652	   To ensure endpoint-independent mapping behavior, a given IPv6 host
653	   must always use the same NAT64.  This, in turn, means that any
654	   synthetic AAAA records used by the host must always use the same
655	   prefix.  To ensure this, if a DNS64 has multiple Pref64::/96 prefixes
656	   configured, it SHOULD ensure that the same prefix is used for all
657	   AAAA records returned to a given host across all queries.  A
658	   reasonable exception would be when the DNS64 knows, through some
659	   unspecified means, that the NAT64 associated with a Pref64::/96
660	   prefix is no longer functional.

662	   Furthermore, it is highly desirable to synthesize the AAAA records as
663	   close as possible to the host that will use them.  This helps ensure
664	   that a given host always uses the same NAT64.

666	   The DNS64 MUST obey the rules for synthetic RRs (Section 3.1) and the
667	   SAS option (Section 3.2).

669	   A synthetic AAAA record is created from an A record as follows:

671	   o  The NAME field is set to the NAME field from the A record

673	   o  The TYPE field is set to 28 (AAAA)

675	   o  The CLASS field is set to 1 (IN)

677	   o  The TTL field is set as described in Section 3.1

679	   o  The RDLENGTH field is set to 16

681	   o  The RDATA field is set to the IPv6 address whose upper 96 bits are
682	      Pref64::/96 and whose lower 32 bits are the IPv4 address from the
683	      RDATA field of the A record.
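
   The field-by-field construction above can be illustrated with a
   short sketch.  The function name, the dictionary representation of
   an RR, and the default prefix (from documentation address space)
   are assumptions made for this example; the TTL is set to 0, which
   satisfies both TTL rules in Section 3.1.

```python
import ipaddress


def synthesize_aaaa(a_name, a_rdata, pref64="2001:db8:64::/96"):
    """Build a synthetic AAAA RR from the NAME and RDATA of an A RR."""
    net = ipaddress.IPv6Network(pref64)
    if net.prefixlen != 96:
        raise ValueError("Pref64 must be a /96 prefix")
    v4 = ipaddress.IPv4Address(a_rdata)
    # Upper 96 bits from the prefix, lower 32 bits from the IPv4 address.
    v6 = ipaddress.IPv6Address(int(net.network_address) | int(v4))
    return {
        "NAME": a_name,
        "TYPE": 28,      # AAAA
        "CLASS": 1,      # IN
        "TTL": 0,
        "RDLENGTH": 16,
        "RDATA": v6.packed,
    }
```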

685	   TBD: What does a DNS64 do when a query for an A record returns a
686	   CNAME record and an A record?  The SAS option, as currently defined,
687	   flags ALL records in the answer section as synthetic.  Does the DNS64
688	   return just a CNAME record?  Does it return just an AAAA record?  Or
689	   does it return a real CNAME record and a synthetic AAAA record in the
690	   answer section -- something that the current rules do not allow.

692	3.4.  NAT64

694	   A NAT64 is a device with one IPv6 interface and one IPv4 interface.
695	   The IPv6 interface MUST have a unicast /96 IPv6 prefix assigned to
696	   it, denoted Pref64::/96.  The IPv4 interface MUST have one or more
697	   unicast IPv4 addresses assigned to it.

699	   A NAT64 uses the following dynamic data structures:

701	   o  UDP BIB

703	   o  UDP Session Table

705	   o  TCP BIB

707	   o  TCP Session Table

709	   A NAT64 has two Binding Information Bases: one for TCP and one for
710	   UDP.  Each BIB entry specifies a mapping between an IPv6 transport
711	   address and an IPv4 transport address:

713	      (X',x) <--> (T,t)

715	   where X' is some IPv6 address, T is an IPv4 address, and x and t are
716	   ports.  T will always be one of the IPv4 addresses assigned to the
717	   IPv4 interface of the NAT64.  A given IPv6 or IPv4 transport address
718	   can appear in at most one entry in a BIB: for example, (2001:db8::17,
719	   4) can appear in at most one TCP and at most one UDP BIB entry.  TCP
720	   and UDP have separate BIBs because the port number spaces for TCP
721	   and UDP are distinct.
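
   One possible in-memory representation of a BIB is a pair of
   dictionaries, one per lookup direction, which makes the "at most
   one entry" constraint easy to enforce.  The class and method names
   below are hypothetical, and transport addresses are modelled as
   (address string, port) tuples.

```python
class BIB:
    """Sketch of a Binding Information Base for one transport protocol."""

    def __init__(self):
        self._by_v6 = {}  # (X', x) -> (T, t)
        self._by_v4 = {}  # (T, t) -> (X', x)

    def add(self, v6_ta, v4_ta):
        # A transport address may appear in at most one entry.
        if v6_ta in self._by_v6 or v4_ta in self._by_v4:
            raise ValueError("transport address already mapped")
        self._by_v6[v6_ta] = v4_ta
        self._by_v4[v4_ta] = v6_ta

    def remove(self, v6_ta):
        v4_ta = self._by_v6.pop(v6_ta)
        del self._by_v4[v4_ta]

    def v4_for(self, v6_ta):
        return self._by_v6.get(v6_ta)

    def v6_for(self, v4_ta):
        return self._by_v4.get(v4_ta)


# Separate BIBs, because TCP and UDP port spaces are distinct.
bibs = {"tcp": BIB(), "udp": BIB()}
```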

723	   A NAT64 also has two session tables: one for TCP sessions and one for
724	   UDP sessions.  Each entry keeps information on the state of the
725	   corresponding session: see Section 3.4.2.  The NAT64 uses the session
726	   state information to determine when the session is completed, and
727	   also uses session information for ingress filtering.  A session can
728	   be uniquely identified by either an incoming 5-tuple or an outgoing
729	   5-tuple.

731	   For each session, there is a corresponding BIB entry, uniquely
732	   specified by either the source IPv6 transport address (in the IPv6
733	   --> IPv4 direction) or the destination IPv4 transport address (in the
734	   IPv4 --> IPv6 direction).  However, a single BIB entry can have
735	   multiple corresponding sessions.  When the last corresponding session
736	   is deleted, the BIB entry is deleted.
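
   The relationship between sessions and BIB entries can be sketched
   as follows.  For brevity the BIB is modelled here as a plain
   dictionary from IPv6 transport address to IPv4 transport address,
   and a session is identified by the IPv6 transport address plus the
   remote IPv4 transport address; both choices, and all names, are
   assumptions of this example.

```python
class SessionTable:
    """Sketch: sessions grouped by the BIB entry they correspond to."""

    def __init__(self, bib):
        self.bib = bib       # dict: (X', x) -> (T, t)
        self.sessions = {}   # (X', x) -> set of remote (D, d)

    def add(self, v6_ta, remote_ta):
        self.sessions.setdefault(v6_ta, set()).add(remote_ta)

    def delete(self, v6_ta, remote_ta):
        peers = self.sessions.get(v6_ta, set())
        peers.discard(remote_ta)
        if not peers:
            # Last corresponding session gone: delete the BIB entry too.
            self.sessions.pop(v6_ta, None)
            self.bib.pop(v6_ta, None)
```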

738	   The processing of an incoming IP packet takes the following steps:

740	   1.  Determining the incoming 5-tuple

742	   2.  Filtering and updating session information

744	   3.  Computing the outgoing 5-tuple

746	   4.  Translating the packet

748	   5.  Handling hairpinning

750	   The details of these steps are specified in the following
751	   subsections.

753	   This breakdown of the NAT64 behavior into processing steps is done
754	   for ease of presentation.  A NAT64 MAY perform the steps in a
755	   different order, or MAY perform different steps, as long as the
756	   externally visible outcome is the same.

758	   TBD: Add support for ICMP Query packets.  (ICMP Error packets are
759	   handled).

761	3.4.1.  Determining the Incoming 5-tuple

763	   This step associates an incoming 5-tuple (source IP address, source
764	   port, destination IP address, destination port, transport protocol)
765	   with every incoming IP packet for use in subsequent steps.

767	   If the incoming IP packet contains a complete (un-fragmented) UDP or
768	   TCP protocol packet, then the 5-tuple is computed by extracting the
769	   appropriate fields from the packet.

771	   If the incoming IP packet contains a complete (un-fragmented) ICMP
772	   message, then the 5-tuple is computed by extracting the appropriate
773	   fields from the IP packet embedded inside the ICMP message.  However,
774	   the role of source and destination is swapped when doing this: the
775	   embedded source IP address becomes the destination IP address in the
776	   5-tuple, the embedded source port becomes the destination port in the
777	   5-tuple, etc.  If it is not possible to determine the 5-tuple
778	   (perhaps because not enough of the embedded packet is reproduced
779	   inside the ICMP message), then the incoming IP packet is silently
780	   discarded.

782	      NOTE: The transport protocol is always one of TCP or UDP, even if
783	      the IP packet contains an ICMP message.
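
   The swap of source and destination for ICMP messages can be
   illustrated as below.  The FiveTuple type and the function name are
   hypothetical; parsing of the embedded packet is assumed to have
   happened elsewhere, with None standing for an embedded packet that
   could not be parsed.

```python
from typing import NamedTuple, Optional


class FiveTuple(NamedTuple):
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int
    proto: str  # "tcp" or "udp", never "icmp"


def five_tuple_from_icmp(embedded: Optional[FiveTuple]) -> Optional[FiveTuple]:
    """Derive the incoming 5-tuple from the packet embedded in an ICMP message."""
    if embedded is None:
        return None  # not enough of the embedded packet: silently discard
    # The embedded packet travelled in the opposite direction, so swap roles.
    return FiveTuple(embedded.dst_addr, embedded.dst_port,
                     embedded.src_addr, embedded.src_port,
                     embedded.proto)
```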

785	   If the incoming IP packet contains a fragment, then more processing
786	   may be needed.  This specification leaves open the exact details of
787	   how a NAT64 handles incoming IP packets containing fragments, and
788	   simply requires that a NAT64 handle fragments arriving out-of-order.
789	   A NAT64 MAY elect to queue the fragments as they arrive and translate
790	   all fragments at the same time.  Alternatively, a NAT64 MAY translate
791	   the fragments as they arrive, by storing information that allows it
792	   to compute the 5-tuple for fragments other than the first.  In the
793	   latter case, the NAT64 will still need to handle the situation where
794	   subsequent fragments arrive before the first.

796	   Implementors of NAT64 should be aware that there are a number of
797	   well-known attacks against IP fragmentation; see [RFC1858] and
798	   [RFC3128].

800	   Assuming it otherwise has sufficient resources, a NAT64 MUST allow
801	   the fragments to arrive over a time interval of at least 10 seconds.
802	   A NAT64 MAY require that the UDP, TCP, or ICMP header be completely
803	   contained within the first fragment.

805	3.4.2.  Filtering and Updating Session Information

807	   This step updates the per-session information stored in the
808	   appropriate session table.  This affects the lifetime of the session,
809	   which in turn affects the lifetime of the corresponding BIB entry.
810	   This step may also filter incoming packets, if desired.

812	   The details of this step depend on the transport protocol (UDP or
813	   TCP).

815	3.4.2.1.  UDP Session Handling

817	   The state information stored for a UDP session is a timer that tracks
818	   the remaining lifetime of the UDP session.  The NAT64 decrements this
819	   timer at regular intervals.  When the timer expires, the UDP session
820	   is deleted.

822	   The incoming packet is processed as follows:

824	   1.  If the packet arrived on the IPv4 interface and the NAT64 filters
825	       on its IPv4 interface, then the NAT64 checks to see if the
826	       incoming packet is allowed according to the address-dependent
827	       filtering rule.  To do this, it searches for a session table
828	       entry with a source IPv4 address equal to the source IPv4 address
829	       in the incoming 5-tuple.  If such an entry is found (there may be
830	       more than one), packet processing continues.  Otherwise, the
831	       packet is discarded.  If the packet is discarded, then an ICMP
832	       message SHOULD be sent to the original sender of the packet,
833	       unless the discarded packet is itself an ICMP message.  The ICMP
834	       message, if sent, has a type of 3 (Destination Unreachable) and a
835	       code of 13 (Communication Administratively Prohibited).

837	   2.  The NAT64 searches for the session table entry corresponding to
838	       the incoming 5-tuple.  If no such entry is found, a new entry is
839	       created.

841	   3.  The NAT64 sets or resets the timer in the session table entry to
842	       maximum session lifetime.  By default, the maximum session
843	       lifetime is 5 minutes, but for specific destination ports in the
844	       Well-Known port range (0..1023), the NAT64 MAY use a smaller
845	       maximum lifetime.
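
   The three steps above can be sketched as follows.  This is a
   simplified illustration: a session is identified by an opaque key
   supplied by the caller (a real table would be indexed by both of the
   session's 5-tuples), the remote IPv4 address is passed in
   explicitly, and the class and parameter names are hypothetical.
   Per-port lifetimes and the sending of ICMP messages are left out.

```python
import time

UDP_MAX_LIFETIME = 300  # seconds; the default of 5 minutes


class UdpSessionTable:
    def __init__(self, filter_ipv4=True):
        self.filter_ipv4 = filter_ipv4
        self.sessions = {}  # key -> (remote IPv4 address, expiry time)

    def expire(self, now):
        """Delete sessions whose timer has run out."""
        self.sessions = {k: v for k, v in self.sessions.items()
                         if v[1] > now}

    def process(self, key, remote_v4, from_ipv4, now=None):
        """Return True to continue processing, False to discard."""
        now = time.monotonic() if now is None else now
        self.expire(now)
        # Step 1: address-dependent filtering on the IPv4 interface.
        if from_ipv4 and self.filter_ipv4:
            known = any(addr == remote_v4
                        for addr, _ in self.sessions.values())
            if not known:
                # Caller SHOULD send ICMP type 3, code 13, unless the
                # discarded packet is itself an ICMP message.
                return False
        # Steps 2 and 3: find or create the entry, (re)set its timer.
        self.sessions[key] = (remote_v4, now + UDP_MAX_LIFETIME)
        return True
```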

847	3.4.2.2.  TCP Session Handling

849	   TBD: Describe the state machine required to track the state of the
850	   TCP session.  This is a simplified version of the state machine used
851	   by the endpoints.

853	3.4.3.  Computing the Outgoing 5-Tuple

855	   This step computes the outgoing 5-tuple by translating the addresses
856	   and ports in the incoming 5-tuple.  The transport protocol in the
857	   outgoing 5-tuple is always the same as that in the incoming 5-tuple.

859	   In the text below, a reference to "the BIB" means either the TCP
860	   BIB or the UDP BIB as appropriate, as determined by the transport
861	   protocol in the 5-tuple.

863	      NOTE: Not all addresses are translated using the BIB.  BIB entries
864	      are used to translate IPv6 source transport addresses to IPv4
865	      source transport addresses, and IPv4 destination transport
866	      addresses to IPv6 destination transport addresses.  They are NOT
867	      used to translate IPv6 destination transport addresses to IPv4
868	      destination transport addresses, nor to translate IPv4 source
869	      transport addresses to IPv6 source transport addresses.  The
870	      latter cases are handled by adding or removing the /96 prefix.
871	      This distinction is important; without it, hairpinning doesn't
872	      work correctly.

874	   When translating in the IPv6 --> IPv4 direction, let the incoming
875	   source and destination transport addresses in the 5-tuple be (S',s)
876	   and (D',d) respectively.  The outgoing source transport address is
877	   computed as follows:

879	      If the BIB contains an entry (S',s) <--> (T,t), then the outgoing
880	      source transport address is (T,t).

882	      Otherwise, create a new BIB entry (S',s) <--> (T,t) as described
883	      below.  The outgoing source transport address is (T,t).

885	   The outgoing destination address is computed as follows:

887	      If D' is composed of the NAT64's prefix followed by an IPv4
888	      address D, then the outgoing destination transport address is
889	      (D,d).

891	      Otherwise, discard the packet.

893	   When translating in the IPv4 --> IPv6 direction, let the incoming
894	   source and destination transport addresses in the 5-tuple be (S,s)
895	   and (D,d) respectively.  The outgoing source transport address is
896	   computed as follows:

898	      The outgoing source transport address is (Pref64::S,s).

900	   The outgoing destination transport address is computed as follows:

902	      If the BIB contains an entry (X',x) <--> (D,d), then the outgoing
903	      destination transport address is (X',x).

905	      Otherwise, discard the packet.
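
   The address translation rules for both directions can be sketched
   as follows.  The sketch assumes a BIB modelled as a dictionary from
   IPv6 transport address to IPv4 transport address, transport
   addresses written as (normalized address string, port) tuples, an
   example Pref64::/96 from documentation address space, and a
   hypothetical allocate() callback that stands in for the creation of
   a new BIB entry.  A return value of None means "discard the packet".

```python
import ipaddress

PREF64 = ipaddress.IPv6Network("2001:db8:64::/96")  # example prefix


def v6_to_v4(src, dst, bib, allocate):
    """IPv6 --> IPv4: return (outgoing source, outgoing destination)."""
    out_src = bib.get(src)
    if out_src is None:
        out_src = allocate(src)  # hypothetical allocation hook
        if out_src is None:
            return None
        bib[src] = out_src
    d6 = ipaddress.IPv6Address(dst[0])
    if d6 not in PREF64:
        return None
    # Destination: remove the /96 prefix, keep the low 32 bits.
    d4 = ipaddress.IPv4Address(int(d6) & 0xFFFFFFFF)
    return out_src, (str(d4), dst[1])


def v4_to_v6(src, dst, bib):
    """IPv4 --> IPv6: return (outgoing source, outgoing destination)."""
    reverse = {v4: v6 for v6, v4 in bib.items()}
    if dst not in reverse:
        return None
    # Source: add the /96 prefix; the BIB is not consulted.
    s6 = PREF64[int(ipaddress.IPv4Address(src[0]))]
    return (str(s6), src[1]), reverse[dst]
```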

907	   If the rules specify that a new BIB entry is created for a source
908	   transport address of (S',s), then the NAT64 allocates an IPv4
909	   transport address for this BIB entry as follows:

911	      If there exists some other BIB entry containing S' as the IPv6
912	      address and mapping it to some IPv4 address T, then use T as the
913	      IPv4 address.  Otherwise, use any IPv4 address assigned to the
914	      IPv4 interface.

916	      If the port s is in the Well-Known port range 0..1023, then
917	      allocate a port t from this same range.  Otherwise, if the port s
918	      is in the range 1024..65535, then allocate a port t from this
919	      range.  Furthermore, if port s is even, then t must be even, and
920	      if port s is odd, then t must be odd.

922	      In all cases, the allocated IPv4 transport address (T,t) MUST NOT
923	      be in use in another entry in the same BIB, but MAY be in use in
924	      the other BIB.

926	   If it is not possible to allocate an appropriate IPv4 transport
927	   address or create a BIB entry for some reason, then the packet is
928	   discarded.
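
   The allocation rules above could be implemented along the following
   lines.  The function name, the dictionary model of the BIB, and the
   pool argument are assumptions of this sketch.  Trying the original
   port number first is a choice made here for illustration; the rules
   only constrain the port range and parity.

```python
def allocate_v4_transport_address(v6_ta, bib, pool):
    """Pick (T, t) for a new BIB entry, or None if none is available.

    bib:  dict (X', x) -> (T, t) for one transport protocol
    pool: IPv4 addresses assigned to the NAT64's IPv4 interface
    """
    s_addr, s_port = v6_ta
    # Reuse the IPv4 address already mapped to S', if there is one.
    existing = [t_addr for (x_addr, _), (t_addr, _) in bib.items()
                if x_addr == s_addr]
    candidates = existing[:1] or list(pool)
    # Stay within the same port range as s ...
    lo, hi = (0, 1023) if s_port <= 1023 else (1024, 65535)
    # ... and preserve parity (both range starts are even).
    start = lo + (s_port % 2)
    ports = [s_port] + [p for p in range(start, hi + 1, 2) if p != s_port]
    in_use = set(bib.values())
    for t_addr in candidates:
        for t_port in ports:
            if (t_addr, t_port) not in in_use:
                return (t_addr, t_port)
    return None  # allocation failed: the packet is discarded
```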

930	   TBD: Do we delete the session entry if we cannot create a BIB entry?

932	   If the rules specify that the packet is discarded, then the NAT64
933	   SHOULD send an ICMP reply to the original sender, unless the packet
934	   being translated contains an ICMP message.  The type should be 3
935	   (Destination Unreachable) and the code should be 0 (Network
936	   Unreachable in IPv4, and No Route to Destination in IPv6).

938	3.4.4.  Translating the Packet

940	   This step translates the packet from IPv6 to IPv4 or vice versa.

942	   The translation of the packet is as specified in section 3 and
943	   section 4 of SIIT [RFC2765], with the following modifications:

945	   o  When translating an IP header (sections 3.1 and 4.1), the source
946	      and destination IP address fields are set to the source and
947	      destination IP addresses from the outgoing 5-tuple.

949	   o  When the protocol following the IP header is TCP or UDP, then the
950	      source and destination ports are modified to the source and
951	      destination ports from the 5-tuple.  In addition, the TCP or UDP
952	      checksum must also be updated to reflect the translated addresses
953	      and ports; note that the TCP and UDP checksum covers the pseudo-
954	      header which contains the source and destination IP addresses.  An
955	      algorithm for efficiently updating these checksums is described in
956	      [RFC3022].

958	   o  When the protocol following the IP header is ICMP (sections 3.4
959	      and 4.4) the source and destination transport addresses in the
960	      embedded packet are set to the destination and source transport
961	      addresses from the outgoing 5-tuple (note the swap of source and
962	      destination).
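
   As an illustration of the checksum adjustment mentioned above, the
   following sketch updates a 16-bit one's-complement checksum
   incrementally when some 16-bit words covered by it are replaced.  It
   is a generic formulation of the technique rather than a transcription
   of the algorithm in [RFC3022]; the function names are hypothetical.
   The number of old and new words may differ, as happens when an IPv4
   address in the pseudo-header is replaced by an IPv6 address.

```python
def ones_complement_add(a, b):
    """Add two 16-bit values with end-around carry."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def checksum(words):
    """Full one's-complement checksum over a list of 16-bit words."""
    acc = 0
    for w in words:
        acc = ones_complement_add(acc, w)
    return ~acc & 0xFFFF


def update_checksum(old_checksum, old_words, new_words):
    """Adjust a checksum after old_words were replaced by new_words."""
    acc = ~old_checksum & 0xFFFF
    for w in old_words:
        acc = ones_complement_add(acc, ~w & 0xFFFF)  # subtract old word
    for w in new_words:
        acc = ones_complement_add(acc, w)            # add new word
    return ~acc & 0xFFFF
```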

964	3.4.5.  Handling Hairpinning

966	   This step handles hairpinning if necessary.

968	   If the destination IP address is an address assigned to the NAT64
969	   itself (i.e., is one of the IPv4 addresses assigned to the IPv4
970	   interface, or is covered by the /96 prefix assigned to the IPv6
971	   interface), then the packet is a hairpin packet.  The outgoing
972	   5-tuple becomes the incoming 5-tuple, and the packet is treated as if
973	   it was received on the outgoing interface.  Processing of the packet
974	   continues at step 2.
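
   The hairpin test itself is a simple membership check, sketched
   below.  The function name and parameters are hypothetical; the
   addresses in the usage note come from documentation address space.

```python
import ipaddress


def is_hairpin(dst_addr, nat_v4_addrs, pref64):
    """True if the outgoing destination address belongs to the NAT64."""
    addr = ipaddress.ip_address(dst_addr)
    if addr.version == 4:
        # One of the IPv4 addresses assigned to the IPv4 interface?
        return addr in {ipaddress.IPv4Address(a) for a in nat_v4_addrs}
    # Covered by the /96 prefix assigned to the IPv6 interface?
    return addr in ipaddress.IPv6Network(pref64)
```

   For example, with nat_v4_addrs = ["203.0.113.1"] and pref64 =
   "2001:db8:64::/96", a packet whose outgoing destination is
   203.0.113.1 would be fed back into processing at step 2.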

976	   TBD: Is there such a thing as a hairpin loop (likely not naturally,
977	   but perhaps through a specially crafted attack packet with a spoofed
978	   source address)?  If so, need to drop packets that hairpin more than
979	   once.

981	3.5.  FTP ALG

983	   TBD: Describe the FTP ALG, a mechanism for translating the embedded
984	   IP addresses inside FTP commands, that enables FTP sessions to pass
985	   through NAT64.

987	4.  Application scenarios

989	   In this section, we describe how to apply NAT64/DNS64 to the suitable
990	   scenarios described in draft-arkko-townsley-coexistence.

992	4.1.  Enterprise IPv6 only network

994	   The Enterprise IPv6 only network has IPv6 hosts (those that are
995	   currently available) and, for various reasons including
996	   operational simplicity, wants to run those hosts in IPv6 only mode,
997	   while still providing access to the IPv4 Internet.  The scenario is
998	   depicted in the picture below.

1000	                                +----+                  +-------------+
1001	                                |    +------------------+IPv6 Internet+
1002	                                |    |                  +-------------+
1003	      IPv6 host-----------------+ GW |
1004	                                |    |                  +-------------+
1005	                                |    +------------------+IPv4 Internet+
1006	                                +----+                  +-------------+

1008	      |-------------------------public v6-----------------------------|
1009	      |-------public v6---------|NAT|----------public v4--------------|

1011	   The proposed NAT64/DNS64 is perfectly suitable for this particular
1012	   scenario.  The deployment of the NAT64/DNS64 would be as follows: The
1013	   NAT64 function should be located in the GW device that connects the
1014	   IPv6 site to the IPv4 Internet.  The DNS64 functionality can be
1015	   placed in different places.  Probably the best trade-off between
1016	   architectural cleanness and deployment simplicity would be to place
1017	   it in the local recursive DNS server of the enterprise site.  The
1018	   option that is easiest to deploy would be to co-locate it with the
1019	   NAT64 box.  The cleanest option would be to include it in the local
1020	   resolver of the IPv6 hosts, but this option seems the hardest to
1021	   deploy because it implies changes to the hosts.

1023	   The proposed NAT64/DNS64 approach satisfies the requirements of this
1024	   scenario, in particular because it doesn't require any changes to
1025	   current IPv6 hosts in the site to obtain basic functionality.

1027	4.2.  Reaching servers in private IPv4 space

1029	   This scenario covers servers that use private IPv4 addresses and
1030	   need to be reached from the IPv6 Internet.  It includes the cases
1031	   where, for whatever reason, the servers cannot be upgraded to IPv6
1032	   and do not have public IPv4 addresses, yet it would be useful to
1033	   allow IPv6 nodes in the IPv6 Internet to reach them.  This scenario
1034	   is depicted in the figure below.

1036	                                     +----+
1037	   IPv6 Host(s)-------(Internet)-----+ GW +------Private IPv4 Servers
1038	                                     +----+

1040	   |---------public v6---------------|NAT|------private v4----------|

1042	   This scenario can again be perfectly served by the NAT64 approach.
1043	   In this case the NAT64 functionality is placed in the GW device
1044	   connecting the IPv6 Internet to the server's site.  In this case, the
1045	   DNS64 functionality is not needed.  Since the server's site is
1046	   running the NAT64 and the servers, it can publish in its own DNS
1047	   server the AAAA RRs corresponding to the servers, i.e., AAAA RRs
1048	   associating the FQDN of each server with Pref64::ServerIPv4Addr.
1049	   In this case, there is no need to synthesize AAAA RRs because the
1050	   site can configure them in the DNS itself.

1052	   Again, this scenario is satisfied by the NAT64 since it supports the
1053	   required functionality without requiring changes in either the IPv4
1054	   servers or the IPv6 clients.

1056	5.  Discussion

1058	5.1.  About the Prefix used to map the IPv4 address space into IPv6

1060	   In the NAT64 approach, we need to represent the IPv4 addresses in the
1061	   IPv6 Internet.  Since there is enough address space in IPv6, we can
1062	   easily embed the IPv4 address into an IPv6 address, so that the IPv4
1063	   address information can be extracted from the IPv6 address without
1064	   requiring additional state.  One way to do that is to use an IPv6
1065	   prefix Pref64::/96 and juxtapose the IPv4 address at the end (there
1066	   are other ways of doing it, but we are not discussing the different
1067	   formats here).  In this document the Pref64::/96 prefix is extracted
1068	   from the address block assigned to the site running the NAT64 box.
1069	   However, one could envision the usage of other prefixes for that
1070	   function.  In particular, it would be possible to define a well-known
1071	   prefix that can be used by the NAT64 devices to map IPv4 (public)
1072	   addresses into IPv6 addresses, irrespective of the address space of
1073	   the site where the NAT64 is located.  In this section, we discuss the
1074	   pros and cons of the different options.

1076	   The different options for Pref64::/96 are the following:

1078	      Local: A locally assigned prefix out of the address block of the
1079	      site running the NAT64 box

1081	      Well-known: A well-known prefix that is reserved for this purpose.
1082	      We have the following different options:

1084	         IPv4 mapped prefix

1086	         IPv4 compatible prefix

1088	         A new prefix assigned by IANA for this purpose

1090	   The reasons why using a well-known prefix is attractive are the
1091	   following: Having a global well-known prefix would make it possible
1092	   to identify which addresses are "real" IPv6 addresses with native
1093	   connectivity and which addresses are IPv6 addresses that represent
1094	   an IPv4 address.  From an architectural perspective, it seems right
1095	   to make this visible, since hosts and applications could react
1096	   accordingly and avoid or prefer such type of connectivity if needed.
1097	   From the DNS64 perspective, using the well-known prefix would imply
1098	   that the same synthetic AAAA RR will be created throughout the IPv6
1099	   Internet, which would result in a consistent view of the RR
1100	   irrespective of the location in the topology.  From a more
1101	   practical perspective, having a well-known prefix would make it
1102	   possible to completely decouple the DNS64 from the NAT64, since the
1103	   DNS64 would always use the well-known prefix to create the synthetic
1104	   AAAA RR and there would be no need to configure the same Pref64::/96
1105	   in both the DNS64 and the NAT64 that work together.

1107	   Among the different options available for the well-known prefix, the
1108	   option of using a pre-existing prefix such as the IPv4-mapped or
1109	   IPv4-compatible prefix has the advantage that it would potentially
1110	   allow the default selection of native connectivity over translated
1111	   connectivity for legacy hosts in communications involving dual-stack
1112	   hosts.  This is because the current RFC3484 default policy table
1113	   includes entries for the IPv4-mapped prefix and the IPv4-compatible
1114	   prefix, implying that native IPv6 prefixes will be preferred over
1115	   these.  However, current implementations do not use the IPv4-mapped
1116	   prefix on the wire, defeating the purpose of supporting unmodified
1117	   hosts.  The IPv4-compatible prefix is used by hosts on the wire, but
1118	   has a higher priority than the IPv4-mapped prefix, which implies
1119	   that current hosts would prefer translated connectivity over native
1120	   IPv4 connectivity (represented by the IPv4-mapped prefix in the
1121	   default policy table).  Neither of the prefixes present in the
1122	   default policy table would result in legacy hosts preferring native
1123	   connectivity over translated connectivity, so there doesn't seem to
1124	   be a compelling reason to re-use either the IPv4-mapped or the
1125	   IPv4-compatible prefix for this.  We therefore conclude that, among
1126	   the well-known prefix options, the preferred option would be to ask
1127	   IANA to allocate a new prefix for this purpose.

1129	   However, there are several issues when considering using the
1130	   well-known prefix option, namely:

1132	      The well-known prefix is suitable only for mapping IPv4 public
1133	      addresses into IPv6.  IPv4 public addresses can be mapped using
1134	      the same prefix because they are globally unique.  However, the
1135	      well-known prefix is not suitable for mapping IPv4 private
1136	      addresses.  This is so because we cannot rely on the uniqueness
1137	      of the IPv4 address to achieve uniqueness of the IPv6 address, so
1138	      we need to use a different IPv6 prefix to disambiguate the
1139	      different private IPv4 address realms.  As described above,
1140	      there is a clear use case for mapping IPv4 private addresses.
1141	      To support it, we will need to use IPv6 local prefixes, at least
1142	      for IPv4 private addresses.  In that case, the architectural goal
1143	      of distinguishing the "real" IPv6 addresses from the IPv6
1144	      addresses that represent IPv4 addresses can no longer be
1145	      achieved in a general manner, making this option less
1146	      attractive.

1148	      The usage of a single well-known prefix to map IPv4 addresses,
1149	      irrespective of the NAT64 used, may result in failure modes in
1150	      sites that have more than one NAT64 device.  The main problem is
1151	      that intra-site routing fluctuations that cause the packets of an
1152	      ongoing communication to flow through a different NAT64 box than
1153	      the one they were initially using (e.g. a change in an ECMP load
1154	      balancer) would break ongoing communications.  This is so because
1155	      the different NAT64 boxes will use a different IPv4 address, so
1156	      the IPv4 peer of the communications will receive packets coming
1157	      from a different IPv4 address.  This is avoided using a local
1158	      prefix, since each NAT64 box can have a different Pref64::/96
1159	      associated, so routing fluctuations would not result in using a
1160	      different NAT64 box.

1162	      The usage of a well-known prefix is also problematic in the case
1163	      that different routing domains want to exchange routing
1164	      information involving these routes.  Consider the case of an IPv6
1165	      site that has multiple providers and that each of these providers
1166	      provides access to the IPv4 Internet using the well-known prefix.
1167	      Consider the hypothetical case that different parts of the IPv4
1168	      Internet are reachable through different IPv6 ISPs (yes, this
1169	      means that in a futuristic scenario, the IPv4 Internet is
1170	      partitioned).  In order to reach the different parts through the
1171	      different ISPs, more specific routes representing the different
1172	      IPv4 destinations reachable need to be injected in the IPv6 sites.
1173	      This basically means that such configuration would imply to import
1174	      the IPv4 routing entropy into the IPv6 routing system.  If
1175	      different local prefixes are used, then each ISP only announces
1176	      its own local prefix, and then the burden of defining which IPv4
1177	      destination is reachable through which ISP is placed somewhere
1178	      else (e.g. in the DNS64).
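
As an illustration of the local-prefix option discussed above, the
following Python sketch embeds an IPv4 address in the low 32 bits of a
Pref64::/96 and selects the NAT64 box from the prefix alone, so the
choice of box does not depend on intra-site routing.  The prefixes
(taken from the 2001:db8::/32 documentation range), the box names and
the helper names are assumptions made for this example; they are not
defined by this document.

```python
import ipaddress

# Hypothetical per-box prefixes; one Pref64::/96 per NAT64 device.
NAT64_PREFIXES = {
    "nat64-a": ipaddress.IPv6Network("2001:db8:a::/96"),
    "nat64-b": ipaddress.IPv6Network("2001:db8:b::/96"),
}


def synthesize(pref64, ipv4):
    """Embed an IPv4 address in the low 32 bits of a /96 prefix."""
    if pref64.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(pref64.network_address) | int(v4))


def box_for(dst):
    """Return the name of the NAT64 box whose prefix covers dst."""
    for name, prefix in NAT64_PREFIXES.items():
        if dst in prefix:
            return name
    return None  # not a synthesized address known to this site


addr = synthesize(NAT64_PREFIXES["nat64-a"], "192.0.2.33")
print(addr, box_for(addr))
```

With a single well-known prefix, both boxes would cover the same
destination and the lookup could not distinguish between them.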

1180	6.  Security Considerations

1182	   Implications on end-to-end security, IPsec and TLS.

1184	   Any protocol that protects IP header information is essentially
1185	   incompatible with NAT64.  This implies that end-to-end IPsec
1186	   verification will fail when AH is used (both transport and tunnel
1187	   mode) and when ESP is used in transport mode.  This is inherent to
1188	   any network layer translation mechanism.  End-to-end IPsec protection
1189	   can be restored using UDP encapsulation as described in [RFC2765].

1191	   TBD: TLS implications

1193	   Implications on DNS security and DNSSec.

1195	   NAT64 uses synthetic DNS RRs to enable IPv6 clients to initiate
1196	   communications with IPv4 servers using the DNS.  This essentially
1197	   means that the DNS64 component generates synthetic AAAA RRs that are
1198	   not contained in the master zone file.  From a DNSSec perspective,
1199	   this means that direct DNSSec verification of such RRs would
1200	   fail.  However, it is possible to restore DNSSec functionality if the
1201	   verification is performed right before the DNS64 processing, directly
1202	   using the original A RR of the IPv4 server.  So, in order to jointly
1203	   use the NAT64 approach described in this specification and DNSSec
1204	   validation, the DNS64 functionality should be performed in the
1205	   resolver of the IPv6 client.  In this case, the IPv6 client would
1206	   receive the original A RR with DNSSec information and it would first
1207	   perform the DNSSec validation.  If it is successful, it would then
1208	   proceed to synthesize the AAAA RR according to the mechanism
1209	   described in this document.  It should be noted that the synthetic
1210	   AAAA RR would stay within the IPv6 client and it would not leak
1211	   outside, making further DNSSec validations unnecessary.
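
The ordering described above (validate the original A RRset first,
synthesize afterwards) can be sketched as follows.  This is a minimal
illustration under stated assumptions: the prefix is a hypothetical
Pref64::/96 from the documentation range, and `validate` is a
placeholder callback standing in for real DNSSec validation, which is
not implemented here.

```python
import ipaddress

# Hypothetical Pref64::/96 configured in the client-side DNS64 function.
PREF64 = ipaddress.IPv6Network("2001:db8:64::/96")


def resolve_aaaa(a_records, validate):
    """Validate the original A RRset, then synthesize AAAA addresses.

    `validate` is a placeholder for DNSSec validation of the A RRset;
    it takes the list of A records and returns True on success.
    """
    if not validate(a_records):
        return []  # never synthesize from data that failed validation
    base = int(PREF64.network_address)
    return [
        ipaddress.IPv6Address(base | int(ipaddress.IPv4Address(a)))
        for a in a_records
    ]


# Stub validators, used only to exercise both paths.
print(resolve_aaaa(["192.0.2.1"], lambda rrset: True))
print(resolve_aaaa(["192.0.2.1"], lambda rrset: False))
```

The synthesized addresses are only returned to the local caller, which
matches the observation that the synthetic AAAA RR does not leave the
IPv6 client.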

1213	   Filtering.

1215	   NAT64 creates binding state using packets flowing from the IPv6 side
1216	   to the IPv4 side.  So, NAT64 implements by definition, at least,
1217	   endpoint independent filtering, meaning that in order to enable any
1218	   packet to flow from the IPv4 side to the IPv6 side, there must have
1219	   been a packet flowing from the IPv6 side to the IPv4 side that created
1220	   the binding information to be used for packets in the other
1221	   direction.  Endpoint independent filtering means that once a binding
1222	   is created, it can be used by any node on the IPv4 side to send
1223	   packets to the IPv6 transport address that created the binding.  This
1224	   basically means that as long as the IPv6 node does not open a hole in
1225	   the NAT64, incoming communications are blocked, and that once the
1226	   IPv6 node has sent the first packet, this packet opens the door for
1227	   any node on the IPv4 side to send packets to that IPv6 transport
1228	   address.  It is possible to configure the NAT64 to implement a more
1229	   stringent security policy, if endpoint independent filtering is
1230	   considered not secure enough.  In particular, if the security policy
1231	   of the NAT64 requires it, it is possible to configure the NAT64 to
1232	   perform address dependent filtering.  This basically means that the
1233	   binding state created can only be used to send packets from the
1234	   IPv4 address to which the original packet that created the binding
1235	   was sent.  In other words, the door is open only for
1236	   that IPv4 address to send packets to the IPv6 transport address.
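
The difference between the two filtering behaviours can be shown with
a toy model.  The class and method names below are assumptions made
for this sketch.  For brevity, the model keys bindings directly on the
IPv6 transport address and omits the mapping to an IPv4 transport
address, binding timers and port allocation.

```python
class Nat64Filter:
    """Toy model of NAT64 inbound filtering (no timers, no port pool)."""

    def __init__(self, address_dependent=False):
        self.address_dependent = address_dependent
        # IPv6 transport address -> set of IPv4 addresses it has contacted
        self.bindings = {}

    def outbound(self, v6_src, v4_dst_addr):
        """An IPv6-to-IPv4 packet creates or refreshes the binding."""
        self.bindings.setdefault(v6_src, set()).add(v4_dst_addr)

    def inbound_allowed(self, v4_src_addr, v6_dst):
        """Decide whether an IPv4-to-IPv6 packet may pass."""
        contacted = self.bindings.get(v6_dst)
        if contacted is None:
            return False  # no binding: the IPv6 node never opened a hole
        if self.address_dependent:
            return v4_src_addr in contacted
        return True  # endpoint independent: any IPv4 node may use it


host = ("2001:db8::1", 5000)  # hypothetical IPv6 transport address

eif = Nat64Filter()
eif.outbound(host, "192.0.2.1")
print(eif.inbound_allowed("198.51.100.7", host))

adf = Nat64Filter(address_dependent=True)
adf.outbound(host, "192.0.2.1")
print(adf.inbound_allowed("198.51.100.7", host))
```

In this model the first check passes because any IPv4 node may use an
existing binding, while the second is refused because 198.51.100.7 was
never contacted by the IPv6 node.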

1238	   Attacks to NAT64.

1240	   The NAT64 device itself is a potential victim of different types of
1241	   attacks.  In particular, the NAT64 can be a victim of DoS attacks.
1242	   The NAT64 box has a limited number of resources that can be consumed
1243	   by attackers creating a DoS attack.  The NAT64 has a limited number
1244	   of IPv4 addresses that it uses to create the bindings.  Even though
1245	   the NAT64 performs address and port translation, it is possible for
1246	   an attacker to consume all the IPv4 transport addresses by sending
1247	   IPv6 packets with different source IPv6 transport addresses.  It
1248	   should be noted that this attack can only be launched from the IPv6
1249	   side, since IPv4 packets are not used to create binding state.  DoS
1250	   attacks can also affect other limited resources available in the
1251	   NAT64, such as memory or link capacity.  For instance, if the NAT64
1252	   implements reassembly of fragmented packets, it is possible for an
1253	   attacker to launch a DoS attack against the memory of the NAT64
1254	   device by sending fragments that the NAT64 will store for a given
1255	   period.  If the number of fragments is high enough, the memory of
1256	   the NAT64 could be exhausted.  NAT64 devices should implement proper
1257	   protection against such attacks, for instance allocating a limited
1258	   amount of memory for fragmented packet storage.
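
One possible shape for the memory limit suggested above is a fragment
store with a hard cap on the total number of bytes it holds.  The
class name, the key layout and the drop-on-overflow policy are
assumptions made for this sketch; a real device would also expire
fragments after a timeout.

```python
class FragmentStore:
    """Fragment buffer with a hard cap on total stored bytes."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.fragments = {}  # (src, dst, ident) -> list of payloads

    def add(self, key, payload):
        """Store a fragment; return False if the cap would be exceeded."""
        if self.used + len(payload) > self.max_bytes:
            return False  # drop instead of growing without bound
        self.fragments.setdefault(key, []).append(payload)
        self.used += len(payload)
        return True

    def release(self, key):
        """Free a packet's fragments after reassembly or timeout."""
        freed = self.fragments.pop(key, [])
        self.used -= sum(len(p) for p in freed)
        return freed


store = FragmentStore(max_bytes=1000)
first = ("2001:db8::1", "2001:db8:64::c000:201", 1)
second = ("2001:db8::2", "2001:db8:64::c000:201", 7)
print(store.add(first, b"x" * 600))
print(store.add(second, b"x" * 600))
```

Dropping new fragments once the cap is reached bounds the memory an
attacker can consume, at the cost of also dropping legitimate
fragments while the store is full.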

1260	7.  IANA Considerations

1262	   The IANA is requested to assign an EDNS Option Code value for the SAS
1263	   option.

1265	   TBD: Set up an IANA registry for SAS flags??

1267	8.  Changes from Previous Draft Versions

1269	   Note to RFC Editor: Please remove this section prior to publication
1270	   of this document as an RFC.

1272	   [[This section lists the changes between the various versions of this
1273	   draft.]]

1275	9.  Contributors

1277	      George Tsirtsis
1278	      Qualcomm

1280	      tsirtsis@googlemail.com

1282	10.  Acknowledgements

1284	   Dave Thaler, Dan Wing, Alberto Garcia-Martinez and Joao Damas
1285	   reviewed the document and provided useful comments to improve it.

1287	   The content of the draft was improved thanks to discussions with Fred
1288	   Baker and Jari Arkko.

1290	   Marcelo Bagnulo and Iljitsch van Beijnum are partly funded by
1291	   Trilogy, a research project supported by the European Commission
1292	   under its Seventh Framework Program.

1294	11.  References

1296	11.1.  Normative References

1298	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1299	              Requirement Levels", BCP 14, RFC 2119, March 1997.

1301	   [RFC1035]  Mockapetris, P., "Domain names - implementation and
1302	              specification", STD 13, RFC 1035, November 1987.

1304	   [RFC2671]  Vixie, P., "Extension Mechanisms for DNS (EDNS0)",
1305	              RFC 2671, August 1999.

1307	   [RFC2765]  Nordmark, E., "Stateless IP/ICMP Translation Algorithm
1308	              (SIIT)", RFC 2765, February 2000.

1310	   [RFC4787]  Audet, F. and C. Jennings, "Network Address Translation
1311	              (NAT) Behavioral Requirements for Unicast UDP", BCP 127,
1312	              RFC 4787, January 2007.

1314	   [I-D.ietf-behave-tcp]
1315	              Guha, S., Biswas, K., Ford, B., Sivakumar, S., and P.
1316	              Srisuresh, "NAT Behavioral Requirements for TCP",
1317	              draft-ietf-behave-tcp-08 (work in progress),
1318	              September 2008.

1320	   [I-D.ietf-behave-nat-icmp]
1321	              Srisuresh, P., Ford, B., Sivakumar, S., and S. Guha, "NAT
1322	              Behavioral Requirements for ICMP protocol",
1323	              draft-ietf-behave-nat-icmp-08 (work in progress),
1324	              June 2008.

1326	11.2.  Informative References

1328	   [RFC2766]  Tsirtsis, G. and P. Srisuresh, "Network Address
1329	              Translation - Protocol Translation (NAT-PT)", RFC 2766,
1330	              February 2000.

1332	   [RFC1858]  Ziemba, G., Reed, D., and P. Traina, "Security
1333	              Considerations for IP Fragment Filtering", RFC 1858,
1334	              October 1995.

1336	   [RFC3128]  Miller, I., "Protection Against a Variant of the Tiny
1337	              Fragment Attack (RFC 1858)", RFC 3128, June 2001.

1339	   [RFC3022]  Srisuresh, P. and K. Egevang, "Traditional IP Network
1340	              Address Translator (Traditional NAT)", RFC 3022,
1341	              January 2001.

1343	   [RFC4966]  Aoun, C. and E. Davies, "Reasons to Move the Network
1344	              Address Translator - Protocol Translator (NAT-PT) to
1345	              Historic Status", RFC 4966, July 2007.

1347	   [I-D.ietf-mmusic-ice]
1348	              Rosenberg, J., "Interactive Connectivity Establishment
1349	              (ICE): A Protocol for Network Address Translator (NAT)
1350	              Traversal for Offer/Answer Protocols",
1351	              draft-ietf-mmusic-ice-19 (work in progress), October 2007.

1353	   [RFC3498]  Kuhfeld, J., Johnson, J., and M. Thatcher, "Definitions of
1354	              Managed Objects for Synchronous Optical Network (SONET)
1355	              Linear Automatic Protection Switching (APS)
1356	              Architectures", RFC 3498, March 2003.

1358	Authors' Addresses

1360	   Marcelo Bagnulo
1361	   UC3M
1362	   Av. Universidad 30
1363	   Leganes, Madrid  28911
1364	   Spain

1366	   Phone: +34-91-6249500
1367	   Fax:
1368	   Email: marcelo@it.uc3m.es
1369	   URI:   http://www.it.uc3m.es/marcelo
1370	   Philip Matthews
1371	   Unaffiliated

1373	   Email: philip_matthews@magma.ca
1374	   URI:

1376	   Iljitsch van Beijnum
1377	   IMDEA Networks
1378	   Av. Universidad 30
1379	   Leganes, Madrid  28911
1380	   Spain

1382	   Phone: +34-91-6246245
1383	   Email: iljitsch@muada.com

1385	Full Copyright Statement

1387	   Copyright (C) The IETF Trust (2008).

1389	   This document is subject to the rights, licenses and restrictions
1390	   contained in BCP 78, and except as set forth therein, the authors
1391	   retain all their rights.

1393	   This document and the information contained herein are provided on an
1394	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1395	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
1396	   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
1397	   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
1398	   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1399	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1401	Intellectual Property

1403	   The IETF takes no position regarding the validity or scope of any
1404	   Intellectual Property Rights or other rights that might be claimed to
1405	   pertain to the implementation or use of the technology described in
1406	   this document or the extent to which any license under such rights
1407	   might or might not be available; nor does it represent that it has
1408	   made any independent effort to identify any such rights.  Information
1409	   on the procedures with respect to rights in RFC documents can be
1410	   found in BCP 78 and BCP 79.

1412	   Copies of IPR disclosures made to the IETF Secretariat and any
1413	   assurances of licenses to be made available, or the result of an
1414	   attempt made to obtain a general license or permission for the use of
1415	   such proprietary rights by implementers or users of this
1416	   specification can be obtained from the IETF on-line IPR repository at
1417	   http://www.ietf.org/ipr.

1419	   The IETF invites any interested party to bring to its attention any
1420	   copyrights, patents or patent applications, or other proprietary
1421	   rights that may cover technology that may be required to implement
1422	   this standard.  Please address the information to the IETF at
1423	   ietf-ipr@ietf.org.