idnits 2.17.1 

draft-mrw-nat66-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC2119]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 1280 has weird spacing: '...d short  inner...'

  == Line 1282 has weird spacing: '...d short  outer...'

  == Line 1284 has weird spacing: '...d short  inner...'

  == Line 1285 has weird spacing: '...d short  packe...'

  == Line 1286 has weird spacing: '...ed char   chec...'

  == (12 more instances...)

  -- The document date (March 2, 2011) is 4804 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  -- Looks like a reference, but probably isn't: '8' on line 1287

  -- Looks like a reference, but probably isn't: '65536' on line 1286

  -- Looks like a reference, but probably isn't: '3' on line 1514

  -- Looks like a reference, but probably isn't: '1' on line 1514

  -- Looks like a reference, but probably isn't: '0' on line 1514

  -- Looks like a reference, but probably isn't: '2' on line 1514

  -- Looks like a reference, but probably isn't: '4' on line 1515

  -- Looks like a reference, but probably isn't: '5' on line 1515

  -- Looks like a reference, but probably isn't: '6' on line 1515

  -- Looks like a reference, but probably isn't: '7' on line 1515

  -- Obsolete informational reference (is this intentional?): RFC 2629
     (Obsoleted by RFC 7749)

  -- Obsolete informational reference (is this intentional?): RFC 3484
     (Obsoleted by RFC 6724)


     Summary: 1 error (**), 0 flaws (~~), 7 warnings (==), 14 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                       M. Wasserman
3	Internet-Draft                                         Painless Security
4	Intended status: Experimental                                   F. Baker
5	Expires: September 3, 2011                                 Cisco Systems
6	                                                           March 2, 2011

8	                IPv6-to-IPv6 Network Prefix Translation
9	                           draft-mrw-nat66-09

11	Abstract

13	   This document describes a stateless, transport-agnostic IPv6-to-IPv6
14	   Network Prefix Translation (NPTv6) function that provides the address
15	   independence benefit associated with IPv4-to-IPv4 NAT (NAPT44), and
16	   in addition provides a 1:1 relationship between addresses in the
17	   "inside" and "outside" prefixes, preserving end to end reachability
18	   at the network layer.

20	Requirements Terminology

22	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
23	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
24	   document are to be interpreted as described in RFC 2119 [RFC2119].

26	Status of this Memo

28	   This Internet-Draft is submitted in full conformance with the
29	   provisions of BCP 78 and BCP 79.

31	   Internet-Drafts are working documents of the Internet Engineering
32	   Task Force (IETF).  Note that other groups may also distribute
33	   working documents as Internet-Drafts.  The list of current Internet-
34	   Drafts is at http://datatracker.ietf.org/drafts/current/.

36	   Internet-Drafts are draft documents valid for a maximum of six months
37	   and may be updated, replaced, or obsoleted by other documents at any
38	   time.  It is inappropriate to use Internet-Drafts as reference
39	   material or to cite them other than as "work in progress."

41	   This Internet-Draft will expire on September 3, 2011.

43	Copyright Notice

45	   Copyright (c) 2011 IETF Trust and the persons identified as the
46	   document authors.  All rights reserved.

48	   This document is subject to BCP 78 and the IETF Trust's Legal
49	   Provisions Relating to IETF Documents
50	   (http://trustee.ietf.org/license-info) in effect on the date of
51	   publication of this document.  Please review these documents
52	   carefully, as they describe your rights and restrictions with respect
53	   to this document.  Code Components extracted from this document must
54	   include Simplified BSD License text as described in Section 4.e of
55	   the Trust Legal Provisions and are provided without warranty as
56	   described in the Simplified BSD License.

58	Table of Contents

60	   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
61	     1.1.   What is Address Independence? . . . . . . . . . . . . . .  4
62	     1.2.   NPTv6 Applicability . . . . . . . . . . . . . . . . . . .  6
63	   2.  NPTv6 Overview . . . . . . . . . . . . . . . . . . . . . . . .  7
64	     2.1.   NPTv6: the simplest case  . . . . . . . . . . . . . . . .  7
65	     2.2.   NPTv6 between peer networks . . . . . . . . . . . . . . .  8
66	     2.3.   NPTv6 redundancy and load-sharing . . . . . . . . . . . .  9
67	     2.4.   NPTv6 multihoming . . . . . . . . . . . . . . . . . . . . 10
68	     2.5.   Mapping with No Per-Flow State  . . . . . . . . . . . . . 10
69	     2.6.   Checksum-Neutral Mapping  . . . . . . . . . . . . . . . . 11
70	   3.  NPTv6 Algorithmic Specification  . . . . . . . . . . . . . . . 11
71	     3.1.   NPTv6 configuration calculations  . . . . . . . . . . . . 11
72	     3.2.   NPTv6 translation, internal network to external
73	            network . . . . . . . . . . . . . . . . . . . . . . . . . 12
74	     3.3.   NPTv6 translation, external network to internal
75	            network . . . . . . . . . . . . . . . . . . . . . . . . . 12
76	     3.4.   NPTv6 with a /48 or shorter prefix  . . . . . . . . . . . 13
77	     3.5.   NPTv6 with a /49 or longer prefix . . . . . . . . . . . . 13
78	     3.6.   /48 Prefix Mapping Example  . . . . . . . . . . . . . . . 13
79	     3.7.   Address Mapping for Longer Prefixes . . . . . . . . . . . 14
80	   4.  Implications of Network Address Translator Behavioral
81	       Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 15
82	     4.1.   Prefix configuration and generation . . . . . . . . . . . 15
83	     4.2.   Subnet numbering  . . . . . . . . . . . . . . . . . . . . 15
84	     4.3.   NAT Behavioral Requirements . . . . . . . . . . . . . . . 15
85	   5.  Implications for Applications  . . . . . . . . . . . . . . . . 16
86	     5.1.   Recommendation for network planners considering use
87	            of NPTv6 Translator . . . . . . . . . . . . . . . . . . . 18
88	     5.2.   Recommendations for application writers . . . . . . . . . 18
89	     5.3.   Recommendation for future work  . . . . . . . . . . . . . 18
90	   6.  A Note on Port Mapping . . . . . . . . . . . . . . . . . . . . 18
91	   7.  Security Considerations  . . . . . . . . . . . . . . . . . . . 19
92	   8.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 20
93	   9.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 20
94	   10. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 20
95	     10.1.  Changes Between draft-mrw-behave-nat66-00 and -01 . . . . 20
96	     10.2.  Changes between *behave-nat66-01 and -02  . . . . . . . . 20
97	     10.3.  Changes between *nat66-00 and *nat66-01 . . . . . . . . . 21
98	     10.4.  Changes between *nat66-01 and *nat66-02 . . . . . . . . . 21
99	     10.5.  Changes between *nat66-02 and *nat66-03 . . . . . . . . . 22
100	     10.6.  Changes between *nat66-03 and *nat66-04 . . . . . . . . . 22
101	     10.7.  Changes between *nat66-04 and *nat66-05 . . . . . . . . . 22
102	     10.8.  Changes between *nat66-05 and *nat66-06 . . . . . . . . . 22
103	     10.9.  Changes between *nat66-06 and *nat66-07 . . . . . . . . . 22
104	     10.10. Changes between *nat66-07 and *nat66-08 . . . . . . . . . 22
105	   11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 23
106	     11.1.  Normative References  . . . . . . . . . . . . . . . . . . 23
107	     11.2.  Informative References  . . . . . . . . . . . . . . . . . 23
108	   Appendix A.  Why GSE?  . . . . . . . . . . . . . . . . . . . . . . 25
109	   Appendix B.  Verification code . . . . . . . . . . . . . . . . . . 27
110	   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 34

112	1.  Introduction

114	   This document describes a stateless IPv6-to-IPv6 Network Prefix
115	   Translation (NPTv6) function, designed to provide address
116	   independence to the edge network.  It is transport-agnostic with
117	   respect to transports that don't checksum the IP header, such as
118	   SCTP, and to transports that use the TCP/UDP/DCCP pseudo-header and
119	   checksum [RFC1071].

121	   This has several ramifications:

123	   o  Any security benefit that NAPT44 might offer is not present in
124	      NPTv6, necessitating the use of a firewall to obtain those
125	      benefits if desired.  An example of such a firewall is described
126	      in [RFC6092].

128	   o  End to end reachability is preserved, although the address used
129	      "inside" the edge network differs from the address used "outside"
130	      the edge network.  This has implications for application referrals
131	      and other uses of Internet layer addresses.

133	   o  If there are multiple identically-configured prefix translators
134	      between two networks, there is no need for them to exchange
135	      dynamic state, as there is no dynamic state - the algorithmic
136	      translation will be identical across each of them.  The network
137	      can therefore asymmetrically route, load-share, and fail-over
138	      among them without issue.

140	   o  Since translation is 1:1 at the network layer, there is no need to
141	      modify port numbers or other transport parameters.

143	1.1.  What is Address Independence?

145	   For the purposes of this document, IPv6 Address Independence consists
146	   of the following set of properties:

148	   From the perspective of the edge network:

150	      *  The IPv6 addresses used inside the local network (for
151	         interfaces, access lists, and logs) do not need to be
152	         renumbered if the global prefix(es) assigned for use by the
153	         edge network are changed.

155	      *  The IPv6 addresses used inside the edge network (for
156	         interfaces, access lists, and logs) or within other upstream
157	         networks (such as when multihoming) do not need to be
158	         renumbered when a site adds, drops, or changes upstream
159	         networks.

161	      *  It is not necessary for an administration to convince an
162	         upstream network to route its internal IPv6 prefixes, or for it
163	         to advertise prefixes derived from other upstream networks into
164	         it.

166	      *  Unless it wants to optimize routing between multiple upstream
167	         networks in the process of multihoming, there is therefore no
168	         need for a BGP exchange with the upstream network.

170	   From the perspective of the upstream network:

172	      *  IPv6 addresses used by the edge network are guaranteed to have
173	         a provider-allocated prefix, eliminating the need and concern
174	         for BCP 38 [RFC2827] ingress filtering and the advertisement of
175	         customer-specific prefixes.

177	   Thus, address independence has ramifications for the edge network,
178	   networks it directly connects with (especially its upstream
179	   networks), and for the Internet as a whole.  The desire for address
180	   independence has been a primary driver for IPv4 NAT deployment in
181	   medium to large-sized enterprise networks, including NAT deployments
182	   in enterprises that have plenty of IPv4 provider-independent address
183	   space (from IPv4 "swamp space").  It has also been a driver for edge
184	   networks to become members of Regional Internet Registry (RIR)
185	   communities, seeking to obtain BGP Autonomous System Numbers and
186	   provider-independent prefixes, and as a result has been one of the
187	   drivers of the explosion of the IPv4 route table.  Service providers
188	   have stated that the lack of address independence from their
189	   customers has been a negative incentive to deployment, due to the
190	   impact of customer routing expected in their networks.

192	   The Local Network Protection [RFC4864] document discusses a related
193	   concept called "Address Autonomy" as a benefit of NAPT44.  [RFC4864]
194	   indicates that address autonomy can be achieved by the simultaneous
195	   use of global addresses on all nodes within a site that need external
196	   connectivity, and Unique Local Addresses (ULAs) [RFC4193] for all
197	   internal communication.  However, this solution fails to meet the
198	   requirement for address independence, because if an ISP renumbering
199	   event occurs, all of the hosts, routers, DHCP servers, ACLs,
200	   firewalls and other internal systems that are configured with global
201	   addresses from the ISP will need to be renumbered before global
202	   connectivity is fully restored.

204	   The use of IPv6 Provider Independent (PI) addresses has also been
205	   suggested as a means to fulfill the address independence requirement.
206	   However, this solution requires that an enterprise qualify to receive
207	   a PI assignment and persuade their ISP to install specific routes for
208	   the enterprise's PI addresses.  There are a number of practical
209	   issues with this approach, especially if there is a desire to route
210	   to a geographically and topologically diverse set of sites,
211	   which can sometimes involve coordinating with several ISPs to route
212	   portions of a single PI prefix.  These problems have caused numerous
213	   enterprises with plenty of IPv4 swamp space to choose to use IPv4 NAT
214	   for part, or substantially all, of their internal network instead of
215	   using their provider-independent address space.

217	1.2.  NPTv6 Applicability

219	   NPTv6 provides a simple and compelling solution to meet the Address
220	   Independence requirement in IPv6.  The address independence benefit
221	   stems directly from the translation function of the network prefix
222	   translator.  To avoid as many of the issues associated with NAPT44 as
223	   possible, NPTv6 is defined to include a two-way, checksum-neutral,
224	   algorithmic translation function, and nothing else.

226	   The fact that NPTv6 does not map ports and is checksum-neutral avoids
227	   the need for an NPTv6 Translator to re-write transport layer headers.
228	   This makes it feasible to deploy new or improved transport layer
229	   protocols without upgrading NPTv6 Translators.  Similarly, since
230	   NPTv6 does not re-write transport-layer headers, NPTv6 will not
231	   interfere with encryption of the full IP payload in many cases.

233	   The default NPTv6 address mapping mechanism is purely algorithmic, so
234	   NPTv6 translators do not need to maintain per-node or per-connection
235	   state, allowing deployment of more robust and adaptive networks than
236	   can be deployed using NAPT44.  Since the default NPTv6 mapping can be
237	   performed in either direction, it does not interfere with inbound
238	   connection establishment, thus allowing internal nodes to participate
239	   in direct Peer-to-Peer applications without the application layer
240	   overhead one finds in many IPv4 Peer-to-Peer applications.

242	   Although NPTv6 compares favorably to NAPT44 in several ways, it does
243	   not eliminate all of the architectural problems associated with IPv4
244	   NAT, as described in [RFC2993].  NPTv6 involves modifying IP headers
245	   in transit, so it is not compatible with security mechanisms, such as
246	   the IPsec Authentication Header, that provide integrity protection
247	   for the IP header.  NPTv6 may interfere with the use of application
248	   protocols that transmit IP addresses in the application-specific
249	   portion of the IP packet.  These applications currently require
250	   application layer gateways (ALGs) to work correctly through NAPT44
251	   devices, and similar ALGs may be required for these applications to
252	   work through NPTv6 Translators.  The use of separate internal and
253	   external prefixes creates complexity for DNS deployment, due to the
254	   desire for internal nodes to communicate with other internal nodes
255	   using internal addresses, while external nodes need to obtain
256	   external addresses to communicate with the same nodes.  This
257	   frequently results in the deployment of "split DNS", which may add
258	   complexity to network configuration.

260	   The choice of address within the edge network bears consideration.
261	   One could use a ULA, which maximizes address independence.  That
262	   could also be considered a misuse of the ULA; if the expectation is
263	   that a ULA prevents access to a system from outside the range of the
264	   ULA, NPTv6 overrides that.  On the other hand, the administration is
265	   aware that it has made that choice, and could if it desired deploy a
266	   second ULA for the purpose of privacy; the only prefix that will be
267	   translated is one that has an NPTv6 Translator configured to
268	   translate to or from it.  Also, using any other global scope address
269	   format makes one either obtain a PI prefix or be at the mercy of the
270	   agency from which it was allocated.

272	   There are significant technical impacts associated with the
273	   deployment of any prefix translation mechanism, including NPTv6, and
274	   we strongly encourage anyone who is considering the implementation or
275	   deployment of NPTv6 to read [RFC4864], and to carefully consider the
276	   alternatives described in that document, some of which may cause
277	   fewer problems than NPTv6.

279	2.  NPTv6 Overview

281	   NPTv6 may be implemented in an IPv6 router to map one IPv6 address
282	   prefix to another IPv6 prefix as each IPv6 packet transits the
283	   router.  A router that implements an NPTv6 prefix translation
284	   function is referred to as an NPTv6 Translator.

286	2.1.  NPTv6: the simplest case

288	   In its simplest form, an NPTv6 Translator interconnects two network
289	   links, one of which is an "internal" network link attached to a leaf
290	   network within a single administrative domain, and the other of which
291	   is an "external" network with connectivity to the global Internet.
292	   All of the hosts on the internal network will use addresses from a
293	   single, locally-routed prefix, and those addresses will be translated
294	   to/from addresses in a globally-routable prefix as IP packets transit
295	   the NPTv6 Translator.  The lengths of these two prefixes will be
296	   functionally the same; if they differ, the longer of the two will
297	   limit the ability to use subnets in the shorter.

299	               External Network:  Prefix = 2001:0DB8:0001:/48
300	                   --------------------------------------
301	                                     |
302	                                     |
303	                              +-------------+
304	                              |     NPTv6   |
305	                              |  Translator |
306	                              +-------------+
307	                                     |
308	                                     |
309	                   --------------------------------------
310	               Internal Network:  Prefix = FD01:0203:0405:/48

312	                       Figure 1: A simple translator

314	   Figure 1 shows an NPTv6 Translator attached to two networks.  In this
315	   example, the internal network uses IPv6 Unique Local Addresses (ULAs)
316	   [RFC4193] to represent the internal IPv6 nodes, and the external
317	   network uses globally routable IPv6 addresses to represent the same
318	   nodes.

320	   When a NPTv6 Translator forwards packets in the "outbound" direction,
321	   from the internal network to the external network, NPTv6 overwrites
322	   the IPv6 source prefix (in the IPv6 header) with a corresponding
323	   external prefix.  When packets are forwarded in the "inbound"
324	   direction, from the external network to the internal network, the
325	   IPv6 destination prefix is overwritten with a corresponding internal
326	   prefix.  Using the prefixes shown in the diagram above, as an IP
327	   packet passes through the NPTv6 Translator in the outbound direction,
328	   the source prefix (FD01:0203:0405:/48) will be overwritten with the
329	   external prefix (2001:0DB8:0001:/48).  In an inbound packet, the
330	   destination prefix (2001:0DB8:0001:/48) will be overwritten with the
331	   internal prefix (FD01:0203:0405:/48).  In both cases, it is the local
332	   IPv6 prefix that is overwritten; the remote IPv6 prefix remains
333	   unchanged.  Nodes on the internal network are said to be "behind" the
334	   NPTv6 Translator.
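
   As a rough illustration only (not part of this specification), the
   prefix overwrite described above might be sketched in C as follows.
   The function name overwrite_prefix and its byte-array interface are
   assumptions made for this example.  The sketch covers only the
   prefix overwrite, not the checksum adjustment of Section 2.6.

```c
#include <assert.h>
#include <stdint.h>

/* Overwrite the first len bits of addr with the first len bits of
 * prefix.  The remaining bits of addr are left unchanged.  Both
 * arrays hold a 128-bit IPv6 address in network byte order. */
void overwrite_prefix(uint8_t addr[16], const uint8_t prefix[16],
                      unsigned len)
{
    assert(len <= 128);

    unsigned full = len / 8;   /* whole octets covered by the prefix */
    unsigned rem  = len % 8;   /* leftover bits in the next octet    */

    for (unsigned i = 0; i < full; i++)
        addr[i] = prefix[i];

    if (rem != 0) {
        /* Mask selecting the top 'rem' bits of the boundary octet. */
        uint8_t mask = (uint8_t)(0xFFu << (8 - rem));
        addr[full] = (uint8_t)((prefix[full] & mask) |
                               (addr[full] & (uint8_t)~mask));
    }
}
```

   With the prefixes of Figure 1 and a length of 48, an address of
   FD01:0203:0405:0001::1234 would have its first six octets replaced
   by 2001:0DB8:0001, leaving the remaining ten octets untouched.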

336	2.2.  NPTv6 between peer networks

338	   NPTv6 can also be used between two private networks.  In these cases,
339	   both networks may use ULA prefixes, with each subnet in one network
340	   mapped into a corresponding subnet in the other network, and vice
341	   versa.  Or, each network may use ULA prefixes for internal
342	   addressing, and global unicast addresses on the other network.

344	                  Internal Prefix = FD01:4444:5555:/48
345	                  --------------------------------------
346	                       V            |      External Prefix
347	                       V            |      2001:0DB8:6666:/48
348	                       V        +---------+      ^
349	                       V        |  NPTv6  |      ^
350	                       V        |  Device |      ^
351	                       V        +---------+      ^
352	              External Prefix       |            ^
353	              2001:0DB8:0001:/48    |            ^
354	                  --------------------------------------
355	                  Internal Prefix = FD01:0203:0405:/48

357	               Figure 2: Flow of Information in Translation

359	2.3.  NPTv6 redundancy and load-sharing

361	   In some cases, more than one NPTv6 Translator may be attached to a
362	   network, as shown in Figure 3.  In such cases, NPTv6 Translators are
363	   configured with the same internal and external prefixes.  Since there
364	   is only one translation, even though there are multiple translators,
365	   they map only one external address (prefix and IID) to the internal
366	   address.

368	               External Network:  Prefix = 2001:0DB8:0001:/48
369	                   --------------------------------------
370	                          |                      |
371	                          |                      |
372	                   +-------------+        +-------------+
373	                   |  NPTv6      |        |  NPTv6      |
374	                   |  Translator |        |  Translator |
375	                   |   #1        |        |   #2        |
376	                   +-------------+        +-------------+
377	                          |                      |
378	                          |                      |
379	                   --------------------------------------
380	               Internal Network:  Prefix = FD01:0203:0405:/48

382	                      Figure 3: Parallel Translators

384	2.4.  NPTv6 multihoming

386	            External Network #1:          External Network #2:
387	         Prefix = 2001:0DB8:0001:/48    Prefix = 2001:0DB8:5555:/48
388	         ---------------------------    --------------------------
389	                         |                      |
390	                         |                      |
391	                  +-------------+        +-------------+
392	                  |  NPTv6      |        |  NPTv6      |
393	                  |  Translator |        |  Translator |
394	                  |   #1        |        |   #2        |
395	                  +-------------+        +-------------+
396	                         |                      |
397	                         |                      |
398	                  --------------------------------------
399	              Internal Network:  Prefix = FD01:0203:0405:/48

401	      Figure 4: Parallel Translators with different upstream networks

403	   When multihoming, NPTv6 Translators are attached to an internal
404	   network, as shown in Figure 4, but connected to different external
405	   networks.  In such cases, NPTv6 Translators are configured with the
406	   same internal prefix, but different external prefixes.  Since there
407	   are multiple translations, they map multiple external addresses
408	   (prefix and IID) to the common internal address.  A system within the
409	   edge network is unable to determine which external address it is
410	   using apart from services such as STUN.

412	   Multihoming in this sense has one negative feature as compared with
413	   multihoming with a provider-independent address; when routes change
414	   between NPTv6 Translators, since the upstream network changes, the
415	   translated prefix can change.  This would cause sessions and
416	   referrals dependent on it to fail as well.  This is not expected to
417	   be a major issue, however, where routing is generally stable.

419	2.5.  Mapping with No Per-Flow State

421	   When NPTv6 is used as described in this document, no per-node or per-
422	   flow state is maintained in the NPTv6 Translator.  Both inbound and
423	   outbound packets are translated algorithmically, using only
424	   information found in the IPv6 header.  Due to this property, NPTv6's
425	   two-way, algorithmic address mapping can support both outbound and
426	   inbound connection establishment without the need for state-priming
427	   or rendezvous mechanisms, or the maintenance of mapping state.  This
428	   is a significant improvement over NAPT44 devices, but it also has
429	   significant security implications which are described in the Security
430	   Considerations section.

432	2.6.  Checksum-Neutral Mapping

434	   When a change is made to one of the IP header fields in the IPv6
435	   pseudo-header checksum (such as one of the IP addresses), the
436	   checksum field in the transport layer header may become invalid.
437	   Fortunately, an incremental change in the area covered by the
438	   Internet standard checksum [RFC1071] will result in a well-defined
439	   change to the checksum value [RFC1624].  So, a checksum change caused
440	   by modifying part of the area covered by the checksum can be
441	   corrected by making a complementary change to a different 16-bit
442	   field covered by the same checksum.
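
   As a minimal sketch of this property (not part of this
   specification), the following C fragment computes a ones-complement
   sum and offsets a change to one 16-bit word with a complementary
   change to another.  The names ones_add, ones_sum, and
   change_and_compensate are illustrative assumptions, and words are
   assumed to be held in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add two 16-bit values using ones-complement arithmetic: any carry
 * out of bit 15 is added back in at bit 0. */
uint16_t ones_add(uint16_t a, uint16_t b)
{
    uint32_t s = (uint32_t)a + b;
    return (uint16_t)((s & 0xFFFFu) + (s >> 16));
}

/* Ones-complement sum of n 16-bit words, as used by the Internet
 * checksum. */
uint16_t ones_sum(const uint16_t *w, size_t n)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum = ones_add(sum, w[i]);
    return sum;
}

/* Replace w[changed] with new_value and offset that change in
 * w[fixup], so that the sum over the whole area stays the same. */
void change_and_compensate(uint16_t *w, size_t changed,
                           uint16_t new_value, size_t fixup)
{
    assert(changed != fixup);

    /* delta = old value - new value, in ones-complement arithmetic */
    uint16_t delta = ones_add(w[changed], (uint16_t)~new_value);

    w[changed] = new_value;
    w[fixup] = ones_add(w[fixup], delta);
}
```

   For example, replacing the word 0xFD01 with 0x2001 in the sequence
   FD01 0203 0405 0001 and compensating in the last word leaves the
   ones-complement sum of the four words unchanged.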

444	   The NPTv6 mapping mechanisms described in this document are checksum-
445	   neutral, which means that they result in IP headers that will
446	   generate the same IPv6 pseudo-header checksum when the checksum is
447	   calculated using the standard Internet checksum algorithm [RFC1071].
448	   Any changes that are made during translation of the IPv6 prefix are
449	   offset by changes to other parts of the IPv6 address.  This results
450	   in transport layers that use the Internet checksum (such as TCP and
451	   UDP) calculating the same IPv6 pseudo header checksum for both the
452	   internal and external forms of the same packet, which avoids the need
453	   for the NPTv6 Translator to modify those transport layer headers to
454	   correct the checksum value.

456	   As noted in Section 4.2, this mapping leaves an edge network that
457	   uses a /48 external prefix unable to use subnet 0xFFFF.

459	3.  NPTv6 Algorithmic Specification

461	   The IPv6 address [RFC4291] is reproduced for clarity in Figure 5.

463	      0    15 16   31 32   47 48   63 64   79 80   95 96  111 112  127
464	     +-------+-------+-------+-------+-------+-------+-------+-------+
465	     |     Routing Prefix    | Subnet|   Interface Identifier (IID)  |
466	     +-------+-------+-------+-------+-------+-------+-------+-------+

468	            Figure 5: Enumeration of the IPv6 Address [RFC4291]

470	3.1.  NPTv6 configuration calculations

472	   When an NPTv6 Translation function is configured, it is configured
473	   with

475	   o  one or more "internal" interfaces with their "internal" routing
476	      domain prefixes, and

478	   o  one or more "external" interfaces with their "external" routing
479	      domain prefixes.

481	   In the simple case, there is one of each.  If a single router
482	   provides NPTv6 translation services between a multiplicity of domains
483	   (as might be true when multihoming), each internal/external pair must
484	   be thought of as a separate NPTv6 Translator from the perspective of
485	   this specification.

487	   When an NPTv6 Translator is configured, the translation function
488	   first ensures that the internal and external prefixes are the same
489	   length, if necessary by extending the shorter of the two with zeroes.
490	   These two prefixes will be used in the prefix translation function
491	   described in Section 3.2 and Section 3.3.

493	   They are then zero-extended to /64, for the purposes of a
494	   calculation.  The translation function calculates the ones-complement
495	   sum of the 16 bit words of the /64 external prefix and the /64
496	   internal prefix.  It then calculates the difference between these
497	   values: internal minus external.  This value, called the
498	   "adjustment", is effectively constant for the lifetime of the NPTv6
499	   Translator configuration, and used in per-packet processing.
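
   The configuration-time calculation above can be sketched in C as
   follows.  This is a minimal sketch, assuming that each prefix is
   held as four 16-bit words in host byte order and has already been
   zero-extended to /64.  The names ones_add, prefix_sum64, and
   npt_adjustment are illustrative and not part of this specification.

```c
#include <assert.h>
#include <stdint.h>

/* Ones-complement addition of two 16-bit values. */
uint16_t ones_add(uint16_t a, uint16_t b)
{
    uint32_t s = (uint32_t)a + b;
    return (uint16_t)((s & 0xFFFFu) + (s >> 16));
}

/* Ones-complement sum of the four 16-bit words of a prefix that has
 * been zero-extended to /64. */
uint16_t prefix_sum64(const uint16_t p[4])
{
    uint16_t sum = 0;
    for (int i = 0; i < 4; i++)
        sum = ones_add(sum, p[i]);
    return sum;
}

/* Adjustment = sum(internal) - sum(external).  Subtraction in
 * ones-complement arithmetic is addition of the bitwise inverse. */
uint16_t npt_adjustment(const uint16_t internal[4],
                        const uint16_t external[4])
{
    assert(internal && external);
    return ones_add(prefix_sum64(internal),
                    (uint16_t)~prefix_sum64(external));
}
```

   For the prefixes shown in Figure 1 (FD01:0203:0405:/48 internal and
   2001:0DB8:0001:/48 external), the two sums are 0x030A and 0x2DBA,
   and the resulting adjustment is 0xD54F.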

501	3.2.  NPTv6 translation, internal network to external network

503	   When a datagram passes through the NPTv6 Translator from an internal
504	   to an external network, the datagram is either discarded or its IPv6
505	   Source Address is changed in two ways:

507	   o  If the internal subnet number has no mapping, for example because
508	      it is 0xFFFF or is simply not mapped, the datagram is discarded.
509	      This SHOULD result in an ICMP Destination Unreachable.

511	   o  The internal prefix is overwritten with the external prefix, in
512	      effect subtracting the difference between the two checksums (the
513	      adjustment) from the pseudo-header's checksum, and

515	   o  A 16-bit word of the address has the adjustment added to it using
516	      one's complement arithmetic.  If the result is 0xFFFF, it is
517	      overwritten as zero.  The choice of word is as specified in
518	      Section 3.4 or Section 3.5 as appropriate.
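
   The outbound steps above can be sketched in C for the /48 case of
   Section 3.4.  This is an illustration under stated assumptions, not
   a reference implementation: struct npt48 and npt48_outbound are
   hypothetical names, addresses are held as eight 16-bit words in
   host byte order, and generating the ICMP error is left to the
   caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Ones-complement addition of two 16-bit values. */
uint16_t ones_add(uint16_t a, uint16_t b)
{
    uint32_t s = (uint32_t)a + b;
    return (uint16_t)((s & 0xFFFFu) + (s >> 16));
}

/* Hypothetical configuration of one /48-to-/48 translator. */
struct npt48 {
    uint16_t internal[3];  /* internal prefix, address words 0..2 */
    uint16_t external[3];  /* external prefix, address words 0..2 */
    uint16_t adjustment;   /* sum(internal) - sum(external)       */
};

/* Translate the source address of an outbound datagram in place.
 * Returns false if the address has no mapping and the datagram is
 * to be discarded. */
bool npt48_outbound(const struct npt48 *cfg, uint16_t src[8])
{
    assert(cfg && src);

    /* No mapping: prefix is not the internal one, or subnet 0xFFFF. */
    for (int i = 0; i < 3; i++)
        if (src[i] != cfg->internal[i])
            return false;
    if (src[3] == 0xFFFF)
        return false;

    /* Overwrite the internal prefix with the external prefix. */
    for (int i = 0; i < 3; i++)
        src[i] = cfg->external[i];

    /* Add the adjustment to bits 48..63; 0xFFFF is written as zero. */
    uint16_t w = ones_add(src[3], cfg->adjustment);
    src[3] = (w == 0xFFFF) ? 0 : w;
    return true;
}
```

   With the Figure 1 prefixes and an adjustment of 0xD54F, the source
   address FD01:0203:0405:0001::1234 becomes
   2001:0DB8:0001:D550::1234.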

520	3.3.  NPTv6 translation, external network to internal network

522	   When a datagram passes through the NPTv6 Translator from an external
523	   to an internal network, its IPv6 Destination Address is changed in
524	   two ways:

526	   o  The external prefix is overwritten with the internal prefix, in
527	      effect adding the difference between the two checksums (the
528	      adjustment) to the pseudo-header's checksum, and

530	   o  A 16-bit word of the address has the adjustment subtracted from it
531	      (bitwise inverted and added to it) using one's complement
532	      arithmetic.  If the result is 0xFFFF, it is overwritten as zero.
533	      The choice of word is as specified in Section 3.4 or Section 3.5
534	      as appropriate.
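
	   The reverse direction can be sketched the same way.  As before,
	   this is only an illustration for the /48 case: npt_inbound_48
	   and the eight-word, host-byte-order address layout are
	   assumptions of the example, and the adjustment passed in is the
	   same value used for outbound traffic.

```c
#include <stdint.h>

/* One's complement addition of two 16-bit words (end-around carry). */
static uint16_t add1c(uint16_t a, uint16_t b)
{
    uint32_t s = (uint32_t)a + b;
    return (uint16_t)((s & 0xFFFFu) + (s >> 16));
}

/*
 * Hypothetical helper: rewrite the destination address of an inbound
 * datagram in place, undoing the outbound mapping.
 */
static void npt_inbound_48(uint16_t addr[8], const uint16_t int_prefix[3],
                           uint16_t adjustment)
{
    /* Overwrite the external /48 prefix with the internal one. */
    for (int i = 0; i < 3; i++)
        addr[i] = int_prefix[i];

    /* Subtract the adjustment: add its bitwise inverse. */
    uint16_t w = add1c(addr[3], (uint16_t)~adjustment);
    addr[3] = (w == 0xFFFF) ? 0x0000 : w;
}
```

	   Using one function pair with a single stored adjustment keeps
	   the per-packet work to a prefix copy and one 16-bit addition.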

536	3.4.  NPTv6 with a /48 or shorter prefix

538	   When an NPTv6 Translator is configured with internal and external
539	   prefixes that are 48 bits in length (a /48) or shorter, the
540	   adjustment MUST be added to or subtracted from bits 48..63 of the
541	   address.

543	   This mapping results in no modification of the Interface Identifier
544	   (IID), which is held in the lower half of the IPv6 address, so it
545	   will not interfere with future protocols that may use unique IIDs for
546	   node identification.

548	   NPTv6 Translator implementations MUST implement the /48 mapping.

550	3.5.  NPTv6 with a /49 or longer prefix

552	   When an NPTv6 Translator is configured with internal and external
553	   prefixes that are longer than 48 bits in length (such as a /52, /56,
554	   or /60), the adjustment must be added to or subtracted from one of
555	   the words in bits 64..79, 80..95, 96..111, or 112..127 of the
556	   address.  While the choice of word is immaterial as long as it is
557	   consistent, for consistency's sake, these words MUST be inspected in
558	   that sequence, and the first that is not initially 0xFFFF chosen.

560	   NPTv6 Translator implementations SHOULD implement the mapping for
561	   longer prefixes.
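
	   The choice of word described in Section 3.4 and Section 3.5 can
	   be summarized by the following C sketch.  The function name, the
	   word indexing (word 0 holds bits 0..15 of the address), and the
	   use of -1 to signal that no word qualifies are assumptions made
	   for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the index (0..7) of the 16-bit word
 * that carries the adjustment, or -1 if every candidate is 0xFFFF.
 */
static int npt_select_word(const uint16_t addr[8], unsigned prefix_len)
{
    /* /48 or shorter: always bits 48..63. */
    if (prefix_len <= 48)
        return 3;

    /* Longer prefixes: first of bits 64..79, 80..95, 96..111,
     * 112..127 whose value is not 0xFFFF. */
    for (int i = 4; i < 8; i++)
        if (addr[i] != 0xFFFF)
            return i;

    return -1;
}
```

	   Because a result of 0xFFFF is always rewritten as zero, the word
	   selected on the way out is also the first non-0xFFFF word seen
	   on the way back in, so both directions pick the same word.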

563	3.6.  /48 Prefix Mapping Example

565	   For the network shown in Figure 1, the Internal Prefix is
566	   FD01:0203:0405::/48, and the External Prefix is 2001:0DB8:0001::/48.

568	   If a node with internal address FD01:0203:0405:0001::1234 sends an
569	   outbound packet through the NPTv6 Translator, the resulting external
570	   address will be 2001:0DB8:0001:D550::1234.  The resulting address is
571	   obtained by calculating the checksum of both the internal and
572	   external 48-bit prefixes, subtracting the internal prefix's checksum
573	   from the external prefix's checksum using one's complement
574	   arithmetic to calculate the "adjustment", and adding the adjustment
575	   to the 16-bit subnet field (in this case 0x0001).

577	   To show the work:

579	   The one's complement checksum of FD01:0203:0405 is 0xFCF5.  The one's
580	   complement checksum of 2001:0DB8:0001 is 0xD245.  Using one's
581	   complement arithmetic, 0xD245 - 0xFCF5 = 0xD54F. The subnet in the
582	   original packet is 0x0001.  Using one's complement arithmetic, 0x0001
583	   + 0xD54F = 0xD550.  Since 0xD550 != 0xFFFF, it is not changed to
584	   0x0000.

586	   So, the value 0xD550 is written in the 16-bit subnet area, resulting
587	   in a mapped external address of 2001:0DB8:0001:D550::1234.

589	   When a response packet is received, it will contain the destination
590	   address 2001:0DB8:0001:D550::1234, which will be mapped using the
591	   inverse mapping algorithm, back to FD01:0203:0405:0001::1234.

593	   In this case, the difference between the two prefixes will be
594	   calculated as follows:

596	   Using one's complement arithmetic, 0xFCF5 - 0xD245 = 0x2AB0.  The
597	   subnet in the original packet = 0xD550.  Using one's complement
598	   arithmetic, 0xD550 + 0x2AB0 = 0x0001.  Since 0x0001 != 0xFFFF, it is
599	   not changed to 0x0000.

601	   So the value 0x0001 is written into the subnet field, and the
602	   internal value of the subnet field is properly restored.
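
	   The example can be checked mechanically.  A transport checksum
	   that covers the pseudo-header is unaffected by the translation
	   if the one's complement sum of the eight 16-bit words of the
	   address is the same before and after mapping.  The following C
	   sketch tests exactly that property; the function names and the
	   host-byte-order word layout are assumptions of the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* One's complement addition of two 16-bit words (end-around carry). */
static uint16_t add1c(uint16_t a, uint16_t b)
{
    uint32_t s = (uint32_t)a + b;
    return (uint16_t)((s & 0xFFFFu) + (s >> 16));
}

/* One's complement sum of the eight 16-bit words of an address. */
static uint16_t addr_sum(const uint16_t a[8])
{
    uint16_t s = 0;
    for (int i = 0; i < 8; i++)
        s = add1c(s, a[i]);
    /* 0xFFFF and 0x0000 both represent zero in one's complement. */
    return (s == 0xFFFF) ? 0x0000 : s;
}

/* Hypothetical check: true if the mapping left the sum unchanged. */
static bool checksum_neutral(const uint16_t before[8],
                             const uint16_t after[8])
{
    return addr_sum(before) == addr_sum(after);
}
```

	   For FD01:0203:0405:0001::1234 and 2001:0DB8:0001:D550::1234 both
	   sums are 0x153F, so the check succeeds.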

604	3.7.  Address Mapping for Longer Prefixes

606	   If the prefix being mapped is longer than 48 bits, the algorithm is
607	   slightly more complex.  A common case will be that the internal and
608	   external prefixes are of different length.  In such a case, the
609	   shorter prefix is zero-extended to the length of the longer as
610	   described in Section 3.1 for the purposes of overwriting the prefix.
611	   Then, they are both zero-extended to 64 bits to facilitate one's
612	   complement arithmetic.  The "adjustment" is calculated using those 64
613	   bit prefixes.

615	   For example if the internal prefix is a /48 ULA and the external
616	   prefix is a /56 provider-allocated prefix, the ULA becomes a /56 with
617	   zeros in bits 48..55.  For purposes of one's complement arithmetic,
618	   they are then both zero-extended to 64 bits.  A side-effect of this
619	   is that a subset of the subnets possible in the shorter prefix are
620	   untranslatable.  While the security value of this is debatable, the
621	   administration may choose to use them for subnets that it knows need
622	   no external accessibility.

624	   We then find the first word in the IID that does not have the value
625	   0xFFFF, trying bits 64..79, and then 80..95, 96..111, and finally
626	   112..127.  We perform the same calculation (with the same proof of
627	   correctness) as in Section 3.6, but applying it to that word.

629	   Although any 16-bit portion of an IPv6 IID could contain 0xFFFF, an
630	   IID of all-ones is a reserved anycast identifier that should not be
631	   used on the network [RFC2526].  If an NPTv6 Translator discovers a
632	   packet with an IID of all-zeros while performing address mapping,
633	   that packet MUST be dropped, and an ICMPv6 Parameter Problem error
634	   SHOULD be generated [RFC4443].

636	   Note: this mechanism does involve modification of the IID; it may not
637	   be compatible with future mechanisms that use unique IIDs for node
638	   identification.
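
	   The longer-prefix procedure can be sketched in C as follows.
	   This is an illustration only.  The external prefix
	   2001:0DB8:0001:AB00::/56 used in the accompanying note is
	   hypothetical, as are the function names and the eight-word,
	   host-byte-order address layout.  The sketch reports failure when
	   all four IID words are 0xFFFF, since the selection rule then
	   leaves no word to adjust; how the caller handles that case is
	   not decided here.

```c
#include <stdbool.h>
#include <stdint.h>

/* One's complement addition of two 16-bit words (end-around carry). */
static uint16_t add1c(uint16_t a, uint16_t b)
{
    uint32_t s = (uint32_t)a + b;
    return (uint16_t)((s & 0xFFFFu) + (s >> 16));
}

/* Overwrite the first plen bits (plen <= 64) of addr with prefix. */
static void overwrite_prefix(uint16_t addr[8], const uint16_t prefix[4],
                             unsigned plen)
{
    for (unsigned i = 0; i < 4; i++) {
        unsigned lo = i * 16;   /* first bit covered by word i */
        uint16_t mask;
        if (plen >= lo + 16)
            mask = 0xFFFF;
        else if (plen <= lo)
            mask = 0x0000;
        else
            mask = (uint16_t)(0xFFFFu << (16 - (plen - lo)));
        addr[i] = (uint16_t)((prefix[i] & mask) | (addr[i] & ~mask));
    }
}

/* Hypothetical helper: outbound mapping for a prefix longer than /48. */
static bool npt_map_long(uint16_t addr[8], const uint16_t new_prefix[4],
                         unsigned plen, uint16_t adjustment)
{
    int w = -1;
    for (int i = 4; i < 8 && w < 0; i++)
        if (addr[i] != 0xFFFF)
            w = i;              /* first IID word that is not 0xFFFF */
    if (w < 0)
        return false;

    overwrite_prefix(addr, new_prefix, plen);
    uint16_t v = add1c(addr[w], adjustment);
    addr[w] = (v == 0xFFFF) ? 0x0000 : v;
    return true;
}
```

	   With internal prefix FD01:0203:0405::/48 zero-extended to /56
	   and the hypothetical external /56 above, the adjustment is
	   0x2A4F, and FD01:0203:0405:0012::1234 maps to
	   2001:0DB8:0001:AB12:2A4F::1234.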

640	4.  Implications of Network Address Translator Behavioral Requirements

642	4.1.  Prefix configuration and generation

644	   NPTv6 Translators MUST support manual configuration of internal and
645	   external prefixes, and MUST NOT place any restrictions on those
646	   prefixes except that they be valid IPv6 unicast prefixes as described
647	   in [RFC4291].  They MAY also support random generation of ULA
648	   addresses on command.  Since the most common place anticipated for
649	   the implementation of an NPTv6 Translator is a CPE router, the reader
650	   is urged to consider the requirements of
651	   [I-D.ietf-v6ops-ipv6-cpe-router].

653	4.2.  Subnet numbering

655	   For reasons detailed in Appendix B, a network using NPTv6 Translation
656	   and a /48 external prefix MUST NOT use the value 0xFFFF to designate
657	   a subnet that it expects to be translated.

659	4.3.  NAT Behavioral Requirements

661	   NPTv6 Translators MUST support hairpinning behavior, as defined in
662	   the NAT Behavioral Requirements for UDP document [RFC4787].  This
663	   means that when an NPTv6 Translator receives a packet on the internal
664	   interface that has a destination address that matches the site's
665	   external prefix, it will translate the packet and forward it
666	   internally.  This allows internal nodes to reach other internal nodes
667	   using their external, global addresses when necessary.

669	   Conceptually, the datagram leaves the domain (is translated as
670	   described in Section 3.2), and returns (is again translated as
671	   described in Section 3.3).  As a result, the datagram exchange will
672	   be through the NPTv6 Translator in both directions for the lifetime
673	   of the session.  The alternative would be to require the NPTv6
674	   Translator to drop the datagram, forcing the sender to use the
675	   correct internal prefix for its peer.  Performing only the external-
676	   to-internal translation results in the datagram being sent from the
677	   untranslated internal address of the source to the translated and
678	   therefore internal address of its peer, which would enable the
679	   session to bypass the NPTv6 Translator for future datagrams.  It
680	   would also mean that the original sender would be unlikely to
681	   recognize the response when it arrived.

683	   Because NPTv6 does not perform port mapping and uses a one-to-one,
684	   reversible mapping algorithm, none of the other NAT behavioral
685	   requirements apply to NPTv6.
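
	   The hairpinning decision described above amounts to a prefix
	   match on the destination address of a datagram that arrives on
	   the internal interface.  The following C sketch illustrates
	   that test.  The enum, the function names, and the eight-word,
	   host-byte-order address layout are assumptions made for this
	   example.

```c
#include <stdbool.h>
#include <stdint.h>

enum npt_action {
    NPT_TRANSLATE_OUT,  /* apply Section 3.2, forward externally */
    NPT_HAIRPIN         /* apply Section 3.2 to the source, then
                           Section 3.3 to the destination, and
                           forward internally */
};

/* True if the first plen bits (plen <= 64) of addr equal prefix. */
static bool prefix_match(const uint16_t addr[8], const uint16_t prefix[4],
                         unsigned plen)
{
    for (unsigned i = 0; i < 4; i++) {
        unsigned lo = i * 16;
        if (plen <= lo)
            break;
        unsigned bits = plen - lo < 16 ? plen - lo : 16;
        uint16_t mask = (uint16_t)(0xFFFFu << (16 - bits));
        if ((addr[i] ^ prefix[i]) & mask)
            return false;
    }
    return true;
}

/* Hypothetical classifier for datagrams received from the inside. */
static enum npt_action
npt_classify_from_internal(const uint16_t dst[8],
                           const uint16_t ext_prefix[4], unsigned plen)
{
    return prefix_match(dst, ext_prefix, plen) ? NPT_HAIRPIN
                                               : NPT_TRANSLATE_OUT;
}
```

	   Translating both addresses in the hairpin case is what keeps
	   the return traffic flowing through the translator, as the text
	   above explains.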

687	5.  Implications for Applications

689	   NPTv6 Translation does not create several of the problems known to
690	   exist with other kinds of NATs and discussed in [RFC2993].  In
691	   particular: NPTv6 Translation is stateless, so a "reset" or brief
692	   outage of an NPTv6 Translator does not break connections that
693	   traverse the translation function, and if multiple NPTv6 Translators
694	   exist between the same two networks, load can shift or be dynamically
695	   load-shared among them.  Also, an NPTv6 Translator does not
696	   aggregate traffic for several hosts/interfaces behind a lesser number
697	   of external addresses, so there is no inherent expectation for an
698	   NPTv6 Translator to block new inbound flows from external hosts, and
699	   no issue with a filter or blacklist associated with one prefix within
700	   the domain affecting another.  A firewall can of course be used in
701	   conjunction with an NPTv6 Translator; this would allow the network
702	   administrator more flexibility to specify security policy than would
703	   be possible with a traditional NAT.

705	   However, NPTv6 Translation does create difficulties for some kinds of
706	   applications, e.g.:

708	   o  An application instance "behind" an NPTv6 Translator will see a
709	      different address for its connections than its peers "outside" the
710	      NPTv6 Translator.

712	   o  An application instance "outside" an NPTv6 Translator will see a
713	      different address for its connections than any peers which are
714	      "behind" an NPTv6 Translator.

716	   o  An application instance wishing to establish communication with a
717	      peer "behind" an NPTv6 Translator may need to use a different
718	      address to reach that peer depending on whether the instance is
719	      behind the same NPTv6 Translator or external to it.  If the NPTv6
720	      Translator implements hairpinning (Section 4.3), it suffices for
721	      applications to always use their external addresses.  However,
722	      this creates inefficiencies in the local network and may also
723	      complicate implementation of the NPTv6 Translator.  [RFC3484] also
724	      would prefer the private address in such a case in order to reduce
725	      those inefficiencies.

727	   o  An application instance which moves from a realm "behind" an NPTv6
728	      Translator to a realm that is "outside" the network, or vice
729	      versa, may find that it is no longer able to reach its peers at
730	      the same addresses it was previously able to use.

732	   o  An application instance which is intermittently communicating with
733	      a peer that moves from "behind" an NPTv6 Translator to "outside"
734	      of it, or vice versa, may find that it is no longer able to
735	      reach that peer at the same address that it had previously used.

737	   Many, but not all, of the applications which are adversely affected
738	   by NPTv6 Translation are those that do "referrals" - where an
739	   application instance passes its own addresses, and/or addresses of
740	   its peers, to other peers.  (Some believe referrals are inherently
741	   undesirable; others believe that they are necessary in some
742	   circumstances.  A discussion of the merits of referrals, or lack
743	   thereof, is beyond the scope of this document.)

745	   To some extent, the incidence of these difficulties can be reduced by
746	   DNS hacks that attempt to expose addresses "behind" an NPTv6
747	   Translator only to hosts which are also behind the same NPTv6
748	   Translator; and perhaps also, to expose only the "internal" addresses
749	   of hosts behind the NPTv6 Translator to other hosts behind the same
750	   NPTv6 Translator.  However, this cannot be a complete solution.  A
751	   full discussion of these issues is out of scope for this document,
752	   but briefly: (a) reliance on DNS to solve this problem depends on
753	   hosts always making queries from DNS servers in the same realm as
754	   they are (or on DNS interception proxies, which create their own
755	   problems), and on mobile hosts/applications not caching those
756	   results; (b) reliance on DNS to solve this problem depends on network
757	   administrators on all networks using such applications to reliably
758	   and accurately maintain current DNS entries for every host using
759	   those applications; and (c) reliance on DNS to solve this problem
760	   depends on applications always using DNS names, even though they
761	   often must run in environments where DNS names are not reliably
762	   maintained for every host.  Other issues are that there is often no
763	   single distinguished name for a host, no reliable way for a host to
764	   determine what DNS names are associated with it, and which names are
765	   appropriate to use in which contexts.

767	5.1.  Recommendation for network planners considering use of NPTv6
768	      Translator

770	   In light of the above, network planners considering the use of NPTv6
771	   translation should carefully consider the kinds of applications that
772	   they will need to run in the future, and determine whether the
773	   address stability and provider independence benefits are consistent
774	   with their application requirements.

776	5.2.  Recommendations for application writers

778	   Several mechanisms (e.g., STUN, TURN, ICE) have been used with
779	   traditional IPv4 NAT to circumvent some of the limitations of such
780	   devices.  Similar mechanisms could also be applied to circumvent some
781	   of the issues with NPTv6 Translator.  However, all of these require
782	   the assistance of an external server or a function co-located with
783	   the translator that can tell an "internal" host what its "external"
784	   addresses are.

786	5.3.  Recommendation for future work

788	   It might be desirable to define a general mechanism which would allow
789	   hosts within a translation domain to determine their external
790	   addresses and/or request that inbound traffic be permitted.  If such
791	   a mechanism were to be defined, it would ideally be general enough to
792	   also accommodate other types of NAT likely to be encountered by IPv6
793	   applications - in particular, IPv4/IPv6 Translation
794	   [I-D.ietf-behave-v6v4-framework] [I-D.ietf-behave-dns64]
795	   [I-D.ietf-behave-v6v4-xlate] [I-D.ietf-behave-v6v4-xlate-stateful]
796	   [RFC6052].  For this and other reasons, such a mechanism is beyond
797	   the scope of this document.

799	6.  A Note on Port Mapping

801	   In addition to overwriting IP addresses when packets are forwarded,
802	   NAPT44 devices overwrite the source port number in outbound traffic,
803	   and the destination port number in inbound traffic.  This mechanism
804	   is called "port mapping".

806	   The major benefit of port mapping is that it allows multiple
807	   computers to share a single IPv4 address.  A large number of internal
808	   IPv4 addresses (typically from one of the [RFC1918] private address
809	   spaces) can be mapped into a single external, globally routable IPv4
810	   address, with the local port number used to identify which internal
811	   node should receive each inbound packet.  This address amplification
812	   feature is not generally foreseen as a necessity at this time.

814	   Since port mapping requires re-writing a portion of the transport
815	   layer header, it requires NAPT44 devices to be aware of all of the
816	   transport protocols that they forward, thus stifling the development
817	   of new and improved transport protocols and preventing the use of
818	   IPsec encryption.  Modifying the transport layer header is
819	   incompatible with security mechanisms that encrypt the full IP
820	   payload, and restricts the NAPT44 to forwarding transport layers that
821	   use weak checksum algorithms that are easily recalculated in routers.

823	   Since there is significant detriment caused by modifying transport
824	   layer headers and very little, if any, benefit to the use of port
825	   mapping in IPv6, NPTv6 Translators that comply with this
826	   specification MUST NOT perform port mapping.

828	7.  Security Considerations

830	   When NPTv6 is deployed using either of the two-way, algorithmic
831	   mappings defined in this document, it allows direct inbound
832	   connections to internal nodes.  While this can be viewed as a benefit
833	   of NPTv6 vs. NAPT44, it does open internal nodes to attacks that
834	   would be more difficult in a NAPT44 network.  Although this situation
835	   is not substantially worse, from a security standpoint, than running
836	   IPv6 with no NAT, some enterprises may assume that an NPTv6 Translator
837	   will offer similar protection to a NAPT44 device.

839	   The port mapping mechanism in NAPT44 implementations requires that
840	   state be created in both directions.  This has led to an industry-
841	   wide perception that NAT functionality is the same as a stateful
842	   firewall.  It is not.  The translation function of the NAT only
843	   creates dynamic state in one direction and has no policy.  For this
844	   reason, it is RECOMMENDED that NPTv6 Translators also implement
845	   firewall functionality such as described in [RFC6092], with
846	   appropriate configuration options including turning it on or off.

848	   When [RFC4864] talks about randomizing the subnet identifier, the
849	   idea is to make it harder for worms to guess a valid subnet
850	   identifier at an advertised network prefix.  This should not be
851	   interpreted as endorsing concealing the subnet identifier behind the
852	   obfuscating function of a translator such as NPTv6.  [RFC4864]
853	   specifically talks about how to obtain the desired properties of
854	   concealment without using a translator.  Topology hiding when using
855	   NAT is often ineffective in environments where the topology is
856	   visible in application layer messaging protocols such as DNS, SIP,
857	   SMTP, etc.  If the information were not available through the
858	   application layer, [RFC2993] would not be valid.

860	8.  IANA Considerations

862	   This document has no IANA considerations.

864	9.  Acknowledgements

866	   The checksum-neutral algorithmic address mapping described in this
867	   document is based on e-mail written by Iljitsch van Beijnum.

869	   The following people provided advice or review comments that
870	   substantially improved this document: Christian Huitema, Dave Thaler,
871	   Ed Jankiewicz, Eric Kline, Iljitsch van Beijnum, Jari Arkko, Keith
872	   Moore, Mark Townsley, Merike Kaeo, Ralph Droms, Remi Despres, Steve
873	   Blake, and Tony Hain.

875	   This document was written using the xml2rfc tool described in RFC
876	   2629 [RFC2629].

878	10.  Change Log

880	   This section should be removed by the RFC Editor.

882	10.1.  Changes Between draft-mrw-behave-nat66-00 and -01

884	   There were several minor changes made between the *behave-nat66-00
885	   and -01 versions of this draft:

887	   o  Added Fred Baker as a co-author.

889	   o  Minor arithmetic corrections.

891	   o  Added AH to paragraph on NAT security issues.

893	   o  Added additional NAT topologies to overview (diagrams TBD).

895	10.2.  Changes between *behave-nat66-01 and -02

897	   There were further changes made between *behave-nat66-01 and -02:

899	   o  Removed topology hiding mechanism.

901	   o  Added diagrams.

903	   o  Made minor updates based on mailing list feedback.

905	   o  Added discussion of IPv6 SAF document.

907	   o  Added applicability section.

909	   o  Added discussion of Address Independence requirement.

911	   o  Added hairpinning requirement and discussion of applicability of
912	      other NAT behavioral requirements.

914	10.3.  Changes between *nat66-00 and *nat66-01

916	   There were further changes made between nat66-00 and nat66-01:

918	   o  Added mapping for prefixes longer than /48.

920	   o  Change draft name to remove reference to the behave WG.

922	   o  Resolved various open issues and fixed typos.

924	10.4.  Changes between *nat66-01 and *nat66-02

926	   o  Change the acronym "NAT66" to "NPTv6", so people don't read "NAT"
927	      and MEGO.

929	   o  Change the term used to refer to the function from "NAT66 device"
930	      to "NPTv6 Translator".  It's not a "device" function, it's a
931	      function that is applied between two interfaces.  Consider a
932	      router with two upstreams and two legs in the local network; it
933	      will not translate between the local legs, but will translate to
934	      and from each upstream, and be configured differently for each of
935	      the two ISPs.

937	   o  Comment specifically on the security aspects.

939	   o  Comment specifically on the application issues raised on this
940	      list.

942	   o  Comment specifically on multihoming, load-sharing, and asymmetric
943	      routing.

945	   o  Spell out the hairpinning requirement and its implications.

947	   o  Spell out the service provider side of Address Independence.

949	   o  00 focuses on the edge's view
950	   o  Detail the algorithm in a manner clearer to the implementor (I
951	      think)

953	   o  Spell out the case for GSE-style DMZs between the edge and the
954	      transit network, which is about the implications for the global
955	      routing table.

957	   o  Refer to [RFC6092] as a CPE firewall description.

959	10.5.  Changes between *nat66-02 and *nat66-03

961	   o  Added an appendix on Verification code

963	   o  Various minor markups in response to Ralph Droms

965	10.6.  Changes between *nat66-03 and *nat66-04

967	   o  Markups in response to Christian Huitema, mostly surrounding the
968	      issue of subnet 0xFFFF.

970	   o  Refer to [I-D.ietf-v6ops-ipv6-cpe-router] for CPE router
971	      requirements.

973	10.7.  Changes between *nat66-04 and *nat66-05

975	   o  Update statistics in appendix A per BGP report of 17 December 2010

977	   o  Update security considerations using text supplied by Merike Kaeo.

979	10.8.  Changes between *nat66-05 and *nat66-06

981	   o  restore a code snippet inadvertently removed in version -05

983	10.9.  Changes between *nat66-06 and *nat66-07

985	   o  Changed requested status to experimental

987	   o  Incorporated comments from Eric Kline

989	10.10.  Changes between *nat66-07 and *nat66-08

991	   The section on Application Considerations was expanded after
992	   discussion with Keith Moore.

994	11.  References
995	11.1.  Normative References

997	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
998	              Requirement Levels", BCP 14, RFC 2119, March 1997.

1000	   [RFC2526]  Johnson, D. and S. Deering, "Reserved IPv6 Subnet Anycast
1001	              Addresses", RFC 2526, March 1999.

1003	   [RFC4193]  Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast
1004	              Addresses", RFC 4193, October 2005.

1006	   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
1007	              Architecture", RFC 4291, February 2006.

1009	   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
1010	              Message Protocol (ICMPv6) for the Internet Protocol
1011	              Version 6 (IPv6) Specification", RFC 4443, March 2006.

1013	   [RFC4787]  Audet, F. and C. Jennings, "Network Address Translation
1014	              (NAT) Behavioral Requirements for Unicast UDP", BCP 127,
1015	              RFC 4787, January 2007.

1017	11.2.  Informative References

1019	   [GSE]      O'Dell, M., "GSE - An Alternate Addressing Architecture
1020	              for IPv6", February 1997,
1021	              <http://tools.ietf.org/id/draft-ietf-ipngwg-gseaddr>.

1023	   [I-D.ietf-behave-dns64]
1024	              Bagnulo, M., Sullivan, A., Matthews, P., and I. Beijnum,
1025	              "DNS64: DNS extensions for Network Address Translation
1026	              from IPv6 Clients to IPv4 Servers",
1027	              draft-ietf-behave-dns64-11 (work in progress),
1028	              October 2010.

1030	   [I-D.ietf-behave-v6v4-framework]
1031	              Baker, F., Li, X., Bao, C., and K. Yin, "Framework for
1032	              IPv4/IPv6 Translation",
1033	              draft-ietf-behave-v6v4-framework-10 (work in progress),
1034	              August 2010.

1036	   [I-D.ietf-behave-v6v4-xlate]
1037	              Li, X., Bao, C., and F. Baker, "IP/ICMP Translation
1038	              Algorithm", draft-ietf-behave-v6v4-xlate-23 (work in
1039	              progress), September 2010.

1041	   [I-D.ietf-behave-v6v4-xlate-stateful]
1042	              Bagnulo, M., Matthews, P., and I. Beijnum, "Stateful
1043	              NAT64: Network Address and Protocol Translation from IPv6
1044	              Clients to IPv4 Servers",
1045	              draft-ietf-behave-v6v4-xlate-stateful-12 (work in
1046	              progress), July 2010.

1048	   [I-D.ietf-v6ops-ipv6-cpe-router]
1049	              Singh, H., Beebee, W., Donley, C., Stark, B., and O.
1050	              Troan, "Basic Requirements for IPv6 Customer Edge
1051	              Routers", draft-ietf-v6ops-ipv6-cpe-router-09 (work in
1052	              progress), December 2010.

1054	   [RFC1071]  Braden, R., Borman, D., Partridge, C., and W. Plummer,
1055	              "Computing the Internet checksum", RFC 1071,
1056	              September 1988.

1058	   [RFC1624]  Rijsinghani, A., "Computation of the Internet Checksum via
1059	              Incremental Update", RFC 1624, May 1994.

1061	   [RFC1918]  Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and
1062	              E. Lear, "Address Allocation for Private Internets",
1063	              BCP 5, RFC 1918, February 1996.

1065	   [RFC2629]  Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629,
1066	              June 1999.

1068	   [RFC2827]  Ferguson, P. and D. Senie, "Network Ingress Filtering:
1069	              Defeating Denial of Service Attacks which employ IP Source
1070	              Address Spoofing", BCP 38, RFC 2827, May 2000.

1072	   [RFC2993]  Hain, T., "Architectural Implications of NAT", RFC 2993,
1073	              November 2000.

1075	   [RFC3484]  Draves, R., "Default Address Selection for Internet
1076	              Protocol version 6 (IPv6)", RFC 3484, February 2003.

1078	   [RFC4864]  Van de Velde, G., Hain, T., Droms, R., Carpenter, B., and
1079	              E. Klein, "Local Network Protection for IPv6", RFC 4864,
1080	              May 2007.

1082	   [RFC6052]  Bao, C., Huitema, C., Bagnulo, M., Boucadair, M., and X.
1083	              Li, "IPv6 Addressing of IPv4/IPv6 Translators", RFC 6052,
1084	              October 2010.

1086	   [RFC6092]  Woodyatt, J., "Recommended Simple Security Capabilities in
1087	              Customer Premises Equipment (CPE) for Providing
1088	              Residential IPv6 Internet Service", RFC 6092,
1089	              January 2011.

1091	Appendix A.  Why GSE?

1093	   For the purpose of this discussion, let us over-simplify the
1094	   Internet's structure by distinguishing between two broad classes of
1095	   networks: transit and edge.  A "transit network", in this context, is
1096	   a network that provides connectivity services to other networks.  Its
1097	   AS number may show up in a non-final position in BGP AS paths, or in
1098	   the case of mobile and residential broadband networks, it may offer
1099	   network services to smaller networks that can't justify RIR
1100	   membership.  An "edge network", in contrast, is any network that is
1101	   not a transit network; it is the ultimate customer, and while it
1102	   provides internal connectivity for its own use, it is in other
1103	   respects a consumer of transit services.  In terms of routing, a
1104	   network in the transit domain generally needs some way to make
1105	   choices about how it routes to other networks; an edge network is
1106	   generally quite satisfied with a simple default route.

1108	   The [GSE] proposal, and as a result this proposal (which is similar
1109	   to GSE in most respects and inspired by it), responds directly to
1110	   current concerns in the RIR communities.  Edge networks are used to
1111	   an environment in IPv4 in which their addressing is disjoint from
1112	   that of their upstream transit networks; it is either provider
1113	   independent, or a network prefix translator makes their external
1114	   address distinct from their internal address, and they like the
1115	   distinction.  In IPv6, there is a mantra that edge network addresses
1116	   should be derived from their upstream, and if they have multiple
1117	   upstreams, edge networks are expected to design their networks to use
1118	   all of those prefixes equivalently.  They see this as unnecessary and
1119	   unwanted operational complexity, and are as a result pushing very
1120	   hard in the RIR communities for provider independent addressing.

1122	   Widespread use of provider independent addressing has a natural and
1123	   perhaps unavoidable side-effect that is likely to be very expensive
1124	   in the long term.  It means that the routing table will enumerate the
1125	   networks at the edge of the transit domain, the edge networks, rather
1126	   than enumerating the transit domain.  Per the BGP Update Report of 17
1127	   December 2010, there are currently over 36,000 Autonomous Systems
1128	   being advertised in BGP, of which over 15,000 advertise only one
1129	   prefix.  There are in the neighborhood of 5000 ASs that show up in a
1130	   non-final position in AS paths, and perhaps another 5000 networks
1131	   whose AS numbers are terminal in more than one AS path.  Of the
1132	   36,264 Autonomous Systems being advertised in BGP, 31,137 provide no
1133	   visible transit service to another AS, and 23,595 of those are
1134	   visible in only one AS path (have only one upstream network).  In
1135	   addition, 15,439 of the 36,264 ASs advertise only a single prefix.
1136	   In other words, we have prefixes for some 36,000 transit and edge
1137	   networks in the route table now, many of which arguably need an
1138	   Autonomous System number only for multihoming.  However, the vast
1139	   majority of networks (2/3) having the tools necessary to multihome
1140	   are not visibly doing so, and would be well served by any solution
1141	   that gives them address independence without the overhead of RIR
1142	   membership and BGP routing.

1148	   Current growth estimates suggest that we could easily see that
1149	   number reach the order of 10,000,000 within fifteen years.  Tens of
1150	   thousands of entries in the route table is very survivable; while
1151	   our protocols and computers will likely do quite well with tens of
1152	   millions of routes, the heat produced and power consumed by those
1153	   routers, and the inevitable impact on the cost of those routers, is
1154	   not a good outcome.  To avoid having a massive and unscalable route
1155	   table, we need to find a way that is politically acceptable and
1156	   returns us to enumerating the transit domain, not the edge.

1158	   There have been a number of proposals.  As described, shim6 moves the
1159	   complexity to the edge, and the edge is rebelling.  Geographic
1160	   addressing in essence forces ISPs to "own" geographic territory from
1161	   a routing perspective, as otherwise there is no clue in the address
1162	   as to what network a datagram should be delivered to in order to
1163	   reach it.  Metropolitan Addressing can imply regulatory authority,
1164	   and even if it is implemented using internet exchange consortia,
1165	   visits a great deal of complexity on the transit networks that
1166	   directly serve the edge.  The one that is likely to be most
1167	   acceptable is any proposal that enables an edge network to be
1168	   operationally independent of its upstreams, with no obligation to
1169	   renumber when it adds, drops, or changes ISPs, and with no additional
1170	   burden placed either on the ISP or the edge network as a result.
1171	   From an application perspective, an additional operational
1172	   requirement, in the words of US NIST's Roadmap for the Smart Grid,
1173	   is that

1175	      "...the Network should enable an application in a particular
1176	      domain to communicate with an application in any other domain in
1177	      the information network, with proper management control over who
1178	      and where applications can be interconnected."

1180	   In other words, the structure of the network should allow for and
1181	   enable appropriate access control, but the structure of the network
1182	   should not inherently limit access.

1184	   The GSE model, by statelessly translating the prefix between an edge
1185	   network and its upstream transit network, accomplishes that with a
1186	   minimum of fuss and bother.  Stated in the simplest terms, it enables
1187	   the edge network to behave as if it has a provider-independent prefix
1188	   from a multihoming and renumbering perspective without the overhead
1189	   of RIR membership or maintaining BGP connectivity, and it enables the
1190	   transit networks to aggressively aggregate what are from their
1191	   perspective provider-allocated customer prefixes, to maintain a
1192	   rational-sized routing table.

1194	Appendix B.  Verification code

1196	   This non-normative appendix is presented as a proof of concept.  It
1197	   is in no sense optimized; for example, one's complement arithmetic is
1198	   implemented in portable subroutines, where operational
1199	   implementations might use one's complement arithmetic instructions
1200	   through a pragma; such implementations probably need to explicitly
1201	   force 0xFFFF to 0x0000, as the instruction will not.  The original
1202	   purpose of the code was to verify whether or not it was necessary to
1203	   suppress 0xFFFF by overwriting with zero, and whether predicted
1204	   issues with subnet numbering were real.

1206	   The point is to

1208	   o  demonstrate that if one or the other representation of zero is not
1209	      used in the word the checksum is updated in, the program maps
1210	      inner and outer addresses in a manner that is, mathematically, 1:1
1211	      and onto (each inner address maps to a unique outer address, and
1212	      that outer address maps back to exactly the same inner address),
1213	      and

1215	   o  give guidance on the suppression of 0xFFFF checksums.

1217	   In short, in one's complement arithmetic, x-x=0, but the result takes
1218	   the negative representation of zero (0xFFFF).  If 0xFFFF results are
1219	   forced to the value 0x0000, as is recommended in [RFC1071], the word
1220	   the checksum is adjusted in cannot initially be 0xFFFF, as on the
1221	   return it will be forced to 0.  If 0xFFFF results are not forced to
1222	   the value 0x0000 as is recommended in [RFC1071], the word the
1223	   checksum is adjusted in cannot initially be 0, as on the return it
1224	   will be calculated as 0+(~0) = 0xFFFF.  We chose to follow
1225	   [RFC1071]'s recommendations, which implies a requirement to not use
1226	   0xFFFF as a subnet number in networks with a /48 external prefix.

1228	   /*
1229	    * Copyright (c) 2010 IETF Trust and the persons identified as
1230	    * authors of the code.  All rights reserved.  Redistribution
1231	    * and use in source and binary forms, with or without
1232	    * modification, are permitted provided that the following
1233	    * conditions are met:

1235	    *
1236	    * o  Redistributions of source code must retain the above
1237	    *    copyright notice, this list of conditions and the
1238	    *    following disclaimer.
1239	    *
1240	    * o  Redistributions in binary form must reproduce the above
1241	    *    copyright notice, this list of conditions and the
1242	    *    following disclaimer in the documentation and/or other
1243	    *    materials provided with the distribution.
1244	    *
1245	    * o  Neither the name of Internet Society, IETF or IETF Trust,
1246	    *    nor the names of specific contributors, may be used to
1247	    *    endorse or promote products derived from this software
1248	    *    without specific prior written permission.
1249	    *
1250	    * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
1251	    * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
1252	    * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
1253	    * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
1254	    * DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT OWNER OR
1255	    * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
1256	    * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
1257	    * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
1258	    * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
1259	    * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
1260	    * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
1261	    * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
1262	    * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
1263	    */
1264	   #include <stdio.h>
1265	   #include <assert.h>
1266	   #include <stdlib.h>
1267	   /* program to verify the NPTv6 algorithm
1268	    *
1269	    * argument:
1270	    *     perform negative zero suppression: boolean
1271	    *
1272	    * method:
1273	    *    We specify an internal and an external prefix. The prefix
1274	    *    length is presumed to be the common length of both, and
1275	    *    here is a /48. We perform the three algorithms specified.
1276	    *    The "packet" address is in effect the source address
1277	    *    internal->external and the destination address
1278	    *    external->internal.
1279	    */
1280	   unsigned short  inner_init[] = {
1281	       0xFD01, 0x0203, 0x0405, 1, 2, 3, 4, 5};
1282	   unsigned short  outer_init[] = {
1283	       0x2001, 0x0db8, 0x0001, 1, 2, 3, 4, 5};
1284	   unsigned short  inner[8];
1285	   unsigned short  packet[8];
1286	   unsigned char   checksum[65536] = {0};
1287	   unsigned short  outer[8];
1288	   unsigned short  adjustment;
1289	   unsigned short  suppress;
1290	   /*
1291	    * One's complement sum.
1292	    * return number1 + number2
1293	    */
1294	   unsigned short
1295	   add1(unsigned short number1, unsigned short number2)
1298	   {
1299	       unsigned int    result;

1301	       result = number1;
1302	       result += number2;
1303	       if (suppress) {
1304	           while (0xFFFF <= result) {
1305	               result = result + 1 - 0x10000;
1306	           }
1307	       } else {
1308	           while (0xFFFF < result) {
1309	               result = result + 1 - 0x10000;
1310	           }
1311	       }
1312	       return result;
1313	   }

1315	   /*
1316	    * One's complement difference
1317	    * return number1 - number2
1318	    */
1319	   unsigned short
1320	   sub1(unsigned short number1, unsigned short number2)
1323	   {
1324	       return add1(number1, (unsigned short)~number2);
1325	   }

1327	   /*
1328	    * return one's complement sum of an array of numbers
1329	    */
1330	   unsigned short
1331	   sum1(unsigned short *numbers, int count)
1334	   {
1335	       unsigned int    result;

1337	       result = *numbers++;
1338	       while (--count > 0) {
1339	           result += *numbers++;
1340	       }

1342	       if (suppress) {
1343	           while (0xFFFF <= result) {
1344	               result = result + 1 - 0x10000;
1345	           }
1346	       } else {
1347	           while (0xFFFF < result) {
1348	               result = result + 1 - 0x10000;
1349	           }
1350	       }
1351	       return result;
1352	   }

1354	   /*
1355	    * NPTv6 initialization: section 3.1 assuming section 3.4
1356	    *
1357	    * create the /48, a source address in internal format, and a
1358	    * source address in external format. calculate the adjustment
1359	    * if one /48 is overwritten with the other.
1360	    */
1361	   void
1362	   nptv6_initialization(unsigned short subnet)
1364	   {
1365	       int             i;
1366	       unsigned short  inner48;
1367	       unsigned short  outer48;

1369	       /* initialize the internal and external prefixes. */
1370	       for (i = 0; i < 8; i++) {
1371	           inner[i] = inner_init[i];
1372	           outer[i] = outer_init[i];
1373	       }
1374	       inner[3] = subnet;
1375	       outer[3] = subnet;
1376	       /* calculate the checksum adjustment */
1377	       inner48 = sum1(inner, 3);
1378	       outer48 = sum1(outer, 3);
1379	       adjustment = sub1(inner48, outer48);
1380	   }

1382	   /*
1383	    * NPTv6 packet from edge to transit: section 3.2 assuming
1384	    * section 3.4
1385	    *
1386	    * overwrite the prefix in the source address with the outer
1387	    * prefix, and adjust the checksum
1388	    */
1389	   void
1390	   nptv6_inner_to_outer(void)
1391	   {
1392	       int             i;

1394	       /* let's get the source address into the packet */
1395	       for (i = 0; i < 8; i++) {
1396	           packet[i] = inner[i];
1397	       }

1399	       /* overwrite the prefix with the outer prefix */
1400	       for (i = 0; i < 3; i++) {
1401	           packet[i] = outer[i];
1402	       }

1404	       /* adjust the checksum */
1405	       packet[3] = add1(packet[3], adjustment);
1406	   }

1408	   /*
1409	    * NPTv6 packet from transit to edge: section 3.3 assuming
1410	    * section 3.4
1411	    *
1412	    * overwrite the prefix in the destination address with the
1413	    * inner prefix, and adjust the checksum
1414	    */
1415	   void
1416	   nptv6_outer_to_inner(void)
1417	   {
1418	       int             i;

1420	       /* overwrite the prefix with the inner prefix */
1421	       for (i = 0; i < 3; i++) {
1422	           packet[i] = inner[i];
1423	       }

1425	       /* adjust the checksum */
1426	       packet[3] = sub1(packet[3], adjustment);

1428	   }

1430	   /*
1431	    * main program
1432	    */
1433	   int
1434	   main(int argc, char **argv)
1436	   {
1437	       unsigned        subnet;
1438	       int             i;

1440	       if (argc < 2) {
1441	           fprintf(stderr, "usage: nptv6 suppression\n");
1442	           return 1;
1443	       }
1444	       suppress = atoi(argv[1]);
1445	       assert(suppress <= 1);

1447	       for (subnet = 0; subnet < 0x10000; subnet++) {
1448	           /* section 3.1: initialize the system */
1449	           nptv6_initialization(subnet);

1451	           /* section 3.2: take a packet from inside to outside */
1452	           nptv6_inner_to_outer();

1454	           /* the resulting checksum-adjusted word should be unique */
1455	           if (checksum[packet[3]]) {
1456	               printf("inner->outer duplicated checksum: "
1457	                      "inner: %x:%x:%x:%x:%x:%x:%x:%x(%x) "
1458	                      "calculated: %x:%x:%x:%x:%x:%x:%x:%x(%x)\n",
1459	                      inner[0], inner[1], inner[2], inner[3],
1460	                      inner[4], inner[5], inner[6], inner[7],
1461	                      sum1(inner, 8),
1462	                      packet[0], packet[1], packet[2], packet[3],
1463	                      packet[4], packet[5], packet[6], packet[7],
1464	                      sum1(packet, 8));
1465	           }

1467	           checksum[packet[3]] = 1;

1469	           /*
1470	            * the resulting checksum should be the same as the inner
1471	            * address's checksum
1472	            */
1473	           if (sum1(packet, 8) != sum1(inner, 8)) {
1474	               printf("inner->outer incorrect: "
1475	                      "inner: %x:%x:%x:%x:%x:%x:%x:%x(%x) "
1476	                      "calculated: %x:%x:%x:%x:%x:%x:%x:%x(%x)\n",
1477	                      inner[0], inner[1], inner[2], inner[3],
1478	                      inner[4], inner[5], inner[6], inner[7],
1479	                      sum1(inner, 8),
1480	                      packet[0], packet[1], packet[2], packet[3],
1481	                      packet[4], packet[5], packet[6], packet[7],
1482	                      sum1(packet, 8));
1483	           }

1485	           /* section 3.3: take a packet from outside to inside */
1486	           nptv6_outer_to_inner();

1488	           /*
1489	            * the returning packet should have the same checksum it
1490	            * left with
1491	            */
1492	           if (sum1(packet, 8) != sum1(inner, 8)) {
1493	               printf("outer->inner checksum incorrect: "
1494	                      "calculated: %x:%x:%x:%x:%x:%x:%x:%x(%x) "
1495	                      "inner: %x:%x:%x:%x:%x:%x:%x:%x(%x)\n",
1496	                      packet[0], packet[1], packet[2], packet[3],
1497	                      packet[4], packet[5], packet[6], packet[7],
1498	                      sum1(packet, 8), inner[0], inner[1], inner[2],
1499	                      inner[3], inner[4], inner[5], inner[6],
1500	                      inner[7], sum1(inner, 8));
1501	           }

1503	           /*
1504	            * and every 16-bit word should calculate back to the same
1505	            * inner value
1506	            */
1507	           for (i = 0; i < 8; i++) {
1508	               if (inner[i] != packet[i]) {
1509	                   printf("outer->inner different: "
1510	                          "calculated: %x:%x:%x:%x:%x:%x:%x:%x "
1511	                          "inner: %x:%x:%x:%x:%x:%x:%x:%x\n",
1512	                          packet[0], packet[1], packet[2], packet[3],
1513	                          packet[4], packet[5], packet[6], packet[7],
1514	                          inner[0], inner[1], inner[2], inner[3],
1515	                          inner[4], inner[5], inner[6], inner[7]);
1516	                   break;
1517	               }
1518	           }
1519	       }
1520	   }

1522	Authors' Addresses

1524	   Margaret Wasserman
1525	   Painless Security
1526	   North Andover, MA  01845
1527	   USA

1529	   Phone: +1 781 405 7464
1530	   Email: mrw@painless-security.com
1531	   URI:   http://www.painless-security.com

1533	   Fred Baker
1534	   Cisco Systems
1535	   Santa Barbara, California  93117
1536	   USA

1538	   Phone: +1-408-526-4257
1539	   Email: fred@cisco.com