idnits 2.17.1

draft-mrw-nat66-15.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC2119]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 1360 has weird spacing: '...d short inner...'

  == Line 1362 has weird spacing: '...d short outer...'

  == Line 1364 has weird spacing: '...d short inner...'

  == Line 1365 has weird spacing: '...d short datag...'

  == Line 1366 has weird spacing: '...ed char chec...'

  == (12 more instances...)

  -- The document date (April 24, 2011) is 4751 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.
Checking references for intended status: Experimental ---------------------------------------------------------------------------- -- Looks like a reference, but probably isn't: '3' on line 1594 -- Looks like a reference, but probably isn't: '1' on line 1593 -- Looks like a reference, but probably isn't: '0' on line 1593 -- Looks like a reference, but probably isn't: '2' on line 1594 -- Looks like a reference, but probably isn't: '4' on line 1594 -- Looks like a reference, but probably isn't: '5' on line 1594 -- Looks like a reference, but probably isn't: '6' on line 1595 -- Looks like a reference, but probably isn't: '7' on line 1595 -- Obsolete informational reference (is this intentional?): RFC 2629 (Obsoleted by RFC 7749) -- Obsolete informational reference (is this intentional?): RFC 3484 (Obsoleted by RFC 6724) -- Obsolete informational reference (is this intentional?): RFC 5245 (Obsoleted by RFC 8445, RFC 8839) -- Obsolete informational reference (is this intentional?): RFC 5389 (Obsoleted by RFC 8489) -- Obsolete informational reference (is this intentional?): RFC 5766 (Obsoleted by RFC 8656) -- Obsolete informational reference (is this intentional?): RFC 5996 (Obsoleted by RFC 7296) Summary: 1 error (**), 0 flaws (~~), 7 warnings (==), 16 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group M. Wasserman 3 Internet-Draft Painless Security 4 Intended status: Experimental F. 
Baker 5 Expires: October 26, 2011 Cisco Systems 6 April 24, 2011 8 IPv6-to-IPv6 Network Prefix Translation 9 draft-mrw-nat66-15 11 Abstract 13 This document describes a stateless, transport-agnostic IPv6-to-IPv6 14 Network Prefix Translation (NPTv6) function that provides the address 15 independence benefit associated with IPv4-to-IPv4 NAT (NAPT44), and 16 in addition provides a 1:1 relationship between addresses in the 17 "inside" and "outside" prefixes, preserving end to end reachability 18 at the network layer. 20 Requirements Terminology 22 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 23 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 24 document are to be interpreted as described in RFC 2119 [RFC2119]. 26 Status of this Memo 28 This Internet-Draft is submitted in full conformance with the 29 provisions of BCP 78 and BCP 79. 31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF). Note that other groups may also distribute 33 working documents as Internet-Drafts. The list of current Internet- 34 Drafts is at http://datatracker.ietf.org/drafts/current/. 36 Internet-Drafts are draft documents valid for a maximum of six months 37 and may be updated, replaced, or obsoleted by other documents at any 38 time. It is inappropriate to use Internet-Drafts as reference 39 material or to cite them other than as "work in progress." 41 This Internet-Draft will expire on October 26, 2011. 43 Copyright Notice 45 Copyright (c) 2011 IETF Trust and the persons identified as the 46 document authors. All rights reserved. 48 This document is subject to BCP 78 and the IETF Trust's Legal 49 Provisions Relating to IETF Documents 50 (http://trustee.ietf.org/license-info) in effect on the date of 51 publication of this document. Please review these documents 52 carefully, as they describe your rights and restrictions with respect 53 to this document. 
Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License. 58 Table of Contents 60 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 61 1.1. What is Address Independence? . . . . . . . . . . . . . . 5 62 1.2. NPTv6 Applicability . . . . . . . . . . . . . . . . . . . 6 63 2. NPTv6 Overview . . . . . . . . . . . . . . . . . . . . . . . . 8 64 2.1. NPTv6: the simplest case . . . . . . . . . . . . . . . . 8 65 2.2. NPTv6 between peer networks . . . . . . . . . . . . . . . 9 66 2.3. NPTv6 redundancy and load-sharing . . . . . . . . . . . . 9 67 2.4. NPTv6 multihoming . . . . . . . . . . . . . . . . . . . . 10 68 2.5. Mapping with No Per-Flow State . . . . . . . . . . . . . 11 69 2.6. Checksum-Neutral Mapping . . . . . . . . . . . . . . . . 11 70 3. NPTv6 Algorithmic Specification . . . . . . . . . . . . . . . 12 71 3.1. NPTv6 configuration calculations . . . . . . . . . . . . 12 72 3.2. NPTv6 translation, internal network to external 73 network . . . . . . . . . . . . . . . . . . . . . . . . . 13 74 3.3. NPTv6 translation, external network to internal 75 network . . . . . . . . . . . . . . . . . . . . . . . . . 13 76 3.4. NPTv6 with a /48 or shorter prefix . . . . . . . . . . . 13 77 3.5. NPTv6 with a /49 or longer prefix . . . . . . . . . . . . 14 78 3.6. /48 Prefix Mapping Example . . . . . . . . . . . . . . . 14 79 3.7. Address Mapping for Longer Prefixes . . . . . . . . . . . 15 80 4. Implications of Network Address Translator Behavioral 81 Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 15 82 4.1. Prefix configuration and generation . . . . . . . . . . . 16 83 4.2. Subnet numbering . . . . . . . . . . . . . . . . . . . . 16 84 4.3. NAT Behavioral Requirements . . . . . . . . . . . . . . . 16 85 5. Implications for Applications . . . . . . . . . . . . . . . . 
17 86 5.1. Recommendation for network planners considering use 87 of NPTv6 Translation . . . . . . . . . . . . . . . . . . 18 88 5.2. Recommendations for application writers . . . . . . . . . 18 89 5.3. Recommendation for future work . . . . . . . . . . . . . 19 90 6. A Note on Port Mapping . . . . . . . . . . . . . . . . . . . . 19 91 7. Security Considerations . . . . . . . . . . . . . . . . . . . 20 92 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 93 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 20 94 10. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 21 95 10.1. Changes Between draft-mrw-behave-nat66-00 and -01 . . . . 21 96 10.2. Changes between *behave-nat66-01 and -02 . . . . . . . . 21 97 10.3. Changes between *nat66-00 and *nat66-01 . . . . . . . . . 22 98 10.4. Changes between *nat66-01 and *nat66-02 . . . . . . . . . 22 99 10.5. Changes between *nat66-02 and *nat66-03 . . . . . . . . . 23 100 10.6. Changes between *nat66-03 and *nat66-04 . . . . . . . . . 23 101 10.7. Changes between *nat66-04 and *nat66-05 . . . . . . . . . 23 102 10.8. Changes between *nat66-05 and *nat66-06 . . . . . . . . . 23 103 10.9. Changes between *nat66-06 and *nat66-07 . . . . . . . . . 23 104 10.10. Changes between *nat66-07 and *nat66-08 . . . . . . . . . 23 105 10.11. Changes up to *nat66-10 . . . . . . . . . . . . . . . . . 23 106 10.12. Changes up to *nat66-11 and -12 . . . . . . . . . . . . . 23 107 10.13. Changes for *nat66-13 . . . . . . . . . . . . . . . . . . 24 108 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 24 109 11.1. Normative References . . . . . . . . . . . . . . . . . . 24 110 11.2. Informative References . . . . . . . . . . . . . . . . . 24 111 Appendix A. Why GSE? . . . . . . . . . . . . . . . . . . . . . . 26 112 Appendix B. Verification code . . . . . . . . . . . . . . . . . . 28 113 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 35 115 1. 
Introduction 117 This document describes a stateless IPv6-to-IPv6 Network Prefix 118 Translation (NPTv6) function, designed to provide address 119 independence to the edge network. It is transport-agnostic with 120 respect to transports that don't checksum the IP header, such as 121 SCTP, and to transports that use the TCP/UDP/DCCP pseudo-header and 122 checksum [RFC1071]. 124 Note that, for reasons discussed in [RFC2993] and Section 5, the IETF 125 does not generally recommend the use of Network Address Translation 126 technology for IPv6. Where Network Address Translation is 127 implemented, however, this specification provides a mechanism that 128 has fewer architectural problems than merely implementing a 129 traditional IPv4 NAT in an IPv6 environment. Some problems remain, 130 however, and the reader should consult Section 5, [RFC4864], and 131 [RFC5902] for the implications and for approaches that help avoid all 132 types of NATs. 134 The stateless approach described in this document has several 135 ramifications: 137 o Any security benefit that NAPT44 might offer is not present in 138 NPTv6, necessitating the use of a firewall to obtain those 139 benefits if desired. An example of such a firewall is described 140 in [RFC6092]. 142 o End-to-end reachability is preserved, although the address used 143 "inside" the edge network differs from the address used "outside" 144 the edge network. This has implications for application referrals 145 and other uses of Internet layer addresses. 147 o If there are multiple identically-configured prefix translators 148 between two networks, there is no need for them to exchange 149 dynamic state, as there is no dynamic state - the algorithmic 150 translation will be identical across each of them. The network 151 can therefore asymmetrically route, load-share, and fail-over 152 among them without issue. 154 o Since translation is 1:1 at the network layer, there is no need to 155 modify port numbers or other transport parameters.
157 o TCP sessions that authenticate peers using the TCP Authentication 158 Option [RFC5925] cannot have their addresses translated, as the 159 addresses are used in the calculation of the Message 160 Authentication Code. This consideration applies in general to any 161 UNilateral Self-Address Fixing (UNSAF) [RFC3424] protocol, whose 162 deployment the IAB recommends against in an environment 163 that changes Internet addresses. 165 o Applications using the Internet Key Exchange Protocol Version 2 166 (IKEv2) [RFC5996] should, at least in theory, detect the presence 167 of the translator; while no NAT traversal solution is required, 168 [RFC5996] would require such sessions to use UDP. 170 1.1. What is Address Independence? 172 For the purposes of this document, IPv6 Address Independence consists 173 of the following set of properties: 175 From the perspective of the edge network: 177 * The IPv6 addresses used inside the local network (for 178 interfaces, access lists, and logs) do not need to be 179 renumbered if the global prefix(es) assigned for use by the 180 edge network are changed. 182 * The IPv6 addresses used inside the edge network (for 183 interfaces, access lists, and logs) or within other upstream 184 networks (such as when multihoming) do not need to be 185 renumbered when a site adds, drops, or changes upstream 186 networks. 188 * It is not necessary for an administration to convince an 189 upstream network to route its internal IPv6 prefixes, or 190 to advertise prefixes derived from other upstream networks into 191 that upstream network. 193 * Unless the edge network wants to optimize routing between multiple upstream 194 networks in the process of multihoming, there is no 195 need for a BGP exchange with the upstream network.
197 From the perspective of the upstream network: 199 * IPv6 addresses used by the edge network are guaranteed to have 200 a provider-allocated prefix, eliminating the need and concern 201 for BCP 38 [RFC2827] ingress filtering and the advertisement of 202 customer-specific prefixes. 204 Thus, address independence has ramifications for the edge network, 205 networks it directly connects with (especially its upstream 206 networks), and for the Internet as a whole. The desire for address 207 independence has been a primary driver for IPv4 NAT deployment in 208 medium to large-sized enterprise networks, including NAT deployments 209 in enterprises that have plenty of IPv4 provider independent address 210 space (from IPv4 "swamp space"). It has also been a driver for edge 211 networks to become members of Regional Internet Registry (RIR) 212 communities, seeking to obtain BGP Autonomous System Numbers and 213 provider independent prefixes, and as a result has been one of the 214 drivers of the explosion of the IPv4 route table. Service providers 215 have stated that the lack of address independence from their 216 customers has been a negative incentive to deployment, due to the 217 impact of customer routing expected in their networks. 219 The Local Network Protection [RFC4864] document discusses a related 220 concept called "Address Autonomy" as a benefit of NAPT44. [RFC4864] 221 indicates that address autonomy can be achieved by the simultaneous 222 use of global addresses on all nodes within a site that need external 223 connectivity, and Unique Local Addresses (ULAs) [RFC4193] for all 224 internal communication. However, this solution fails to meet the 225 requirement for address independence, because if an ISP renumbering 226 event occurs, all of the hosts, routers, DHCP servers, ACLs, 227 firewalls and other internal systems that are configured with global 228 addresses from the ISP will need to be renumbered before global 229 connectivity is fully restored. 
231 The use of IPv6 Provider Independent (PI) addresses has also been 232 suggested as a means to fulfill the address independence requirement. 233 However, this solution requires that an enterprise qualify to receive 234 a PI assignment and persuade its ISP to install specific routes for 235 the enterprise's PI addresses. There are a number of practical 236 issues with this approach, especially if there is a desire to route 237 to a set of geographically and topologically diverse sites, 238 which can sometimes involve coordinating with several ISPs to route 239 portions of a single PI prefix. These problems have caused numerous 240 enterprises with plenty of IPv4 swamp space to choose to use IPv4 NAT 241 for part, or substantially all, of their internal network instead of 242 using their provider independent address space. 244 1.2. NPTv6 Applicability 246 NPTv6 provides a simple and compelling solution to meet the Address 247 Independence requirement in IPv6. The address independence benefit 248 stems directly from the translation function of the network prefix 249 translator. To avoid as many of the issues associated with NAPT44 as 250 possible, NPTv6 is defined to include a two-way, checksum-neutral, 251 algorithmic translation function, and nothing else. 253 The fact that NPTv6 does not map ports and is checksum-neutral avoids 254 the need for an NPTv6 Translator to re-write transport layer headers. 255 This makes it feasible to deploy new or improved transport layer 256 protocols without upgrading NPTv6 Translators. Similarly, since 257 NPTv6 does not re-write transport layer headers, NPTv6 will not 258 interfere with encryption of the full IP payload in many cases. 260 The default NPTv6 address mapping mechanism is purely algorithmic, so 261 NPTv6 translators do not need to maintain per-node or per-connection 262 state, allowing deployment of more robust and adaptive networks than 263 can be deployed using NAPT44.
Since the default NPTv6 mapping can be 264 performed in either direction, it does not interfere with inbound 265 connection establishment, thus allowing internal nodes to participate 266 in direct Peer-to-Peer applications without the application layer 267 overhead one finds in many IPv4 Peer-to-Peer applications. 269 Although NPTv6 compares favorably to NAPT44 in several ways, it does 270 not eliminate all of the architectural problems associated with IPv4 271 NAT, as described in [RFC2993]. NPTv6 involves modifying IP headers 272 in transit, so it is not compatible with security mechanisms, such as 273 the IPsec Authentication Header, that provide integrity protection 274 for the IP header. NPTv6 may interfere with the use of application 275 protocols that transmit IP addresses in the application-specific 276 portion of the IP datagram. These applications currently require 277 application layer gateways (ALGs) to work correctly through NAPT44 278 devices, and similar ALGs may be required for these applications to 279 work through NPTv6 Translators. The use of separate internal and 280 external prefixes creates complexity for DNS deployment, due to the 281 desire for internal nodes to communicate with other internal nodes 282 using internal addresses, while external nodes need to obtain 283 external addresses to communicate with the same nodes. This 284 frequently results in the deployment of "split DNS", which may add 285 complexity to network configuration. 287 The choice of address within the edge network bears consideration. 288 One could use a ULA, which maximizes address independence. That 289 could also be considered a misuse of the ULA; if the expectation is 290 that a ULA prevents access to a system from outside the range of the 291 ULA, NPTv6 overrides that. 
On the other hand, the administration is 292 aware that it has made that choice, and could, if it desired, deploy a 293 second ULA for the purpose of privacy; the only prefix that will be 294 translated is one that has an NPTv6 Translator configured to 295 translate to or from it. Also, using any other global-scope address 296 format requires one either to obtain a PI prefix or to be at the mercy of the 297 agency from which the prefix was allocated. 299 There are significant technical impacts associated with the 300 deployment of any prefix translation mechanism, including NPTv6, and 301 we strongly encourage anyone who is considering the implementation or 302 deployment of NPTv6 to read [RFC4864] and [RFC5902], and to carefully 303 consider the alternatives described in those documents, some of which 304 may cause fewer problems than NPTv6. 306 2. NPTv6 Overview 308 NPTv6 may be implemented in an IPv6 router to map one IPv6 address 309 prefix to another IPv6 prefix as each IPv6 datagram transits the 310 router. A router that implements an NPTv6 prefix translation 311 function is referred to as an NPTv6 Translator. 313 2.1. NPTv6: the simplest case 315 In its simplest form, an NPTv6 Translator interconnects two network 316 links, one of which is an "internal" network link attached to a leaf 317 network within a single administrative domain, and the other of which 318 is an "external" network with connectivity to the global Internet. 319 All of the hosts on the internal network will use addresses from a 320 single, locally-routed prefix, and those addresses will be translated 321 to/from addresses in a globally-routable prefix as IP datagrams 322 transit the NPTv6 Translator. The lengths of these two prefixes will 323 be functionally the same; if they differ, the longer of the two will 324 limit the ability to use subnets in the shorter.
326 External Network: Prefix = 2001:0DB8:0001:/48 327 -------------------------------------- 328 | 329 | 330 +-------------+ 331 | NPTv6 | 332 | Translator | 333 +-------------+ 334 | 335 | 336 -------------------------------------- 337 Internal Network: Prefix = FD01:0203:0405:/48 339 Figure 1: A simple translator 341 Figure 1 shows an NPTv6 Translator attached to two networks. In this 342 example, the internal network uses IPv6 Unique Local Addresses (ULAs) 343 [RFC4193] to represent the internal IPv6 nodes, and the external 344 network uses globally routable IPv6 addresses to represent the same 345 nodes. 347 When an NPTv6 Translator forwards datagrams in the "outbound" 348 direction, from the internal network to the external network, NPTv6 349 overwrites the IPv6 source prefix (in the IPv6 header) with a 350 corresponding external prefix. When datagrams are forwarded in the 351 "inbound" direction, from the external network to the internal 352 network, the IPv6 destination prefix is overwritten with a 353 corresponding internal prefix. Using the prefixes shown in the 354 diagram above, as an IP datagram passes through the NPTv6 Translator 355 in the outbound direction, the source prefix (FD01:0203:0405:/48) 356 will be overwritten with the external prefix (2001:0DB8:0001:/48). 357 In an inbound datagram, the destination prefix (2001:0DB8:0001:/48) 358 will be overwritten with the internal prefix (FD01:0203:0405:/48). 359 In both cases, it is the local IPv6 prefix that is overwritten; the 360 remote IPv6 prefix remains unchanged. Nodes on the internal network 361 are said to be "behind" the NPTv6 Translator. 363 2.2. NPTv6 between peer networks 365 NPTv6 can also be used between two private networks. In these cases, 366 both networks may use ULA prefixes, with each subnet in one network 367 mapped into a corresponding subnet in the other network, and vice 368 versa. 
Or, each network may use ULA prefixes for internal 369 addressing, and global unicast addresses on the other network. 371 Internal Prefix = FD01:4444:5555:/48 372 -------------------------------------- 373 V | External Prefix 374 V | 2001:0DB8:6666:/48 375 V +---------+ ^ 376 V | NPTv6 | ^ 377 V | Device | ^ 378 V +---------+ ^ 379 External Prefix | ^ 380 2001:0DB8:0001:/48 | ^ 381 -------------------------------------- 382 Internal Prefix = FD01:0203:0405:/48 384 Figure 2: Flow of Information in Translation 386 2.3. NPTv6 redundancy and load-sharing 388 In some cases, more than one NPTv6 Translator may be attached to a 389 network, as shown in Figure 3. In such cases, NPTv6 Translators are 390 configured with the same internal and external prefixes. Since there 391 is only one translation, even though there are multiple translators, 392 they map only one external address (prefix and IID) to the internal 393 address. 395 External Network: Prefix = 2001:0DB8:0001:/48 396 -------------------------------------- 397 | | 398 | | 399 +-------------+ +-------------+ 400 | NPTv6 | | NPTv6 | 401 | Translator | | Translator | 402 | #1 | | #2 | 403 +-------------+ +-------------+ 404 | | 405 | | 406 -------------------------------------- 407 Internal Network: Prefix = FD01:0203:0405:/48 409 Figure 3: Parallel Translators 411 2.4. 
NPTv6 multihoming 413 External Network #1: External Network #2: 414 Prefix = 2001:0DB8:0001:/48 Prefix = 2001:0DB8:5555:/48 415 --------------------------- -------------------------- 416 | | 417 | | 418 +-------------+ +-------------+ 419 | NPTv6 | | NPTv6 | 420 | Translator | | Translator | 421 | #1 | | #2 | 422 +-------------+ +-------------+ 423 | | 424 | | 425 -------------------------------------- 426 Internal Network: Prefix = FD01:0203:0405:/48 428 Figure 4: Parallel Translators with different upstream networks 430 When multihoming, NPTv6 Translators are attached to an internal 431 network, as shown in Figure 4, but connected to different external 432 networks. In such cases, NPTv6 Translators are configured with the 433 same internal prefix, but different external prefixes. Since there 434 are multiple translations, they map multiple external addresses 435 (prefix and IID) to the common internal address. A system within the 436 edge network is unable to determine which external address it is 437 using apart from services such as STUN [RFC5389]. 439 Multihoming in this sense has one negative feature as compared with 440 multihoming with a provider independent address; when routes change 441 between NPTv6 Translators, since the upstream network changes, the 442 translated prefix can change. This would cause sessions and 443 referrals dependent on it to fail as well. This is not expected to 444 be a major issue, however, in networks where routing is generally 445 stable. 447 2.5. Mapping with No Per-Flow State 449 When NPTv6 is used as described in this document, no per-node or per- 450 flow state is maintained in the NPTv6 Translator. Both inbound and 451 outbound datagrams are translated algorithmically, using only 452 information found in the IPv6 header. 
Due to this property, NPTv6's 453 two-way, algorithmic address mapping can support both outbound and 454 inbound connection establishment without the need for state-priming 455 or rendezvous mechanisms, or the maintenance of mapping state. This 456 is a significant improvement over NAPT44 devices, but it also has 457 significant security implications which are described in Section 7. 459 2.6. Checksum-Neutral Mapping 461 When a change is made to one of the IP header fields in the IPv6 462 pseudo-header checksum (such as one of the IP addresses), the 463 checksum field in the transport layer header may become invalid. 464 Fortunately, an incremental change in the area covered by the 465 Internet standard checksum [RFC1071] will result in a well-defined 466 change to the checksum value [RFC1624]. So, a checksum change caused 467 by modifying part of the area covered by the checksum can be 468 corrected by making a complementary change to a different 16-bit 469 field covered by the same checksum. 471 The NPTv6 mapping mechanisms described in this document are checksum- 472 neutral, which means that they result in IP headers that will 473 generate the same IPv6 pseudo-header checksum when the checksum is 474 calculated using the standard Internet checksum algorithm [RFC1071]. 475 Any changes that are made during translation of the IPv6 prefix are 476 offset by changes to other parts of the IPv6 address. This results 477 in transport layers that use the Internet checksum (such as TCP and 478 UDP) calculating the same IPv6 pseudo header checksum for both the 479 internal and external forms of the same datagram, which avoids the 480 need for the NPTv6 Translator to modify those transport layer headers 481 to correct the checksum value. 483 The outgoing checksum correction is achieved by making a change to a 484 16 bit section of the source address that is not used for routing in 485 the external network. 
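As an informal check of the checksum-neutral property described in this section (a sketch only, not part of the specification), the following Python sums the eight 16-bit words of an address using one's complement arithmetic and compares the internal and external forms of the address pair used in the example of Section 3.6. The helper names address_words and ones_complement_sum are hypothetical, and only the Python standard library is assumed.

```python
import ipaddress

def address_words(addr):
    """Split an IPv6 address into its eight 16-bit words (network order)."""
    packed = ipaddress.IPv6Address(addr).packed
    return [int.from_bytes(packed[i:i + 2], "big") for i in range(0, 16, 2)]

def ones_complement_sum(words):
    """Add 16-bit words, folding any carry back in (end-around carry)."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)
    return total

# Address pair taken from the worked example in Section 3.6.
internal = "FD01:0203:0405:0001::1234"
external = "2001:0DB8:0001:D550::1234"

internal_sum = ones_complement_sum(address_words(internal))
external_sum = ones_complement_sum(address_words(external))

# If the mapping is checksum-neutral, both forms contribute the same
# amount to the pseudo-header sum.
print(hex(internal_sum), hex(external_sum))
```

Because the two sums agree, a transport checksum computed over the pseudo-header is the same for either form of the address, so the transport header can be left untouched.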
Due to the nature of checksum arithmetic, when 486 the corresponding inbound correction is applied to the same bits of 487 the destination address of the inbound packet, the destination address (DA) is returned to the 488 correct internal value. 490 As noted in Section 4.2, this mapping leaves an edge network 491 using a /48 external prefix unable to use subnet 0xFFFF. 493 3. NPTv6 Algorithmic Specification 495 The IPv6 address format defined in [RFC4291] is reproduced for clarity in Figure 5. 497 0 15 16 31 32 47 48 63 64 79 80 95 96 111 112 127 498 +-------+-------+-------+-------+-------+-------+-------+-------+ 499 | Routing Prefix | Subnet| Interface Identifier (IID) | 500 +-------+-------+-------+-------+-------+-------+-------+-------+ 502 Figure 5: Enumeration of the IPv6 Address [RFC4291] 504 3.1. NPTv6 configuration calculations 506 When an NPTv6 Translation function is configured, it is configured 507 with 509 o one or more "internal" interfaces with their "internal" routing 510 domain prefixes, and 512 o one or more "external" interfaces with their "external" routing 513 domain prefixes. 515 In the simple case, there is one of each. If a single router 516 provides NPTv6 translation services between a multiplicity of domains 517 (as might be true when multihoming), each internal/external pair must 518 be thought of as a separate NPTv6 Translator from the perspective of 519 this specification. 521 When an NPTv6 Translator is configured, the translation function 522 first ensures that the internal and external prefixes are the same 523 length, if necessary by extending the shorter of the two with zeroes. 524 These two prefixes will be used in the prefix translation function 525 described in Section 3.2 and Section 3.3. 527 They are then zero-extended to /64, for the purposes of 528 calculation. The translation function calculates the one's complement 529 sum of the 16-bit words of the /64 external prefix and the /64 530 internal prefix.
The translation function then calculates the difference between these 531 sums: internal minus external. This value, called the 532 "adjustment", is effectively constant for the lifetime of the NPTv6 533 Translator configuration, and is used in per-datagram processing. 535 3.2. NPTv6 translation, internal network to external network 537 When a datagram passes through the NPTv6 Translator from an internal 538 to an external network, it is either discarded or has its IPv6 Source Address changed in two 539 ways: 541 o If the internal subnet number has no mapping, such as being 0xFFFF 542 or simply not mapped, discard the datagram. This SHOULD result in 543 an ICMPv6 Destination Unreachable message. 545 o The internal prefix is overwritten with the external prefix, in 546 effect subtracting the difference between the two checksums (the 547 adjustment) from the pseudo-header's checksum, and 549 o A 16-bit word of the address has the adjustment added to it using 550 one's complement arithmetic. If the result is 0xFFFF, it is 551 overwritten as zero. The choice of word is as specified in 552 Section 3.4 or Section 3.5 as appropriate. 554 3.3. NPTv6 translation, external network to internal network 556 When a datagram passes through the NPTv6 Translator from an external 557 to an internal network, its IPv6 Destination Address is changed in 558 two ways: 560 o The external prefix is overwritten with the internal prefix, in 561 effect adding the difference between the two checksums (the 562 adjustment) to the pseudo-header's checksum, and 564 o A 16-bit word of the address has the adjustment subtracted from it 565 (bitwise inverted and added to it) using one's complement 566 arithmetic. If the result is 0xFFFF, it is overwritten as zero. 567 The choice of word is as specified in Section 3.4 or Section 3.5 568 as appropriate. 570 3.4.
NPTv6 with a /48 or shorter prefix 572 When an NPTv6 Translator is configured with internal and external 573 prefixes that are 48 bits in length (a /48) or shorter, the 574 adjustment MUST be added to or subtracted from bits 48..63 of the 575 address. 577 This mapping results in no modification of the Interface Identifier 578 (IID), which is held in the lower half of the IPv6 address, so it 579 will not interfere with future protocols that may use unique IIDs for 580 node identification. 582 NPTv6 Translator implementations MUST implement the /48 mapping. 584 3.5. NPTv6 with a /49 or longer prefix 586 When an NPTv6 Translator is configured with internal and external 587 prefixes that are longer than 48 bits in length (such as a /52, /56, 588 or /60), the adjustment must be added to or subtracted from one of 589 the words in bits 64..79, 80..95, 96..111, or 112..127 of the 590 address. While the choice of word is immaterial as long as it is 591 consistent, for consistency's sake, these words MUST be inspected in 592 that sequence, and the first that is not initially 0xFFFF chosen. 594 NPTv6 Translator implementations SHOULD implement the mapping for 595 longer prefixes. 597 3.6. /48 Prefix Mapping Example 599 For the network shown in Figure 1, the Internal Prefix is FD01:0203: 600 0405:/48, and the External Prefix is 2001:0DB8:0001:/48. 602 If a node with internal address FD01:0203:0405:0001::1234 sends an 603 outbound datagram through the NPTv6 Translator, the resulting 604 external address will be 2001:0DB8:0001:D550::1234. The resulting 605 address is obtained by calculating the checksum of both the internal 606 and external 48-bit prefixes, subtracting the internal prefix from 607 the external prefix using one's complement arithmetic to calculate 608 the "adjustment", and adding the adjustment to the 16-bit subnet 609 field (in this case 0x0001). 611 To show the work: 613 The one's complement checksum of FD01:0203:0405 is 0xFCF5. 
   The one's
   complement checksum of 2001:0DB8:0001 is 0xD245.  Using one's
   complement arithmetic, 0xD245 - 0xFCF5 = 0xD54F.  The subnet in the
   original datagram is 0x0001.  Using one's complement arithmetic,
   0x0001 + 0xD54F = 0xD550.  Since 0xD550 != 0xFFFF, it is not changed
   to 0x0000.

   So, the value 0xD550 is written in the 16-bit subnet area, resulting
   in a mapped external address of 2001:0DB8:0001:D550::1234.

   When a response datagram is received, it will contain the destination
   address 2001:0DB8:0001:D550::1234, which will be mapped using the
   inverse mapping algorithm, back to FD01:0203:0405:0001::1234.

   In this case, the difference between the two prefixes will be
   calculated as follows:

   Using one's complement arithmetic, 0xFCF5 - 0xD245 = 0x2AB0.  The
   subnet in the original datagram = 0xD550.  Using one's complement
   arithmetic, 0xD550 + 0x2AB0 = 0x0001.  Since 0x0001 != 0xFFFF, it is
   not changed to 0x0000.

   So the value 0x0001 is written into the subnet field, and the
   internal value of the subnet field is properly restored.

3.7.  Address Mapping for Longer Prefixes

   If the prefix being mapped is longer than 48 bits, the algorithm is
   slightly more complex.  A common case will be that the internal and
   external prefixes are of different length.  In such a case, the
   shorter prefix is zero-extended to the length of the longer as
   described in Section 3.1 for the purposes of overwriting the prefix.
   Then, they are both zero-extended to 64 bits to facilitate one's
   complement arithmetic.  The "adjustment" is calculated using those 64
   bit prefixes.

   For example if the internal prefix is a /48 ULA and the external
   prefix is a /56 provider-allocated prefix, the ULA becomes a /56 with
   zeros in bits 48..55.  For purposes of one's complement arithmetic,
   they are then both zero-extended to 64 bits.
   A side-effect of this
   is that a subset of the subnets possible in the shorter prefix are
   untranslatable.  While the security value of this is debatable, the
   administration may choose to use them for subnets that it knows need
   no external accessibility.

   We then find the first word in the IID that does not have the value
   0xFFFF, trying bits 64..79, and then 80..95, 96..111, and finally
   112..127.  We perform the same calculation (with the same proof of
   correctness) as in Section 3.6, but applying it to that word.

   Although any 16-bit portion of an IPv6 IID could contain 0xFFFF, an
   IID of all-ones is a reserved anycast identifier that should not be
   used on the network [RFC2526].  If an NPTv6 Translator discovers a
   datagram with an IID of all-zeros while performing address mapping,
   that datagram MUST be dropped, and an ICMPv6 Parameter Problem error
   SHOULD be generated [RFC4443].

   Note: this mechanism does involve modification of the IID; it may not
   be compatible with future mechanisms that use unique IIDs for node
   identification.

4.  Implications of Network Address Translator Behavioral Requirements
4.1.  Prefix configuration and generation

   NPTv6 Translators MUST support manual configuration of internal and
   external prefixes, and MUST NOT place any restrictions on those
   prefixes except that they be valid IPv6 unicast prefixes as described
   in [RFC4291].  They MAY also support random generation of ULA
   addresses on command.  Since the most common place anticipated for
   the implementation of an NPTv6 Translator is a CPE router, the reader
   is urged to consider the requirements of
   [I-D.ietf-v6ops-ipv6-cpe-router].

4.2.  Subnet numbering

   For reasons detailed in Appendix B, a network using NPTv6 Translation
   and a /48 external prefix MUST NOT use the value 0xFFFF to designate
   a subnet that it expects to be translated.

4.3.  NAT Behavioral Requirements

   NPTv6 Translators MUST support hairpinning behavior, as defined in
   the NAT Behavioral Requirements for UDP document [RFC4787].  This
   means that when an NPTv6 Translator receives a datagram on the
   internal interface that has a destination address that matches the
   site's external prefix, it will translate the datagram and forward it
   internally.  This allows internal nodes to reach other internal nodes
   using their external, global addresses when necessary.

   Conceptually, the datagram leaves the domain (is translated as
   described in Section 3.2), and returns (is again translated as
   described in Section 3.3).  As a result, the datagram exchange will
   be through the NPTv6 Translator in both directions for the lifetime
   of the session.  The alternative would be to require the NPTv6
   Translator to drop the datagram, forcing the sender to use the
   correct internal prefix for its peer.  Performing only the external-
   to-internal translation results in the datagram being sent from the
   untranslated internal address of the source to the translated and
   therefore internal address of its peer, which would enable the
   session to bypass the NPTv6 Translator for future datagrams.  It
   would also mean that the original sender would be unlikely to
   recognize the response when it arrived.

   Because NPTv6 does not perform port mapping and uses a one-to-one,
   reversible mapping algorithm, none of the other NAT behavioral
   requirements apply to NPTv6.

5.  Implications for Applications

   NPTv6 Translation does not create several of the problems known to
   exist with other kinds of NATs and discussed in [RFC2993].
   In
   particular: NPTv6 Translation is stateless, so a "reset" or brief
   outage of an NPTv6 Translator does not break connections that
   traverse the translation function, and if multiple NPTv6 Translators
   exist between the same two networks, load can shift or be dynamically
   load-shared among them.  Also, an NPTv6 Translator does not aggregate
   traffic for several hosts/interfaces behind a lesser number of
   external addresses, so there is no inherent expectation for an NPTv6
   Translator to block new inbound flows from external hosts, and no
   issue with a filter or blacklist associated with one prefix within
   the domain affecting another.  A firewall can of course be used in
   conjunction with an NPTv6 Translator; this would allow the network
   administrator more flexibility to specify security policy than would
   be possible with a traditional NAT.

   However, NPTv6 Translation does create difficulties for some kinds of
   applications.  Some examples include:

   o  An application instance "behind" an NPTv6 Translator will see a
      different address for its connections than its peers "outside" the
      NPTv6 Translator.

   o  An application instance "outside" an NPTv6 Translator will see a
      different address for its connections than any peer "inside" an
      NPTv6 Translator.

   o  An application instance wishing to establish communication with a
      peer "behind" an NPTv6 Translator may need to use a different
      address to reach that peer depending on whether the instance is
      behind the same NPTv6 Translator or external to it.  Since an
      NPTv6 Translator implements hairpinning (Section 4.3), it suffices
      for applications to always use their external addresses.  However,
      this creates inefficiencies in the local network and may also
      complicate implementation of the NPTv6 Translator.  [RFC3484] also
      would prefer the private address in such a case in order to reduce
      those inefficiencies.

   o  An application instance which moves from a realm "behind" an NPTv6
      Translator to a realm that is "outside" the network, or vice
      versa, may find that it is no longer able to reach its peers at
      the same addresses it was previously able to use.

   o  An application instance which is intermittently communicating with
      a peer that moves from behind an NPTv6 Translator to "outside" of
      it, or vice versa, may find that it is no longer able to reach
      that peer at the same address that it had previously used.

   Many, but not all, of the applications which are adversely affected
   by NPTv6 Translation are those that do "referrals" - where an
   application instance passes its own addresses, and/or addresses of
   its peers, to other peers.  (Some believe referrals are inherently
   undesirable; others believe that they are necessary in some
   circumstances.  A discussion of the merits of referrals, or lack
   thereof, is beyond the scope of this document.)

   To some extent, the incidence of these difficulties can be reduced by
   DNS hacks that attempt to expose addresses "behind" an NPTv6
   Translator only to hosts which are also behind the same NPTv6
   Translator; and perhaps also, to expose only the "internal" addresses
   of hosts behind the NPTv6 Translator to other hosts behind the same
   NPTv6 Translator.  However, this cannot be a complete solution.
   A
   full discussion of these issues is out of scope for this document,
   but briefly: (a) reliance on DNS to solve this problem depends on
   hosts always making queries from DNS servers in the same realm as
   they are (or on DNS interception proxies, which create their own
   problems), and on mobile hosts/applications not caching those
   results; (b) reliance on DNS to solve this problem depends on network
   administrators on all networks using such applications to reliably
   and accurately maintain current DNS entries for every host using
   those applications; and (c) reliance on DNS to solve this problem
   depends on applications always using DNS names, even though they
   often must run in environments where DNS names are not reliably
   maintained for every host.  Other issues are that there is often no
   single distinguished name for a host, no reliable way for a host to
   determine what DNS names are associated with it, and which names are
   appropriate to use in which contexts.

5.1.  Recommendation for network planners considering use of NPTv6
      Translation

   In light of the above, network planners considering the use of NPTv6
   translation should carefully consider the kinds of applications that
   they will need to run in the future, and determine whether the
   address stability and provider independence benefits are consistent
   with their application requirements.

5.2.  Recommendations for application writers

   Several mechanisms (e.g., STUN [RFC5389], TURN [RFC5766], ICE
   [RFC5245]) have been used with traditional IPv4 NAT to circumvent
   some of the limitations of such devices.  Similar mechanisms could
   also be applied to circumvent some of the issues with NPTv6
   Translators.  However, all of these require the assistance of an
   external server or a function co-located with the translator that can
   tell an "internal" host what its "external" addresses are.

5.3.  Recommendation for future work

   It might be desirable to define a general mechanism which would allow
   hosts within a translation domain to determine their external
   addresses and/or request that inbound traffic be permitted.  If such
   a mechanism were to be defined, it would ideally be general enough to
   also accommodate other types of NAT likely to be encountered by IPv6
   applications - in particular, IPv4/IPv6 Translation
   [I-D.ietf-behave-v6v4-framework] [I-D.ietf-behave-dns64]
   [I-D.ietf-behave-v6v4-xlate] [I-D.ietf-behave-v6v4-xlate-stateful]
   [RFC6052].  For this and other reasons, such a mechanism is beyond
   the scope of this document.

6.  A Note on Port Mapping

   In addition to overwriting IP addresses when datagrams are forwarded,
   NAPT44 devices overwrite the source port number in outbound traffic,
   and the destination port number in inbound traffic.  This mechanism
   is called "port mapping".

   The major benefit of port mapping is that it allows multiple
   computers to share a single IPv4 address.  A large number of internal
   IPv4 addresses (typically from one of the [RFC1918] private address
   spaces) can be mapped into a single external, globally routable IPv4
   address, with the local port number used to identify which internal
   node should receive each inbound datagram.  This address
   amplification feature is not generally foreseen as a necessity at
   this time.

   Since port mapping requires re-writing a portion of the transport
   layer header, it requires NAPT44 devices to be aware of all of the
   transport protocols that they forward, thus stifling the development
   of new and improved transport protocols and preventing the use of
   IPsec encryption.
   Modifying the transport layer header is
   incompatible with security mechanisms that encrypt the full IP
   payload, and restricts the NAPT44 to forwarding transport layers that
   use weak checksum algorithms that are easily recalculated in routers.

   Since there is significant detriment caused by modifying transport
   layer headers and very little, if any, benefit to the use of port
   mapping in IPv6, NPTv6 Translators that comply with this
   specification MUST NOT perform port mapping.

7.  Security Considerations

   When NPTv6 is deployed using either of the two-way, algorithmic
   mappings defined in the document, it allows direct inbound
   connections to internal nodes.  While this can be viewed as a benefit
   of NPTv6 vs. NAPT44, it does open internal nodes to attacks that
   would be more difficult in a NAPT44 network.  Although this situation
   is not substantially worse, from a security standpoint, than running
   IPv6 with no NAT, some enterprises may assume that an NPTv6
   Translator will offer similar protection to a NAPT44 device.

   The port mapping mechanism in NAPT44 implementations requires that
   state be created in both directions.  This has led to an industry-
   wide perception that NAT functionality is the same as a stateful
   firewall.  It is not.  The translation function of the NAT only
   creates dynamic state in one direction and has no policy.  For this
   reason, it is RECOMMENDED that NPTv6 Translators also implement
   firewall functionality such as described in [RFC6092], with
   appropriate configuration options including turning it on or off.

   When [RFC4864] talks about randomizing the subnet identifier, the
   idea is to make it harder for worms to guess a valid subnet
   identifier at an advertised network prefix.  This should not be
   interpreted as endorsing concealing the subnet identifier behind the
   obfuscating function of a translator such as NPTv6.
   [RFC4864]
   specifically talks about how to obtain the desired properties of
   concealment without using a translator.  Topology hiding when using
   NAT is often ineffective in environments where the topology is
   visible in application layer messaging protocols such as DNS, SIP,
   SMTP, etc.  If the information were not available through the
   application layer, [RFC2993] would not be valid.

   Due to the potential interactions with IKEv2/IPsec NAT traversal, it
   would be valuable to test interactions of NPTv6 with various aspects
   of current-day IKEv2/IPsec NAT traversal.

8.  IANA Considerations

   This document has no IANA considerations.

9.  Acknowledgements

   The checksum-neutral algorithmic address mapping described in this
   document is based on e-mail written by Iljtsch van Beijnum.

   The following people provided advice or review comments that
   substantially improved this document: Allison Mankin, Christian
   Huitema, Dave Thaler, Ed Jankiewicz, Eric Kline, Iljtsch van Beijnum,
   Jari Arkko, Keith Moore, Mark Townsley, Merike Kaeo, Ralph Droms,
   Remi Despres, Steve Blake, and Tony Hain.

   This document was written using the xml2rfc tool described in RFC
   2629 [RFC2629].

10.  Change Log

   This section should be removed by the RFC Editor.

10.1.  Changes Between draft-mrw-behave-nat66-00 and -01

   There were several minor changes made between the *behave-nat66-00
   and -01 versions of this draft:

   o  Added Fred Baker as a co-author.

   o  Minor arithmetic corrections.

   o  Added AH to paragraph on NAT security issues.

   o  Added additional NAT topologies to overview (diagrams TBD).

10.2.  Changes between *behave-nat66-01 and -02

   There were further changes made between *behave-nat66-01 and -02:

   o  Removed topology hiding mechanism.

   o  Added diagrams.

   o  Made minor updates based on mailing list feedback.

   o  Added discussion of IPv6 SAF document.

   o  Added applicability section.

   o  Added discussion of Address Independence requirement.

   o  Added hairpinning requirement and discussion of applicability of
      other NAT behavioral requirements.

10.3.  Changes between *nat66-00 and *nat66-01

   There were further changes made between nat66-00 and nat66-01:

   o  Added mapping for prefixes longer than /48.

   o  Change draft name to remove reference to the behave WG.

   o  Resolved various open issues and fixed typos.

10.4.  Changes between *nat66-01 and *nat66-02

   o  Change the acronym "NAT66" to "NPTv6", so people don't read "NAT"
      and MEGO.

   o  Change the term used to refer to the function from "NAT66 device"
      to "NPTv6 Translator".  It's not a "device" function, it's a
      function that is applied between two interfaces.  Consider a
      router with two upstreams and two legs in the local network; it
      will not translate between the local legs, but will translate to
      and from each upstream, and be configured differently for each of
      the two ISPs.

   o  Comment specifically on the security aspects.

   o  Comment specifically on the application issues raised on this
      list.

   o  Comment specifically on multihoming, load-sharing, and asymmetric
      routing.

   o  Spell out the hairpinning requirement and its implications.

   o  Spell out the service provider side of Address Independence.

   o  00 focuses on the edge's view

   o  Detail the algorithm in a manner clearer to the implementor (I
      think)

   o  Spell out the case for GSE-style DMZs between the edge and the
      transit network, which is about the implications for the global
      routing table.

   o  Refer to [RFC6092] as a CPE firewall description.

10.5.  Changes between *nat66-02 and *nat66-03

   o  Added an appendix on Verification code

   o  Various minor markups in response to Ralph Droms

10.6.  Changes between *nat66-03 and *nat66-04

   o  Markups in response to Christian Huitema, mostly surrounding the
      issue of subnet 0xFFFF.

   o  Refer to [I-D.ietf-v6ops-ipv6-cpe-router] for CPE router
      requirements.

10.7.  Changes between *nat66-04 and *nat66-05

   o  Update statistics in appendix A per BGP report of 17 December 2010

   o  Update security considerations using text supplied by Merike Kaeo.

10.8.  Changes between *nat66-05 and *nat66-06

   o  restore a code snippet inadvertently removed in version -05

10.9.  Changes between *nat66-06 and *nat66-07

   o  Changed requested status to experimental

   o  Incorporated comments from Eric Kline

10.10.  Changes between *nat66-07 and *nat66-08

   The section on Application Considerations was expanded after
   discussion with Keith Moore.

10.11.  Changes up to *nat66-10

   Address review comments during IETF Last Call and the Transport
   Directorate Review.

10.12.  Changes up to *nat66-11 and -12

   Address Dave Thaler's comments, mostly editorial, but also addressing
   UNSAF protocols like the TCP Authentication Option.

10.13.  Changes for *nat66-13

   o  Inserted a sentence to make Jari happy.

   o  Inserted a paragraph suggested by Stewart Bryant.

   o  normalized the terms "packet" and "datagram", for consistency.

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2526]  Johnson, D. and S. Deering, "Reserved IPv6 Subnet Anycast
              Addresses", RFC 2526, March 1999.

   [RFC4193]  Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast
              Addresses", RFC 4193, October 2005.

   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
              Architecture", RFC 4291, February 2006.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
              Message Protocol (ICMPv6) for the Internet Protocol
              Version 6 (IPv6) Specification", RFC 4443, March 2006.

   [RFC4787]  Audet, F. and C.
              Jennings, "Network Address Translation
              (NAT) Behavioral Requirements for Unicast UDP", BCP 127,
              RFC 4787, January 2007.

11.2.  Informative References

   [GSE]      O'Dell, M., "GSE - An Alternate Addressing Architecture
              for IPv6", February 1997.

   [I-D.ietf-behave-dns64]
              Bagnulo, M., Sullivan, A., Matthews, P., and I. Beijnum,
              "DNS64: DNS extensions for Network Address Translation
              from IPv6 Clients to IPv4 Servers",
              draft-ietf-behave-dns64-11 (work in progress),
              October 2010.

   [I-D.ietf-behave-v6v4-framework]
              Baker, F., Li, X., Bao, C., and K. Yin, "Framework for
              IPv4/IPv6 Translation",
              draft-ietf-behave-v6v4-framework-10 (work in progress),
              August 2010.

   [I-D.ietf-behave-v6v4-xlate]
              Li, X., Bao, C., and F. Baker, "IP/ICMP Translation
              Algorithm", draft-ietf-behave-v6v4-xlate-23 (work in
              progress), September 2010.

   [I-D.ietf-behave-v6v4-xlate-stateful]
              Bagnulo, M., Matthews, P., and I. Beijnum, "Stateful
              NAT64: Network Address and Protocol Translation from IPv6
              Clients to IPv4 Servers",
              draft-ietf-behave-v6v4-xlate-stateful-12 (work in
              progress), July 2010.

   [I-D.ietf-v6ops-ipv6-cpe-router]
              Singh, H., Beebee, W., Donley, C., Stark, B., and O.
              Troan, "Basic Requirements for IPv6 Customer Edge
              Routers", draft-ietf-v6ops-ipv6-cpe-router-09 (work in
              progress), December 2010.

   [NIST]     NIST, "Draft NIST Framework and Roadmap for Smart Grid
              Interoperability, Release 1.0", September 2009.

   [RFC1071]  Braden, R., Borman, D., Partridge, C., and W. Plummer,
              "Computing the Internet checksum", RFC 1071,
              September 1988.

   [RFC1624]  Rijsinghani, A., "Computation of the Internet Checksum via
              Incremental Update", RFC 1624, May 1994.

   [RFC1918]  Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and
              E. Lear, "Address Allocation for Private Internets",
              BCP 5, RFC 1918, February 1996.

   [RFC2629]  Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629,
              June 1999.

   [RFC2827]  Ferguson, P. and D. Senie, "Network Ingress Filtering:
              Defeating Denial of Service Attacks which employ IP Source
              Address Spoofing", BCP 38, RFC 2827, May 2000.

   [RFC2993]  Hain, T., "Architectural Implications of NAT", RFC 2993,
              November 2000.

   [RFC3424]  Daigle, L. and IAB, "IAB Considerations for UNilateral
              Self-Address Fixing (UNSAF) Across Network Address
              Translation", RFC 3424, November 2002.

   [RFC3484]  Draves, R., "Default Address Selection for Internet
              Protocol version 6 (IPv6)", RFC 3484, February 2003.

   [RFC4864]  Van de Velde, G., Hain, T., Droms, R., Carpenter, B., and
              E. Klein, "Local Network Protection for IPv6", RFC 4864,
              May 2007.

   [RFC5245]  Rosenberg, J., "Interactive Connectivity Establishment
              (ICE): A Protocol for Network Address Translator (NAT)
              Traversal for Offer/Answer Protocols", RFC 5245,
              April 2010.

   [RFC5389]  Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
              "Session Traversal Utilities for NAT (STUN)", RFC 5389,
              October 2008.

   [RFC5766]  Mahy, R., Matthews, P., and J. Rosenberg, "Traversal Using
              Relays around NAT (TURN): Relay Extensions to Session
              Traversal Utilities for NAT (STUN)", RFC 5766, April 2010.

   [RFC5902]  Thaler, D., Zhang, L., and G. Lebovitz, "IAB Thoughts on
              IPv6 Network Address Translation", RFC 5902, July 2010.

   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
              Authentication Option", RFC 5925, June 2010.

   [RFC5996]  Kaufman, C., Hoffman, P., Nir, Y., and P. Eronen,
              "Internet Key Exchange Protocol Version 2 (IKEv2)",
              RFC 5996, September 2010.

   [RFC6052]  Bao, C., Huitema, C., Bagnulo, M., Boucadair, M., and X.
              Li, "IPv6 Addressing of IPv4/IPv6 Translators", RFC 6052,
              October 2010.

   [RFC6092]  Woodyatt, J., "Recommended Simple Security Capabilities in
              Customer Premises Equipment (CPE) for Providing
              Residential IPv6 Internet Service", RFC 6092,
              January 2011.

Appendix A.  Why GSE?

   For the purpose of this discussion, let us over-simplify the
   Internet's structure by distinguishing between two broad classes of
   networks: transit and edge.  A "transit network", in this context, is
   a network that provides connectivity services to other networks.  Its
   AS number may show up in a non-final position in BGP AS paths, or in
   the case of mobile and residential broadband networks, it may offer
   network services to smaller networks that can't justify RIR
   membership.  An "edge network", in contrast, is any network that is
   not a transit network; it is the ultimate customer, and while it
   provides internal connectivity for its own use, it is in other
   respects a consumer of transit services.  In terms of routing, a
   network in the transit domain generally needs some way to make
   choices about how it routes to other networks; an edge network is
   generally quite satisfied with a simple default route.

   The [GSE] proposal, and as a result this proposal (which is similar
   to GSE in most respects and inspired by it), responds directly to
   current concerns in the RIR communities.  Edge networks are used to
   an environment in IPv4 in which their addressing is disjoint from
   that of their upstream transit networks; it is either provider
   independent, or a network prefix translator makes their external
   address distinct from their internal address, and they like the
   distinction.  In IPv6, there is a mantra that edge network addresses
   should be derived from their upstream, and if they have multiple
   upstreams, edge networks are expected to design their networks to use
   all of those prefixes equivalently.
   They see this as unnecessary and
   unwanted operational complexity, and are as a result pushing very
   hard in the RIR communities for provider independent addressing.

   Widespread use of provider independent addressing has a natural and
   perhaps unavoidable side-effect that is likely to be very expensive
   in the long term.  It means that the routing table will enumerate the
   networks at the edge of the transit domain, the edge networks, rather
   than enumerating the transit domain.  Per the BGP Update Report of 17
   December 2010, there are currently over 36,000 Autonomous Systems
   being advertised in BGP, of which over 15,000 advertise only one
   prefix.  There are in the neighborhood of 5000 AS's that show up in a
   non-final position in AS paths, and perhaps another 5000 networks
   whose AS numbers are terminal in more than one AS path.  In other
   words, we have prefixes for some 36,000 transit and edge networks in
   the route table now, many of which arguably need an Autonomous System
   number only for multihoming.  However, the vast majority of networks
   (2/3) having the tools necessary to multihome are not visibly doing
   so, and would be well served by any solution that gives them address
   independence without the overhead of RIR membership and BGP routing.

   Current growth estimates suggest that this number could easily be on
   the order of 10,000,000 within fifteen years.  Tens of thousands of
   entries in the route table is very survivable; while our protocols
   and computers will likely do quite well with tens of millions of
   routes, the heat produced and power consumed by those routers, and
   the inevitable impact on the cost of those routers, is not a good
   outcome.
   To avoid having a massive and unscalable route table, we
   need to find a way that is politically acceptable and returns us to
   enumerating the transit domain, not the edge.

   There have been a number of proposals.  As described, shim6 moves the
   complexity to the edge, and the edge is rebelling.  Geographic
   addressing in essence forces ISPs to "own" geographic territory from
   a routing perspective, as otherwise there is no clue in the address
   as to what network a datagram should be delivered to in order to
   reach it.  Metropolitan Addressing can imply regulatory authority,
   and even if it is implemented using internet exchange consortia,
   visits a great deal of complexity on the transit networks that
   directly serve the edge.  The one that is likely to be most
   acceptable is any proposal that enables an edge network to be
   operationally independent of its upstreams, with no obligation to
   renumber when it adds, drops, or changes ISPs, and with no additional
   burden placed either on the ISP or the edge network as a result.
   From an application perspective, an additional operational
   requirement in the words of Roadmap for the Smart Grid [NIST], is
   that

      "...the Network should enable an application in a particular
      domain to communicate with an application in any other domain in
      the information network, with proper management control over who
      and where applications can be interconnected."

   In other words, the structure of the network should allow for and
   enable appropriate access control, but the structure of the network
   should not inherently limit access.

   The GSE model, by statelessly translating the prefix between an edge
   network and its upstream transit network, accomplishes that with a
   minimum of fuss and bother.
   Stated in the simplest terms, it enables
   the edge network to behave as if it has a provider independent prefix
   from a multihoming and renumbering perspective without the overhead
   of RIR membership or maintaining BGP connectivity, and it enables the
   transit networks to aggressively aggregate what are from their
   perspective provider-allocated customer prefixes, to maintain a
   rational-sized routing table.

Appendix B.  Verification code

   This non-normative appendix is presented as a proof of concept.  It
   is in no sense optimized; for example, one's complement arithmetic is
   implemented in portable subroutines, where operational
   implementations might use one's complement arithmetic instructions
   through a pragma; such implementations probably need to explicitly
   force 0xFFFF to 0x0000, as the instruction will not.  The original
   purpose of the code was to verify whether or not it was necessary to
   suppress 0xFFFF by overwriting with zero, and whether predicted
   issues with subnet numbering were real.

   The point is to

   o  demonstrate that if one or the other representation of zero is not
      used in the word the checksum is updated in, the program maps
      inner and outer addresses in a manner that is, mathematically, 1:1
      and onto (each inner address maps to a unique outer address, and
      that outer address maps back to exactly the same inner address),
      and

   o  give guidance on the suppression of 0xFFFF checksums.

   In short, in one's complement arithmetic, x-x=0, but will take the
   negative representation of zero.  If 0xFFFF results are forced to the
   value 0x0000, as is recommended in [RFC1071], the word the checksum
   is adjusted in cannot be initially 0xFFFF, as on the return it will
   be forced to 0.
   If 0xFFFF results are not forced to the value 0x0000,
   as is recommended in [RFC1071], the word the checksum is adjusted in
   cannot be initially 0, as on the return it will be calculated as
   0+(~0) = 0xFFFF.  We chose to follow [RFC1071]'s recommendations,
   which implies a requirement to not use 0xFFFF as a subnet number in
   networks with a /48 external prefix.

   /*
    * Copyright (c) 2010 IETF Trust and the persons identified as
    * authors of the code.  All rights reserved.  Redistribution
    * and use in source and binary forms, with or without
    * modification, are permitted provided that the following
    * conditions are met:
    *
    * o Redistributions of source code must retain the above
    *   copyright notice, this list of conditions and the
    *   following disclaimer.
    *
    * o Redistributions in binary form must reproduce the above
    *   copyright notice, this list of conditions and the
    *   following disclaimer in the documentation and/or other
    *   materials provided with the distribution.
    *
    * o Neither the name of Internet Society, IETF or IETF Trust,
    *   nor the names of specific contributors, may be used to
    *   endorse or promote products derived from this software
    *   without specific prior written permission.
    *
    * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
    * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
    * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
    * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
    * DISCLAIMED.
    * IN NO EVENT SHALL THE COPYRIGHT OWNER OR
    * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
    * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
    * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
    * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
    * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
    * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
    * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
    */
   #include <stdio.h>
   #include <stdlib.h>   /* atoi() */
   #include <assert.h>
   /*
    * program to verify the NPTv6 algorithm
    *
    * argument:
    *    perform negative zero suppression: boolean
    *
    * method:
    *    We specify an internal and an external prefix.  The prefix
    *    length is presumed to be the common length of both, and for
    *    this test is a /48.  We perform the three algorithms
    *    specified.  The "datagram" address is in effect the source
    *    address internal->external and the destination address
    *    external->internal.
    */
   unsigned short inner_init[] = {
       0xFD01, 0x0203, 0x0405, 1, 2, 3, 4, 5};
   unsigned short outer_init[] = {
       0x2001, 0x0db8, 0x0001, 1, 2, 3, 4, 5};
   unsigned short inner[8];
   unsigned short datagram[8];
   unsigned char  checksum[65536] = {0};
   unsigned short outer[8];
   unsigned short adjustment;
   unsigned short suppress;

   /*
    * One's complement sum.
    * return number1 + number2
    */
   unsigned short
   add1(unsigned short number1, unsigned short number2)
   {
       unsigned int result;

       result = number1;
       result += number2;
       if (suppress) {
           /* fold the carry and force 0xFFFF (negative zero) to 0 */
           while (0xFFFF <= result) {
               result = result + 1 - 0x10000;
           }
       } else {
           /* fold the carry only; 0xFFFF is left as is */
           while (0xFFFF < result) {
               result = result + 1 - 0x10000;
           }
       }
       return (unsigned short)result;
   }

   /*
    * One's complement difference
    * return number1 - number2
    */
   unsigned short
   sub1(unsigned short number1, unsigned short number2)
   {
       /* subtracting is adding the one's complement (bitwise NOT) */
       return add1(number1, (unsigned short)~number2);
   }

   /*
    * return one's complement sum of an array of numbers
    */
   unsigned short
   sum1(unsigned short *numbers, int count)
   {
       unsigned int result;

       result = *numbers++;
       while (--count > 0) {
           result += *numbers++;
       }

       if (suppress) {
           while (0xFFFF <= result) {
               result = result + 1 - 0x10000;
           }
       } else {
           while (0xFFFF < result) {
               result = result + 1 - 0x10000;
           }
       }
       return (unsigned short)result;
   }

   /*
    * NPTv6 initialization: section 3.1 assuming section 3.4
    *
    * create the /48, a source address in internal format, and a
    * source address in external format.  calculate the adjustment
    * if one /48 is overwritten with the other.
    */
   void
   nptv6_initialization(unsigned short subnet)
   {
       int i;
       unsigned short inner48;
       unsigned short outer48;

       /* initialize the internal and external prefixes. */
       for (i = 0; i < 8; i++) {
           inner[i] = inner_init[i];
           outer[i] = outer_init[i];
       }
       inner[3] = subnet;
       outer[3] = subnet;

       /* calculate the checksum adjustment */
       inner48 = sum1(inner, 3);
       outer48 = sum1(outer, 3);
       adjustment = sub1(inner48, outer48);
   }

   /*
    * NPTv6 datagram from edge to transit: section 3.2 assuming
    * section 3.4
    *
    * overwrite the prefix in the source address with the outer
    * prefix, and adjust the checksum
    */
   void
   nptv6_inner_to_outer(void)
   {
       int i;

       /* let's get the source address into the datagram */
       for (i = 0; i < 8; i++) {
           datagram[i] = inner[i];
       }

       /* overwrite the prefix with the outer prefix */
       for (i = 0; i < 3; i++) {
           datagram[i] = outer[i];
       }

       /* adjust the checksum */
       datagram[3] = add1(datagram[3], adjustment);
   }

   /*
    * NPTv6 datagram from transit to edge: section 3.3 assuming
    * section 3.4
    *
    * overwrite the prefix in the destination address with the
    * inner prefix, and adjust the checksum
    */
   void
   nptv6_outer_to_inner(void)
   {
       int i;

       /* overwrite the prefix with the inner prefix */
       for (i = 0; i < 3; i++) {
           datagram[i] = inner[i];
       }

       /* adjust the checksum */
       datagram[3] = sub1(datagram[3], adjustment);
   }

   /*
    * main program
    */
   int
   main(int argc, char **argv)
   {
       unsigned subnet;
       int i;

       if (argc < 2) {
           fprintf(stderr, "usage: nptv6 suppression\n");
           return 1;
       }
       suppress = atoi(argv[1]);
       assert(suppress <= 1);

       for (subnet = 0; subnet < 0x10000; subnet++) {
           /* section 3.1: initialize the system */
           nptv6_initialization(subnet);

           /* section 3.2: take a datagram from inside to outside */
           nptv6_inner_to_outer();

           /* the resulting checksum value should be unique */
           if (checksum[subnet]) {
               printf("inner->outer duplicated checksum: "
                      "inner: %x:%x:%x:%x:%x:%x:%x:%x(%x) "
                      "calculated: %x:%x:%x:%x:%x:%x:%x:%x(%x)\n",
                      inner[0], inner[1], inner[2], inner[3],
                      inner[4], inner[5], inner[6], inner[7],
                      sum1(inner, 8), datagram[0], datagram[1],
                      datagram[2], datagram[3], datagram[4],
                      datagram[5], datagram[6], datagram[7],
                      sum1(datagram, 8));
           }

           checksum[subnet] = 1;

           /*
            * the resulting checksum should be the same as the inner
            * address's checksum
            */
           if (sum1(datagram, 8) != sum1(inner, 8)) {
               printf("inner->outer incorrect: "
                      "inner: %x:%x:%x:%x:%x:%x:%x:%x(%x) "
                      "calculated: %x:%x:%x:%x:%x:%x:%x:%x(%x)\n",
                      inner[0], inner[1], inner[2], inner[3],
                      inner[4], inner[5], inner[6], inner[7],
                      sum1(inner, 8),
                      datagram[0], datagram[1], datagram[2], datagram[3],
                      datagram[4], datagram[5], datagram[6], datagram[7],
                      sum1(datagram, 8));
           }

           /* section 3.3: take a datagram from outside to inside */
           nptv6_outer_to_inner();

           /*
            * the returning datagram should have the same checksum it
            * left with
            */
           if (sum1(datagram, 8) != sum1(inner, 8)) {
               printf("outer->inner checksum incorrect: "
                      "calculated: %x:%x:%x:%x:%x:%x:%x:%x(%x) "
                      "inner: %x:%x:%x:%x:%x:%x:%x:%x(%x)\n",
                      datagram[0], datagram[1], datagram[2], datagram[3],
                      datagram[4], datagram[5], datagram[6], datagram[7],
                      sum1(datagram, 8), inner[0], inner[1], inner[2],
                      inner[3], inner[4], inner[5], inner[6], inner[7],
                      sum1(inner, 8));
           }

           /*
            * and every 16-bit word should calculate back to the same
            * inner value
            */
           for (i = 0; i < 8; i++) {
               if (inner[i] != datagram[i]) {
                   printf("outer->inner different: "
                          "calculated: %x:%x:%x:%x:%x:%x:%x:%x "
                          "inner: %x:%x:%x:%x:%x:%x:%x:%x\n",
                          datagram[0], datagram[1], datagram[2],
                          datagram[3], datagram[4], datagram[5],
                          datagram[6], datagram[7],
                          inner[0], inner[1],
                          inner[2], inner[3], inner[4], inner[5],
                          inner[6], inner[7]);
                   break;
               }
           }
       }
       return 0;
   }

Authors' Addresses

   Margaret Wasserman
   Painless Security
   North Andover, MA  01845
   USA

   Phone: +1 781 405 7464
   Email: mrw@painless-security.com
   URI:   http://www.painless-security.com


   Fred Baker
   Cisco Systems
   Santa Barbara, California  93117
   USA

   Phone: +1-408-526-4257
   Email: fred@cisco.com