idnits 2.17.1 draft-mrw-nat66-12.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([RFC2119]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 1319 has weird spacing: '...d short inner...' == Line 1321 has weird spacing: '...d short outer...' == Line 1323 has weird spacing: '...d short inner...' == Line 1324 has weird spacing: '...d short packe...' == Line 1325 has weird spacing: '...ed char chec...' == (12 more instances...) -- The document date (March 14, 2011) is 4791 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. 
Checking references for intended status: Experimental ---------------------------------------------------------------------------- -- Looks like a reference, but probably isn't: '3' on line 1551 -- Looks like a reference, but probably isn't: '1' on line 1551 -- Looks like a reference, but probably isn't: '0' on line 1551 -- Looks like a reference, but probably isn't: '2' on line 1551 -- Looks like a reference, but probably isn't: '4' on line 1552 -- Looks like a reference, but probably isn't: '5' on line 1552 -- Looks like a reference, but probably isn't: '6' on line 1552 -- Looks like a reference, but probably isn't: '7' on line 1552 -- Obsolete informational reference (is this intentional?): RFC 2629 (Obsoleted by RFC 7749) -- Obsolete informational reference (is this intentional?): RFC 3484 (Obsoleted by RFC 6724) -- Obsolete informational reference (is this intentional?): RFC 5996 (Obsoleted by RFC 7296) Summary: 1 error (**), 0 flaws (~~), 7 warnings (==), 13 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group M. Wasserman 3 Internet-Draft Painless Security 4 Intended status: Experimental F. Baker 5 Expires: September 15, 2011 Cisco Systems 6 March 14, 2011 8 IPv6-to-IPv6 Network Prefix Translation 9 draft-mrw-nat66-12 11 Abstract 13 This document describes a stateless, transport-agnostic IPv6-to-IPv6 14 Network Prefix Translation (NPTv6) function that provides the address 15 independence benefit associated with IPv4-to-IPv4 NAT (NAPT44), and 16 in addition provides a 1:1 relationship between addresses in the 17 "inside" and "outside" prefixes, preserving end to end reachability 18 at the network layer. 
20 Requirements Terminology 22 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 23 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 24 document are to be interpreted as described in RFC 2119 [RFC2119]. 26 Status of this Memo 28 This Internet-Draft is submitted in full conformance with the 29 provisions of BCP 78 and BCP 79. 31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF). Note that other groups may also distribute 33 working documents as Internet-Drafts. The list of current 34 Internet-Drafts is at http://datatracker.ietf.org/drafts/current/. 36 Internet-Drafts are draft documents valid for a maximum of six months 37 and may be updated, replaced, or obsoleted by other documents at any 38 time. It is inappropriate to use Internet-Drafts as reference 39 material or to cite them other than as "work in progress." 41 This Internet-Draft will expire on September 15, 2011. 43 Copyright Notice 45 Copyright (c) 2011 IETF Trust and the persons identified as the 46 document authors. All rights reserved. 48 This document is subject to BCP 78 and the IETF Trust's Legal 49 Provisions Relating to IETF Documents 50 (http://trustee.ietf.org/license-info) in effect on the date of 51 publication of this document. Please review these documents 52 carefully, as they describe your rights and restrictions with respect 53 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License.
58 Table of Contents
60 1. Introduction
61   1.1. What is Address Independence?
62   1.2. NPTv6 Applicability
63 2. NPTv6 Overview
64   2.1. NPTv6: the simplest case
65   2.2. NPTv6 between peer networks
66   2.3. NPTv6 redundancy and load-sharing
67   2.4. NPTv6 multihoming
68   2.5. Mapping with No Per-Flow State
69   2.6. Checksum-Neutral Mapping
70 3. NPTv6 Algorithmic Specification
71   3.1. NPTv6 configuration calculations
72   3.2. NPTv6 translation, internal network to external 73 network
74   3.3. NPTv6 translation, external network to internal 75 network
76   3.4. NPTv6 with a /48 or shorter prefix
77   3.5. NPTv6 with a /49 or longer prefix
78   3.6. /48 Prefix Mapping Example
79   3.7. Address Mapping for Longer Prefixes
80 4. Implications of Network Address Translator Behavioral 81 Requirements
82   4.1. Prefix configuration and generation
83   4.2. Subnet numbering
84   4.3. NAT Behavioral Requirements
85 5. Implications for Applications
86   5.1. Recommendation for network planners considering use 87 of NPTv6 Translation
88   5.2. Recommendations for application writers
89   5.3. Recommendation for future work
90 6. A Note on Port Mapping
91 7. Security Considerations
92 8. IANA Considerations
93 9. Acknowledgements
94 10. Change Log
95   10.1. Changes Between draft-mrw-behave-nat66-00 and -01
96   10.2. Changes between *behave-nat66-01 and -02
97   10.3. Changes between *nat66-00 and *nat66-01
98   10.4. Changes between *nat66-01 and *nat66-02
99   10.5. Changes between *nat66-02 and *nat66-03
100   10.6. Changes between *nat66-03 and *nat66-04
101   10.7. Changes between *nat66-04 and *nat66-05
102   10.8. Changes between *nat66-05 and *nat66-06
103   10.9. Changes between *nat66-06 and *nat66-07
104   10.10. Changes between *nat66-07 and *nat66-08
105   10.11. Changes up to *nat66-10
106   10.12. Changes up to *nat66-11 and -12
107 11. References
108   11.1. Normative References
109   11.2. Informative References
110 Appendix A. Why GSE?
111 Appendix B. Verification code
112 Authors' Addresses
114 1. Introduction 116 This document describes a stateless IPv6-to-IPv6 Network Prefix 117 Translation (NPTv6) function, designed to provide address 118 independence to the edge network. It is transport-agnostic with 119 respect to transports that don't checksum the IP header, such as 120 SCTP, and to transports that use the TCP/UDP/DCCP pseudo-header and 121 checksum [RFC1071]. 123 This has several ramifications: 125 o Any security benefit that NAPT44 might offer is not present in 126 NPTv6, necessitating the use of a firewall to obtain those 127 benefits if desired. An example of such a firewall is described 128 in [RFC6092].
130 o End to end reachability is preserved, although the address used 131 "inside" the edge network differs from the address used "outside" 132 the edge network. This has implications for application referrals 133 and other uses of Internet layer addresses. 135 o If there are multiple identically-configured prefix translators 136 between two networks, there is no need for them to exchange 137 dynamic state, because there is none: the algorithmic 138 translation is identical across all of them. The network 139 can therefore asymmetrically route, load-share, and fail over 140 among them without issue. 142 o Since translation is 1:1 at the network layer, there is no need to 143 modify port numbers or other transport parameters. 145 o TCP sessions that authenticate peers using the TCP Authentication 146 Option [RFC5925] cannot have their addresses translated, as the 147 addresses are used in the calculation of the Message 148 Authentication Code. This consideration applies in general to any 149 UNilateral Self-Address Fixing (UNSAF) [RFC3424] Protocol, whose 150 deployment the IAB recommends against in an environment 151 that changes Internet addresses. 153 o Applications using the Internet Key Exchange Protocol Version 2 154 (IKEv2) [RFC5996] should, at least in theory, detect the presence 155 of the translator; while no NAT traversal solution is required, 156 [RFC5996] would require such sessions to use UDP. 158 1.1. What is Address Independence? 160 For the purposes of this document, IPv6 Address Independence consists 161 of the following set of properties: 163 From the perspective of the edge network: 165 * The IPv6 addresses used inside the local network (for 166 interfaces, access lists, and logs) do not need to be 167 renumbered if the global prefix(es) assigned for use by the 168 edge network are changed.
170 * The IPv6 addresses used inside the edge network (for 171 interfaces, access lists, and logs) or within other upstream 172 networks (such as when multihoming) do not need to be 173 renumbered when a site adds, drops, or changes upstream 174 networks. 176 * It is not necessary for an administration to convince an 177 upstream network to route its internal IPv6 prefixes, or for it 178 to advertise prefixes derived from other upstream networks into 179 it. 181 * Unless it wants to optimize routing between multiple upstream 182 networks in the process of multihoming, there is therefore no 183 need for a BGP exchange with the upstream network. 185 From the perspective of the upstream network: 187 * IPv6 addresses used by the edge network are guaranteed to have 188 a provider-allocated prefix, eliminating the need and concern 189 for BCP 38 [RFC2827] ingress filtering and the advertisement of 190 customer-specific prefixes. 192 Thus, address independence has ramifications for the edge network, 193 networks it directly connects with (especially its upstream 194 networks), and for the Internet as a whole. The desire for address 195 independence has been a primary driver for IPv4 NAT deployment in 196 medium to large-sized enterprise networks, including NAT deployments 197 in enterprises that have plenty of IPv4 provider independent address 198 space (from IPv4 "swamp space"). It has also been a driver for edge 199 networks to become members of Regional Internet Registry (RIR) 200 communities, seeking to obtain BGP Autonomous System Numbers and 201 provider independent prefixes, and as a result has been one of the 202 drivers of the explosion of the IPv4 route table. Service providers 203 have stated that the lack of address independence from their 204 customers has been a negative incentive to deployment, due to the 205 impact of customer routing expected in their networks. 
207 The Local Network Protection [RFC4864] document discusses a related 208 concept called "Address Autonomy" as a benefit of NAPT44. [RFC4864] 209 indicates that address autonomy can be achieved by the simultaneous 210 use of global addresses on all nodes within a site that need external 211 connectivity, and Unique Local Addresses (ULAs) [RFC4193] for all 212 internal communication. However, this solution fails to meet the 213 requirement for address independence, because if an ISP renumbering 214 event occurs, all of the hosts, routers, DHCP servers, ACLs, 215 firewalls and other internal systems that are configured with global 216 addresses from the ISP will need to be renumbered before global 217 connectivity is fully restored. 219 The use of IPv6 Provider Independent (PI) addresses has also been 220 suggested as a means to fulfill the address independence requirement. 221 However, this solution requires that an enterprise qualify to receive 222 a PI assignment and persuade its ISP to install specific routes for 223 the enterprise's PI addresses. There are a number of practical 224 issues with this approach, especially if there is a desire to route 225 to a geographically and topologically diverse set of sites, 226 which can sometimes involve coordinating with several ISPs to route 227 portions of a single PI prefix. These problems have caused numerous 228 enterprises with plenty of IPv4 swamp space to choose to use IPv4 NAT 229 for part, or substantially all, of their internal network instead of 230 using their provider independent address space. 232 1.2. NPTv6 Applicability 234 NPTv6 provides a simple and compelling solution to meet the Address 235 Independence requirement in IPv6. The address independence benefit 236 stems directly from the translation function of the network prefix 237 translator.
To avoid as many of the issues associated with NAPT44 as 238 possible, NPTv6 is defined to include a two-way, checksum-neutral, 239 algorithmic translation function, and nothing else. 241 The fact that NPTv6 does not map ports and is checksum-neutral avoids 242 the need for an NPTv6 Translator to re-write transport layer headers. 243 This makes it feasible to deploy new or improved transport layer 244 protocols without upgrading NPTv6 Translators. Similarly, since 245 NPTv6 does not re-write transport layer headers, NPTv6 will not 246 interfere with encryption of the full IP payload in many cases. 248 The default NPTv6 address mapping mechanism is purely algorithmic, so 249 NPTv6 translators do not need to maintain per-node or per-connection 250 state, allowing deployment of more robust and adaptive networks than 251 can be deployed using NAPT44. Since the default NPTv6 mapping can be 252 performed in either direction, it does not interfere with inbound 253 connection establishment, thus allowing internal nodes to participate 254 in direct Peer-to-Peer applications without the application layer 255 overhead one finds in many IPv4 Peer-to-Peer applications. 257 Although NPTv6 compares favorably to NAPT44 in several ways, it does 258 not eliminate all of the architectural problems associated with IPv4 259 NAT, as described in [RFC2993]. NPTv6 involves modifying IP headers 260 in transit, so it is not compatible with security mechanisms, such as 261 the IPsec Authentication Header, that provide integrity protection 262 for the IP header. NPTv6 may interfere with the use of application 263 protocols that transmit IP addresses in the application-specific 264 portion of the IP packet. These applications currently require 265 application layer gateways (ALGs) to work correctly through NAPT44 266 devices, and similar ALGs may be required for these applications to 267 work through NPTv6 Translators. 
The use of separate internal and 268 external prefixes creates complexity for DNS deployment, due to the 269 desire for internal nodes to communicate with other internal nodes 270 using internal addresses, while external nodes need to obtain 271 external addresses to communicate with the same nodes. This 272 frequently results in the deployment of "split DNS", which may add 273 complexity to network configuration. 275 The choice of address within the edge network bears consideration. 276 One could use a ULA, which maximizes address independence. That 277 could also be considered a misuse of the ULA; if the expectation is 278 that a ULA prevents access to a system from outside the range of the 279 ULA, NPTv6 overrides that. On the other hand, the administration is 280 aware that it has made that choice, and could if it desired deploy a 281 second ULA for the purpose of privacy; the only prefix that will be 282 translated is one that has an NPTv6 Translator configured to 283 translate to or from it. Also, using any other global scope address 284 format makes one either obtain a PI prefix or be at the mercy of the 285 agency from which it was allocated. 287 There are significant technical impacts associated with the 288 deployment of any prefix translation mechanism, including NPTv6, and 289 we strongly encourage anyone who is considering the implementation or 290 deployment of NPTv6 to read [RFC4864] and [RFC5902], and to carefully 291 consider the alternatives described in that document, some of which 292 may cause fewer problems than NPTv6. 294 2. NPTv6 Overview 296 NPTv6 may be implemented in an IPv6 router to map one IPv6 address 297 prefix to another IPv6 prefix as each IPv6 packet transits the 298 router. A router that implements an NPTv6 prefix translation 299 function is referred to as an NPTv6 Translator. 301 2.1. 
NPTv6: the simplest case 303 In its simplest form, an NPTv6 Translator interconnects two network 304 links, one of which is an "internal" network link attached to a leaf 305 network within a single administrative domain, and the other of which 306 is an "external" network with connectivity to the global Internet. 307 All of the hosts on the internal network will use addresses from a 308 single, locally-routed prefix, and those addresses will be translated 309 to/from addresses in a globally-routable prefix as IP packets transit 310 the NPTv6 Translator. The lengths of these two prefixes will be 311 functionally the same; if they differ, the longer of the two will 312 limit the ability to use subnets in the shorter.
314      External Network:  Prefix = 2001:0DB8:0001:/48
315      --------------------------------------
316                         |
317                         |
318                  +-------------+
319                  |    NPTv6    |
320                  |  Translator |
321                  +-------------+
322                         |
323                         |
324      --------------------------------------
325      Internal Network:  Prefix = FD01:0203:0405:/48
327 Figure 1: A simple translator
329 Figure 1 shows an NPTv6 Translator attached to two networks. In this 330 example, the internal network uses IPv6 Unique Local Addresses (ULAs) 331 [RFC4193] to represent the internal IPv6 nodes, and the external 332 network uses globally routable IPv6 addresses to represent the same 333 nodes. 335 When an NPTv6 Translator forwards packets in the "outbound" 336 direction, from the internal network to the external network, NPTv6 337 overwrites the IPv6 source prefix (in the IPv6 header) with a 338 corresponding external prefix. When packets are forwarded in the 339 "inbound" direction, from the external network to the internal 340 network, the IPv6 destination prefix is overwritten with a 341 corresponding internal prefix.
Using the prefixes shown in the 342 diagram above, as an IP packet passes through the NPTv6 Translator in 343 the outbound direction, the source prefix (FD01:0203:0405:/48) will 344 be overwritten with the external prefix (2001:0DB8:0001:/48). In an 345 inbound packet, the destination prefix (2001:0DB8:0001:/48) will be 346 overwritten with the internal prefix (FD01:0203:0405:/48). In both 347 cases, it is the local IPv6 prefix that is overwritten; the remote 348 IPv6 prefix remains unchanged. Nodes on the internal network are 349 said to be "behind" the NPTv6 Translator. 351 2.2. NPTv6 between peer networks 353 NPTv6 can also be used between two private networks. In these cases, 354 both networks may use ULA prefixes, with each subnet in one network 355 mapped into a corresponding subnet in the other network, and vice 356 versa. Or, each network may use ULA prefixes for internal 357 addressing, and global unicast addresses on the other network.
359             Internal Prefix = FD01:4444:5555:/48
360             --------------------------------------
361                  V             |      External Prefix
362                  V             |      2001:0DB8:6666:/48
363                  V        +---------+      ^
364                  V        |  NPTv6  |      ^
365                  V        |  Device |      ^
366                  V        +---------+      ^
367          External Prefix       |           ^
368          2001:0DB8:0001:/48    |           ^
369             --------------------------------------
370             Internal Prefix = FD01:0203:0405:/48
372 Figure 2: Flow of Information in Translation
374 2.3. NPTv6 redundancy and load-sharing 376 In some cases, more than one NPTv6 Translator may be attached to a 377 network, as shown in Figure 3. In such cases, NPTv6 Translators are 378 configured with the same internal and external prefixes. Since there 379 is only one translation, even though there are multiple translators, 380 they map only one external address (prefix and IID) to the internal 381 address.
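As an illustration of why identically configured translators need no shared state, the following Python sketch applies the /48 mapping that Section 3 specifies. It is a minimal sketch under stated assumptions, not the draft's verification code: the names fold, words_of, to_text, make_translator, outbound, and inbound are hypothetical, only /48 prefixes are handled, and the discard rule for subnet 0xFFFF is omitted.

```python
import ipaddress


def fold(total):
    """Fold carries above bit 15 back in (one's complement addition)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def words_of(text):
    """Split an IPv6 address into eight 16-bit words."""
    packed = ipaddress.IPv6Address(text).packed
    return [int.from_bytes(packed[i:i + 2], "big") for i in range(0, 16, 2)]


def to_text(words):
    """Join eight 16-bit words back into an IPv6 address string."""
    return str(ipaddress.IPv6Address(b"".join(w.to_bytes(2, "big") for w in words)))


def make_translator(internal, external):
    """Return hypothetical (outbound, inbound) functions for two /48 prefixes."""
    inner = words_of(internal)[:3]
    outer = words_of(external)[:3]
    # Adjustment: sum of the internal prefix words minus sum of the external ones.
    adjustment = fold(fold(sum(inner)) + (~fold(sum(outer)) & 0xFFFF))

    def remap(addr, old, new, delta):
        w = words_of(addr)
        if w[:3] != old:
            raise ValueError("address is outside the configured prefix")
        w[:3] = new                # overwrite the 48-bit prefix
        w[3] = fold(w[3] + delta)  # offset the change in bits 48..63
        if w[3] == 0xFFFF:
            w[3] = 0
        return to_text(w)

    def outbound(addr):
        return remap(addr, inner, outer, adjustment)

    def inbound(addr):
        # Subtracting the adjustment is adding its bitwise inverse.
        return remap(addr, outer, inner, ~adjustment & 0xFFFF)

    return outbound, inbound


# Two translators configured identically; neither knows about the other.
out_a, in_a = make_translator("fd01:203:405::", "2001:db8:1::")
out_b, in_b = make_translator("fd01:203:405::", "2001:db8:1::")

external = out_a("fd01:203:405:1::1234")
print(external)        # 2001:db8:1:d550::1234
print(in_b(external))  # fd01:203:405:1::1234
```

With the prefixes of Figure 1, the outbound function of the first translator and the inbound function of the second agree on the mapping, because each derives it from its configuration alone. The external address printed here is the one worked out by hand in Section 3.6.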
383      External Network:  Prefix = 2001:0DB8:0001:/48
384      --------------------------------------
385           |                          |
386           |                          |
387    +-------------+            +-------------+
388    |    NPTv6    |            |    NPTv6    |
389    |  Translator |            |  Translator |
390    |      #1     |            |      #2     |
391    +-------------+            +-------------+
392           |                          |
393           |                          |
394      --------------------------------------
395      Internal Network:  Prefix = FD01:0203:0405:/48
396 Figure 3: Parallel Translators
398 2.4. NPTv6 multihoming
400 External Network #1:              External Network #2:
401 Prefix = 2001:0DB8:0001:/48       Prefix = 2001:0DB8:5555:/48
402 ---------------------------       --------------------------
403              |                                 |
404              |                                 |
405       +-------------+                   +-------------+
406       |    NPTv6    |                   |    NPTv6    |
407       |  Translator |                   |  Translator |
408       |      #1     |                   |      #2     |
409       +-------------+                   +-------------+
410              |                                 |
411              |                                 |
412            --------------------------------------
413            Internal Network:  Prefix = FD01:0203:0405:/48
415 Figure 4: Parallel Translators with different upstream networks
417 When multihoming, NPTv6 Translators are attached to an internal 418 network, as shown in Figure 4, but connected to different external 419 networks. In such cases, NPTv6 Translators are configured with the 420 same internal prefix, but different external prefixes. Since there 421 are multiple translations, they map multiple external addresses 422 (prefix and IID) to the common internal address. A system within the 423 edge network cannot determine which external address it is 424 using without the help of services such as STUN. 426 Multihoming in this sense has one negative feature as compared with 427 multihoming with a provider independent address; when routes change 428 between NPTv6 Translators, since the upstream network changes, the 429 translated prefix can change. This would cause sessions and 430 referrals dependent on it to fail as well. This is not expected to 431 be a major issue, however, in networks where routing is generally 432 stable. 434 2.5.
Mapping with No Per-Flow State 436 When NPTv6 is used as described in this document, no per-node or per- 437 flow state is maintained in the NPTv6 Translator. Both inbound and 438 outbound packets are translated algorithmically, using only 439 information found in the IPv6 header. Due to this property, NPTv6's 440 two-way, algorithmic address mapping can support both outbound and 441 inbound connection establishment without the need for state-priming 442 or rendezvous mechanisms, or the maintenance of mapping state. This 443 is a significant improvement over NAPT44 devices, but it also has 444 significant security implications which are described in Section 7. 446 2.6. Checksum-Neutral Mapping 448 When a change is made to one of the IP header fields in the IPv6 449 pseudo-header checksum (such as one of the IP addresses), the 450 checksum field in the transport layer header may become invalid. 451 Fortunately, an incremental change in the area covered by the 452 Internet standard checksum [RFC1071] will result in a well-defined 453 change to the checksum value [RFC1624]. So, a checksum change caused 454 by modifying part of the area covered by the checksum can be 455 corrected by making a complementary change to a different 16-bit 456 field covered by the same checksum. 458 The NPTv6 mapping mechanisms described in this document are checksum- 459 neutral, which means that they result in IP headers that will 460 generate the same IPv6 pseudo-header checksum when the checksum is 461 calculated using the standard Internet checksum algorithm [RFC1071]. 462 Any changes that are made during translation of the IPv6 prefix are 463 offset by changes to other parts of the IPv6 address. 
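The complementary change can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: ones_sum and compensate are hypothetical helper names, the address words are those of the internal address used in the Section 3.6 example, and overwriting a single word (rather than the whole prefix) is a simplification chosen to keep the arithmetic visible.

```python
def ones_sum(words):
    """One's complement sum of 16-bit words, folding carries back in."""
    total = sum(words)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def compensate(words, i, new_value, j):
    """Overwrite words[i] with new_value and offset the change in words[j]."""
    out = list(words)
    # delta = old value minus new value, in one's complement arithmetic
    delta = ones_sum([words[i], ~new_value & 0xFFFF])
    out[i] = new_value
    out[j] = ones_sum([out[j], delta])
    return out


# FD01:0203:0405:0001::1234 as eight 16-bit words
before = [0xFD01, 0x0203, 0x0405, 0x0001, 0, 0, 0, 0x1234]
# Replace the first word and absorb the difference in the subnet word (index 3)
after = compensate(before, 0, 0x2001, 3)

print(hex(after[3]))                        # 0xdd01
print(ones_sum(before) == ones_sum(after))  # True
```

Because the one's complement sum over the address is unchanged, any checksum computed over a pseudo-header that contains the address is unchanged as well.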
As a result, 464 transport layers that use the Internet checksum (such as TCP and 465 UDP) calculate the same IPv6 pseudo-header checksum for both the 466 internal and external forms of the same packet, which avoids the need 467 for the NPTv6 Translator to modify those transport layer headers to 468 correct the checksum value. 470 As noted in Section 4.2, this mapping leaves an edge network 471 that uses a /48 external prefix unable to use subnet 0xFFFF. 473 3. NPTv6 Algorithmic Specification 475 The IPv6 address format [RFC4291] is reproduced for clarity in Figure 5.
477  0    15 16   31 32   47 48   63 64   79 80   95 96  111 112 127
478 +-------+-------+-------+-------+-------+-------+-------+-------+
479 |     Routing Prefix    | Subnet|   Interface Identifier (IID)  |
480 +-------+-------+-------+-------+-------+-------+-------+-------+
482 Figure 5: Enumeration of the IPv6 Address [RFC4291]
484 3.1. NPTv6 configuration calculations 486 When an NPTv6 Translation function is configured, it is configured 487 with 489 o one or more "internal" interfaces with their "internal" routing 490 domain prefixes, and 492 o one or more "external" interfaces with their "external" routing 493 domain prefixes. 495 In the simple case, there is one of each. If a single router 496 provides NPTv6 translation services between a multiplicity of domains 497 (as might be true when multihoming), each internal/external pair must 498 be thought of as a separate NPTv6 Translator from the perspective of 499 this specification. 501 When an NPTv6 Translator is configured, the translation function 502 first ensures that the internal and external prefixes are the same 503 length, if necessary by extending the shorter of the two with zeroes. 504 These two prefixes will be used in the prefix translation function 505 described in Section 3.2 and Section 3.3. 507 They are then zero-extended to /64, for the purposes of a 508 calculation.
The translation function calculates the one's complement 509 sum of the 16-bit words of the /64 external prefix, and likewise of the /64 510 internal prefix. It then calculates the difference between these 511 values: internal minus external. This value, called the 512 "adjustment", is effectively constant for the lifetime of the NPTv6 513 Translator configuration, and is used in per-packet processing. 515 3.2. NPTv6 translation, internal network to external network 517 When a datagram passes through the NPTv6 Translator from an internal 518 to an external network, either the datagram is discarded or its IPv6 Source Address is changed in two 519 ways: 521 o If the internal subnet number has no mapping, such as being 0xFFFF 522 or simply not mapped, discard the datagram. This SHOULD result in 523 an ICMP Destination Unreachable. 525 o Otherwise, the internal prefix is overwritten with the external prefix, in 526 effect subtracting the difference between the two checksums (the 527 adjustment) from the pseudo-header's checksum, and 529 o A 16-bit word of the address has the adjustment added to it using 530 one's complement arithmetic. If the result is 0xFFFF, it is 531 overwritten as zero. The choice of word is as specified in 532 Section 3.4 or Section 3.5 as appropriate. 534 3.3. NPTv6 translation, external network to internal network 536 When a datagram passes through the NPTv6 Translator from an external 537 to an internal network, its IPv6 Destination Address is changed in 538 two ways: 540 o The external prefix is overwritten with the internal prefix, in 541 effect adding the difference between the two checksums (the 542 adjustment) to the pseudo-header's checksum, and 544 o A 16-bit word of the address has the adjustment subtracted from it 545 (bitwise inverted and added to it) using one's complement 546 arithmetic. If the result is 0xFFFF, it is overwritten as zero. 547 The choice of word is as specified in Section 3.4 or Section 3.5 548 as appropriate. 550 3.4.
NPTv6 with a /48 or shorter prefix 552 When an NPTv6 Translator is configured with internal and external 553 prefixes that are 48 bits in length (a /48) or shorter, the 554 adjustment MUST be added to or subtracted from bits 48..63 of the 555 address. 557 This mapping results in no modification of the Interface Identifier 558 (IID), which is held in the lower half of the IPv6 address, so it 559 will not interfere with future protocols that may use unique IIDs for 560 node identification. 562 NPTv6 Translator implementations MUST implement the /48 mapping. 564 3.5. NPTv6 with a /49 or longer prefix 566 When an NPTv6 Translator is configured with internal and external 567 prefixes that are longer than 48 bits in length (such as a /52, /56, 568 or /60), the adjustment must be added to or subtracted from one of 569 the words in bits 64..79, 80..95, 96..111, or 112..127 of the 570 address. While the choice of word is immaterial as long as it is 571 consistent, for consistency's sake, these words MUST be inspected in 572 that sequence, and the first that is not initially 0xFFFF chosen. 574 NPTv6 Translator implementations SHOULD implement the mapping for 575 longer prefixes. 577 3.6. /48 Prefix Mapping Example 579 For the network shown in Figure 1, the Internal Prefix is 580 FD01:0203:0405:/48, and the External Prefix is 2001:0DB8:0001:/48. 582 If a node with internal address FD01:0203:0405:0001::1234 sends an 583 outbound packet through the NPTv6 Translator, the resulting external 584 address will be 2001:0DB8:0001:D550::1234. The resulting address is 585 obtained by calculating the checksum of both the internal and 586 external 48-bit prefixes, subtracting the internal prefix checksum from the 587 external prefix checksum using one's complement arithmetic to calculate the 588 "adjustment", and adding the adjustment to the 16-bit subnet field 589 (in this case 0x0001). 591 To show the work: 593 The one's complement checksum of FD01:0203:0405 is 0xFCF5. The one's 594 complement checksum of 2001:0DB8:0001 is 0xD245. Using one's 595 complement arithmetic, 0xD245 - 0xFCF5 = 0xD54F. The subnet in the 596 original packet is 0x0001. Using one's complement arithmetic, 0x0001 597 + 0xD54F = 0xD550. Since 0xD550 != 0xFFFF, it is not changed to 598 0x0000. 600 So, the value 0xD550 is written in the 16-bit subnet area, resulting 601 in a mapped external address of 2001:0DB8:0001:D550::1234. 603 When a response packet is received, it will contain the destination 604 address 2001:0DB8:0001:D550::1234, which will be mapped using the 605 inverse mapping algorithm, back to FD01:0203:0405:0001::1234. 607 In this case, the difference between the two prefixes will be 608 calculated as follows: 610 Using one's complement arithmetic, 0xFCF5 - 0xD245 = 0x2AB0. The 611 subnet in the original packet is 0xD550. Using one's complement 612 arithmetic, 0xD550 + 0x2AB0 = 0x0001 (the carry out of bit 15 is added back in). Since 0x0001 != 0xFFFF, it is 613 not changed to 0x0000. 615 So the value 0x0001 is written into the subnet field, and the 616 internal value of the subnet field is properly restored. 618 3.7. Address Mapping for Longer Prefixes 620 If the prefix being mapped is longer than 48 bits, the algorithm is 621 slightly more complex. A common case will be that the internal and 622 external prefixes are of different length. In such a case, the 623 shorter prefix is zero-extended to the length of the longer as 624 described in Section 3.1 for the purposes of overwriting the prefix. 625 Then, they are both zero-extended to 64 bits to facilitate one's 626 complement arithmetic. The "adjustment" is calculated using those 627 64-bit prefixes. 629 For example, if the internal prefix is a /48 ULA and the external 630 prefix is a /56 provider-allocated prefix, the ULA becomes a /56 with 631 zeros in bits 48..55. For purposes of one's complement arithmetic, 632 they are then both zero-extended to 64 bits.
A side-effect of this 633 is that a subset of the subnets possible in the shorter prefix are 634 untranslatable. While the security value of this is debatable, the 635 administration may choose to use them for subnets that it knows need 636 no external accessibility. 638 We then find the first word in the IID that does not have the value 639 0xFFFF, trying bits 64..79, and then 80..95, 96..111, and finally 640 112..127. We perform the same calculation (with the same proof of 641 correctness) as in Section 3.6, but applying it to that word. 643 Although any 16-bit portion of an IPv6 IID could contain 0xFFFF, an 644 IID of all-ones is a reserved anycast identifier that should not be 645 used on the network [RFC2526]. If an NPTv6 Translator discovers a 646 packet with an IID of all-zeros while performing address mapping, 647 that packet MUST be dropped, and an ICMPv6 Parameter Problem error 648 SHOULD be generated [RFC4443]. 650 Note: this mechanism does involve modification of the IID; it may not 651 be compatible with future mechanisms that use unique IIDs for node 652 identification. 654 4. Implications of Network Address Translator Behavioral Requirements 656 4.1. Prefix configuration and generation 658 NPTv6 Translators MUST support manual configuration of internal and 659 external prefixes, and MUST NOT place any restrictions on those 660 prefixes except that they be valid IPv6 unicast prefixes as described 661 in [RFC4291]. They MAY also support random generation of ULA 662 addresses on command. Since the most common place anticipated for 663 the implementation of an NPTv6 Translator is a CPE router, the reader 664 is urged to consider the requirements of 665 [I-D.ietf-v6ops-ipv6-cpe-router]. 667 4.2. Subnet numbering 669 For reasons detailed in Appendix B, a network using NPTv6 Translation 670 and a /48 external prefix MUST NOT use the value 0xFFFF to designate 671 a subnet that it expects to be translated. 673 4.3. 
NAT Behavioral Requirements 675 NPTv6 Translators MUST support hairpinning behavior, as defined in 676 the NAT Behavioral Requirements for UDP document [RFC4787]. This 677 means that when an NPTv6 Translator receives a packet on the internal 678 interface that has a destination address that matches the site's 679 external prefix, it will translate the packet and forward it 680 internally. This allows internal nodes to reach other internal nodes 681 using their external, global addresses when necessary. 683 Conceptually, the datagram leaves the domain (is translated as 684 described in Section 3.2), and returns (is again translated as 685 described in Section 3.3). As a result, the datagram exchange will 686 be through the NPTv6 Translator in both directions for the lifetime 687 of the session. The alternative would be to require the NPTv6 688 Translator to drop the datagram, forcing the sender to use the 689 correct internal prefix for its peer. Performing only the external- 690 to-internal translation results in the datagram being sent from the 691 untranslated internal address of the source to the translated and 692 therefore internal address of its peer, which would enable the 693 session to bypass the NPTv6 Translator for future datagrams. It 694 would also mean that the original sender would be unlikely to 695 recognize the response when it arrived. 697 Because NPTv6 does not perform port mapping and uses a one-to-one, 698 reversible mapping algorithm, none of the other NAT behavioral 699 requirements apply to NPTv6. 701 5. Implications for Applications 703 NPTv6 Translation does not create several of the problems known to 704 exist with other kinds of NATs and discussed in [RFC2993]. 
In 705 particular: NPTv6 Translation is stateless, so a "reset" or brief 706 outage of an NPTv6 Translator does not break connections that 707 traverse the translation function, and if multiple NPTv6 Translators 708 exist between the same two networks, load can shift or be dynamically 709 load-shared among them. Also, an NPTv6 Translator does not aggregate 710 traffic for several hosts/interfaces behind a smaller number of 711 external addresses, so there is no inherent expectation for an NPTv6 712 Translator to block new inbound flows from external hosts, and no 713 issue with a filter or blacklist associated with one prefix within 714 the domain affecting another. A firewall can of course be used in 715 conjunction with an NPTv6 Translator; this would allow the network 716 administrator more flexibility to specify security policy than would 717 be possible with a traditional NAT. 719 However, NPTv6 Translation does create difficulties for some kinds of 720 applications. Some examples include: 722 o An application instance "behind" an NPTv6 Translator will see a 723 different address for its connections than its peers "outside" the 724 NPTv6 Translator. 726 o An application instance "outside" an NPTv6 Translator will see a 727 different address for its connections than any peer "inside" an 728 NPTv6 Translator. 730 o An application instance wishing to establish communication with a 731 peer "behind" an NPTv6 Translator may need to use a different 732 address to reach that peer depending on whether the instance is 733 behind the same NPTv6 Translator or external to it. Since an 734 NPTv6 Translator implements hairpinning (Section 4.3), it suffices 735 for applications to always use their external addresses. However, 736 this creates inefficiencies in the local network and may also 737 complicate implementation of the NPTv6 Translator. [RFC3484] would 738 also prefer the private address in such a case in order to reduce 739 those inefficiencies.
741 o An application instance which moves from a realm "behind" an NPTv6 742 Translator to a realm that is "outside" the network, or vice 743 versa, may find that it is no longer able to reach its peers at 744 the same addresses it was previously able to use. 746 o An application instance which is intermittently communicating with 747 a peer that moves from behind an NPTv6 Translator to "outside" of 748 it, or vice versa, may find that it is no longer able to reach 749 that peer at the same address that it had previously used. 751 Many, but not all, of the applications which are adversely affected 752 by NPTv6 Translation are those that do "referrals" - where an 753 application instance passes its own addresses, and/or addresses of 754 its peers, to other peers. (Some believe referrals are inherently 755 undesirable; others believe that they are necessary in some 756 circumstances. A discussion of the merits of referrals, or lack 757 thereof, is beyond the scope of this document.) 759 To some extent, the incidence of these difficulties can be reduced by 760 DNS hacks that attempt to expose addresses "behind" an NPTv6 761 Translator only to hosts which are also behind the same NPTv6 762 Translator; and perhaps also, to expose only the "internal" addresses 763 of hosts behind the NPTv6 Translator to other hosts behind the same 764 NPTv6 Translator. However, this cannot be a complete solution. 
A 765 full discussion of these issues is out of scope for this document, 766 but briefly: (a) reliance on DNS to solve this problem depends on 767 hosts always making queries to DNS servers in the same realm as 768 they are (or on DNS interception proxies, which create their own 769 problems), and on mobile hosts/applications not caching those 770 results; (b) reliance on DNS to solve this problem depends on network 771 administrators on all networks using such applications to reliably 772 and accurately maintain current DNS entries for every host using 773 those applications; and (c) reliance on DNS to solve this problem 774 depends on applications always using DNS names, even though they 775 often must run in environments where DNS names are not reliably 776 maintained for every host. Other issues are that there is often no 777 single distinguished name for a host, no reliable way for a host to 778 determine what DNS names are associated with it, and no reliable way to determine which names are 779 appropriate to use in which contexts. 781 5.1. Recommendation for network planners considering use of NPTv6 782 Translation 784 In light of the above, network planners considering the use of NPTv6 785 Translation should carefully consider the kinds of applications that 786 they will need to run in the future, and determine whether the 787 address stability and provider independence benefits are consistent 788 with their application requirements. 790 5.2. Recommendations for application writers 792 Several mechanisms (e.g., STUN, TURN, ICE) have been used with 793 traditional IPv4 NAT to circumvent some of the limitations of such 794 devices. Similar mechanisms could also be applied to circumvent some 795 of the issues with an NPTv6 Translator. However, all of these require 796 the assistance of an external server or a function co-located with 797 the translator that can tell an "internal" host what its "external" 798 addresses are. 800 5.3.
Recommendation for future work 802 It might be desirable to define a general mechanism which would allow 803 hosts within a translation domain to determine their external 804 addresses and/or request that inbound traffic be permitted. If such 805 a mechanism were to be defined, it would ideally be general enough to 806 also accommodate other types of NAT likely to be encountered by IPv6 807 applications - in particular, IPv4/IPv6 Translation 808 [I-D.ietf-behave-v6v4-framework] [I-D.ietf-behave-dns64] 809 [I-D.ietf-behave-v6v4-xlate] [I-D.ietf-behave-v6v4-xlate-stateful] 810 [RFC6052]. For this and other reasons, such a mechanism is beyond 811 the scope of this document. 813 6. A Note on Port Mapping 815 In addition to overwriting IP addresses when packets are forwarded, 816 NAPT44 devices overwrite the source port number in outbound traffic, 817 and the destination port number in inbound traffic. This mechanism 818 is called "port mapping". 820 The major benefit of port mapping is that it allows multiple 821 computers to share a single IPv4 address. A large number of internal 822 IPv4 addresses (typically from one of the [RFC1918] private address 823 spaces) can be mapped into a single external, globally routable IPv4 824 address, with the local port number used to identify which internal 825 node should receive each inbound packet. This address amplification 826 feature is not generally foreseen as a necessity for IPv6 at this time. 828 Since port mapping requires rewriting a portion of the transport 829 layer header, it requires NAPT44 devices to be aware of all of the 830 transport protocols that they forward, thus stifling the development 831 of new and improved transport protocols and preventing the use of 832 IPsec encryption.
Modifying the transport layer header is 833 incompatible with security mechanisms that encrypt the full IP 834 payload, and restricts the NAPT44 to forwarding transport layers that 835 use weak checksum algorithms that are easily recalculated in routers. 837 Since there is significant detriment caused by modifying transport 838 layer headers and very little, if any, benefit to the use of port 839 mapping in IPv6, NPTv6 Translators that comply with this 840 specification MUST NOT perform port mapping. 842 7. Security Considerations 844 When NPTv6 is deployed using either of the two-way, algorithmic 845 mappings defined in this document, it allows direct inbound 846 connections to internal nodes. While this can be viewed as a benefit 847 of NPTv6 vs. NAPT44, it does open internal nodes to attacks that 848 would be more difficult in a NAPT44 network. Although this situation 849 is not substantially worse, from a security standpoint, than running 850 IPv6 with no NAT, some enterprises may assume that an NPTv6 851 Translator will offer similar protection to a NAPT44 device. 853 The port mapping mechanism in NAPT44 implementations requires that 854 state be created in both directions. This has led to an industry-wide 855 perception that NAT functionality is the same as a stateful 856 firewall. It is not. The translation function of the NAT only 857 creates dynamic state in one direction and has no policy. For this 858 reason, it is RECOMMENDED that NPTv6 Translators also implement 859 firewall functionality such as described in [RFC6092], with 860 appropriate configuration options including turning it on or off. 862 When [RFC4864] talks about randomizing the subnet identifier, the 863 idea is to make it harder for worms to guess a valid subnet 864 identifier at an advertised network prefix. This should not be 865 interpreted as endorsing concealing the subnet identifier behind the 866 obfuscating function of a translator such as NPTv6.
[RFC4864] 867 specifically talks about how to obtain the desired properties of 868 concealment without using a translator. Topology hiding when using 869 NAT is often ineffective in environments where the topology is 870 visible in application layer messaging protocols such as DNS, SIP, 871 SMTP, etc. If the information were not available through the 872 application layer, [RFC2993] would not be valid. 874 Due to the potential interactions with IKEv2/IPsec NAT traversal, it 875 would be valuable to test interactions of NPTv6 with various aspects 876 of current-day IKEv2/IPsec NAT traversal. 878 8. IANA Considerations 880 This document has no IANA considerations. 882 9. Acknowledgements 884 The checksum-neutral algorithmic address mapping described in this 885 document is based on e-mail written by Iljtsch van Beijnum. 887 The following people provided advice or review comments that 888 substantially improved this document: Allison Mankin, Christian 889 Huitema, Dave Thaler, Ed Jankiewicz, Eric Kline, Iljtsch van Beijnum, 890 Jari Arkko, Keith Moore, Mark Townsley, Merike Kaeo, Ralph Droms, 891 Remi Depres, Steve Blake, and Tony Hain. 893 This document was written using the xml2rfc tool described in RFC 894 2629 [RFC2629]. 896 10. Change Log 898 This section should be removed by the RFC Editor. 900 10.1. Changes Between draft-mrw-behave-nat66-00 and -01 902 There were several minor changes made between the *behave-nat66-00 903 and -01 versions of this draft: 905 o Added Fred Baker as a co-author. 907 o Minor arithmetic corrections. 909 o Added AH to paragraph on NAT security issues. 911 o Added additional NAT topologies to overview (diagrams TBD). 913 10.2. Changes between *behave-nat66-01 and -02 915 There were further changes made between *behave-nat66-01 and -02: 917 o Removed topology hiding mechanism. 919 o Added diagrams. 921 o Made minor updates based on mailing list feedback. 923 o Added discussion of IPv6 SAF document. 925 o Added applicability section. 
927 o Added discussion of Address Independence requirement. 929 o Added hairpinning requirement and discussion of applicability of 930 other NAT behavioral requirements. 932 10.3. Changes between *nat66-00 and *nat66-01 934 There were further changes made between nat66-01 and nat66-02: 936 o Added mapping for prefixes longer than /48. 938 o Change draft name to remove reference to the behave WG. 940 o Resolved various open issues and fixed typos. 942 10.4. Changes between *nat66-01 and *nat66-02 944 o Change the acronym "NAT66" to "NPTv6", so people don't read "NAT" 945 and MEGO. 947 o Change the term used to refer to the function from "NAT66 device" 948 to "NPTv6 Translator". It's not a "device" function, it's a 949 function that is applied between two interfaces. Consider a 950 router with two upstreams and two legs in the local network; it 951 will not translate between the local legs, but will translate to 952 and from each upstream, and be configured differently for each of 953 the two ISPs. 955 o Comment specifically on the security aspects. 957 o Comment specifically on the application issues raised on this 958 list. 960 o Comment specifically on multihoming, load-sharing, and asymmetric 961 routing. 963 o Spell out the hairpinning requirement and its implications. 965 o Spell out the service provider side of Address Independence. 967 o 00 focuses on the edge's view 969 o Detail the algorithm in a manner clearer to the implementor (I 970 think) 972 o Spell out the case for GSE-style DMZs between the edge and the 973 transit network, which is about the implications for the global 974 routing table. 976 o Refer to [RFC6092] as a CPE firewall description. 978 10.5. Changes between *nat66-02 and *nat66-03 980 o Added an appendix on Verification code 982 o Various minor markups in response to Ralph Droms 984 10.6. Changes between *nat66-03 and *nat66-04 986 o Markups in response to Christian Huitema, mostly surrounding the 987 issue of subnet 0xFFFF. 
989 o Refer to [I-D.ietf-v6ops-ipv6-cpe-router] for CPE router 990 requirements. 992 10.7. Changes between *nat66-04 and *nat66-05 994 o Update statistics in appendix A per BGP report of 17 December 2010 996 o Update security considerations using text supplied by Merike Kaeo. 998 10.8. Changes between *nat66-05 and *nat66-06 1000 o Restore a code snippet inadvertently removed in version -05 1002 10.9. Changes between *nat66-06 and *nat66-07 1004 o Changed requested status to experimental 1006 o Incorporated comments from Eric Kline 1008 10.10. Changes between *nat66-07 and *nat66-08 1010 The section on Application Considerations was expanded after 1011 discussion with Keith Moore. 1013 10.11. Changes up to *nat66-10 1015 Address review comments during IETF Last Call and the Transport 1016 Directorate Review. 1018 10.12. Changes up to *nat66-11 and -12 1020 Address Dave Thaler's comments, mostly editorial, but also addressing 1021 UNSAF protocols like the TCP Authentication Option. 1023 11. References 1025 11.1. Normative References 1027 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1028 Requirement Levels", BCP 14, RFC 2119, March 1997. 1030 [RFC2526] Johnson, D. and S. Deering, "Reserved IPv6 Subnet Anycast 1031 Addresses", RFC 2526, March 1999. 1033 [RFC4193] Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast 1034 Addresses", RFC 4193, October 2005. 1036 [RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing 1037 Architecture", RFC 4291, February 2006. 1039 [RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control 1040 Message Protocol (ICMPv6) for the Internet Protocol 1041 Version 6 (IPv6) Specification", RFC 4443, March 2006. 1043 [RFC4787] Audet, F. and C. Jennings, "Network Address Translation 1044 (NAT) Behavioral Requirements for Unicast UDP", BCP 127, 1045 RFC 4787, January 2007. 1047 11.2. Informative References 1049 [GSE] O'Dell, M., "GSE - An Alternate Addressing Architecture 1050 for IPv6", February 1997, 1051 .
1053 [I-D.ietf-behave-dns64] 1054 Bagnulo, M., Sullivan, A., Matthews, P., and I. Beijnum, 1055 "DNS64: DNS extensions for Network Address Translation 1056 from IPv6 Clients to IPv4 Servers", 1057 draft-ietf-behave-dns64-11 (work in progress), 1058 October 2010. 1060 [I-D.ietf-behave-v6v4-framework] 1061 Baker, F., Li, X., Bao, C., and K. Yin, "Framework for 1062 IPv4/IPv6 Translation", 1063 draft-ietf-behave-v6v4-framework-10 (work in progress), 1064 August 2010. 1066 [I-D.ietf-behave-v6v4-xlate] 1067 Li, X., Bao, C., and F. Baker, "IP/ICMP Translation 1068 Algorithm", draft-ietf-behave-v6v4-xlate-23 (work in 1069 progress), September 2010. 1071 [I-D.ietf-behave-v6v4-xlate-stateful] 1072 Bagnulo, M., Matthews, P., and I. Beijnum, "Stateful 1073 NAT64: Network Address and Protocol Translation from IPv6 1074 Clients to IPv4 Servers", 1075 draft-ietf-behave-v6v4-xlate-stateful-12 (work in 1076 progress), July 2010. 1078 [I-D.ietf-v6ops-ipv6-cpe-router] 1079 Singh, H., Beebee, W., Donley, C., Stark, B., and O. 1080 Troan, "Basic Requirements for IPv6 Customer Edge 1081 Routers", draft-ietf-v6ops-ipv6-cpe-router-09 (work in 1082 progress), December 2010. 1084 [NIST] NIST, "Draft NIST Framework and Roadmap for Smart Grid 1085 Interoperability, Release 1.0", September 2009. 1087 [RFC1071] Braden, R., Borman, D., Partridge, C., and W. Plummer, 1088 "Computing the Internet checksum", RFC 1071, 1089 September 1988. 1091 [RFC1624] Rijsinghani, A., "Computation of the Internet Checksum via 1092 Incremental Update", RFC 1624, May 1994. 1094 [RFC1918] Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and 1095 E. Lear, "Address Allocation for Private Internets", 1096 BCP 5, RFC 1918, February 1996. 1098 [RFC2629] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629, 1099 June 1999. 1101 [RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: 1102 Defeating Denial of Service Attacks which employ IP Source 1103 Address Spoofing", BCP 38, RFC 2827, May 2000. 
1105 [RFC2993] Hain, T., "Architectural Implications of NAT", RFC 2993, 1106 November 2000. 1108 [RFC3424] Daigle, L. and IAB, "IAB Considerations for UNilateral 1109 Self-Address Fixing (UNSAF) Across Network Address 1110 Translation", RFC 3424, November 2002. 1112 [RFC3484] Draves, R., "Default Address Selection for Internet 1113 Protocol version 6 (IPv6)", RFC 3484, February 2003. 1115 [RFC4864] Van de Velde, G., Hain, T., Droms, R., Carpenter, B., and 1116 E. Klein, "Local Network Protection for IPv6", RFC 4864, 1117 May 2007. 1119 [RFC5902] Thaler, D., Zhang, L., and G. Lebovitz, "IAB Thoughts on 1120 IPv6 Network Address Translation", RFC 5902, July 2010. 1122 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 1123 Authentication Option", RFC 5925, June 2010. 1125 [RFC5996] Kaufman, C., Hoffman, P., Nir, Y., and P. Eronen, 1126 "Internet Key Exchange Protocol Version 2 (IKEv2)", 1127 RFC 5996, September 2010. 1129 [RFC6052] Bao, C., Huitema, C., Bagnulo, M., Boucadair, M., and X. 1130 Li, "IPv6 Addressing of IPv4/IPv6 Translators", RFC 6052, 1131 October 2010. 1133 [RFC6092] Woodyatt, J., "Recommended Simple Security Capabilities in 1134 Customer Premises Equipment (CPE) for Providing 1135 Residential IPv6 Internet Service", RFC 6092, 1136 January 2011. 1138 Appendix A. Why GSE? 1140 For the purpose of this discussion, let us over-simplify the 1141 Internet's structure by distinguishing between two broad classes of 1142 networks: transit and edge. A "transit network", in this context, is 1143 a network that provides connectivity services to other networks. Its 1144 AS number may show up in a non-final position in BGP AS paths, or in 1145 the case of mobile and residential broadband networks, it may offer 1146 network services to smaller networks that can't justify RIR 1147 membership. 
An "edge network", in contrast, is any network that is 1148 not a transit network; it is the ultimate customer, and while it 1149 provides internal connectivity for its own use, it is in other 1150 respects a consumer of transit services. In terms of routing, a 1151 network in the transit domain generally needs some way to make 1152 choices about how it routes to other networks; an edge network is 1153 generally quite satisfied with a simple default route. 1155 The [GSE] proposal, and as a result this proposal (which is similar 1156 to GSE in most respects and inspired by it), responds directly to 1157 current concerns in the RIR communities. Edge networks are used to 1158 an environment in IPv4 in which their addressing is disjoint from 1159 that of their upstream transit networks; it is either provider 1160 independent, or a network prefix translator makes their external 1161 address distinct from their internal address, and they like the 1162 distinction. In IPv6, there is a mantra that edge network addresses 1163 should be derived from their upstream, and if they have multiple 1164 upstreams, edge networks are expected to design their networks to use 1165 all of those prefixes equivalently. They see this as unnecessary and 1166 unwanted operational complexity, and are as a result pushing very 1167 hard in the RIR communities for provider independent addressing. 1169 Widespread use of provider independent addressing has a natural and 1170 perhaps unavoidable side-effect that is likely to be very expensive 1171 in the long term. It means that the routing table will enumerate the 1172 networks at the edge of the transit domain, the edge networks, rather 1173 than enumerating the transit domain. Per the BGP Update Report of 17 1174 December 2010, there are currently over 36,000 Autonomous Systems 1175 being advertised in BGP, of which over 15,000 advertise only one 1176 prefix. 
There are in the neighborhood of 5000 AS's that show up in a 1177 non-final position in AS paths, and perhaps another 5000 networks 1178 whose AS numbers are terminal in more than one AS path. In other 1179 words, we have prefixes for some 36,000 transit and edge networks in 1180 the route table now, many of which arguably need an Autonomous System 1181 number only for multihoming. 1183 However, the vast majority of networks (2/3) having the tools 1184 necessary to multihome are not visibly doing so, and would be well 1185 served by any solution that gives them address independence without 1186 the overhead of RIR membership and BGP routing. 1188 Current growth estimates suggest that we could easily see that number be on 1189 the order of 10,000,000 within fifteen years. Tens of thousands of 1190 entries in the route table are very survivable; while our protocols 1191 and computers will likely do quite well with tens of millions of 1192 routes, the heat produced and power consumed by those routers, and 1193 the inevitable impact on the cost of those routers, is not a good 1194 outcome. To avoid having a massive and unscalable route table, we 1195 need to find a way that is politically acceptable and returns us to 1196 enumerating the transit domain, not the edge. 1198 There have been a number of proposals. As described, shim6 moves the 1199 complexity to the edge, and the edge is rebelling. Geographic 1200 addressing in essence forces ISPs to "own" geographic territory from 1201 a routing perspective, as otherwise there is no clue in the address 1202 as to what network a datagram should be delivered to in order to 1203 reach it. Metropolitan Addressing can imply regulatory authority, 1204 and even if it is implemented using internet exchange consortia, 1205 visits a great deal of complexity on the transit networks that 1206 directly serve the edge.
The one that is likely to be most 1207 acceptable is any proposal that enables an edge network to be 1208 operationally independent of its upstreams, with no obligation to 1209 renumber when it adds, drops, or changes ISPs, and with no additional 1210 burden placed either on the ISP or the edge network as a result. 1211 From an application perspective, an additional operational 1212 requirement in the words of Roadmap for the Smart Grid [NIST], is 1213 that 1215 "...the Network should enable an application in a particular 1216 domain to communicate with an application in any other domain in 1217 the information network, with proper management control over who 1218 and where applications can be interconnected." 1220 In other words, the structure of the network should allow for and 1221 enable appropriate access control, but the structure of the network 1222 should not inherently limit access. 1224 The GSE model, by statelessly translating the prefix between an edge 1225 network and its upstream transit network, accomplishes that with a 1226 minimum of fuss and bother. Stated in the simplest terms, it enables 1227 the edge network to behave as if it has a provider independent prefix 1228 from a multihoming and renumbering perspective without the overhead 1229 of RIR membership or maintaining BGP connectivity, and it enables the 1230 transit networks to aggressively aggregate what are from their 1231 perspective provider-allocated customer prefixes, to maintain a 1232 rational-sized routing table. 1234 Appendix B. Verification code 1236 This non-normative appendix is presented as a proof of concept. It 1237 is in no sense optimized; for example, one's complement arithmetic is 1238 implemented in portable subroutines, where operational 1239 implementations might use one's complement arithmetic instructions 1240 through a pragma; such implementations probably need to explicitly 1241 force 0xFFFF to 0x0000, as the instruction will not. 
The original 1242 purpose of the code was to verify whether or not it was necessary to 1243 suppress 0xFFFF by overwriting with zero, and whether predicted 1244 issues with subnet numbering were real. 1246 The point is to 1248 o demonstrate that if one or the other representation of zero is not 1249 used in the word the checksum is updated in, the program maps 1250 inner and outer addresses in a manner that is, mathematically, 1:1 1251 and onto (each inner address maps to a unique outer address, and 1252 that outer address maps back to exactly the same inner address), 1253 and 1255 o give guidance on the suppression of 0xFFFF checksums. 1257 In short, in one's complement arithmetic, x-x=0, but will take the 1258 negative representation of zero. If 0xFFFF results are forced to the 1259 value 0x0000, as is recommended in [RFC1071], the word the checksum 1260 is adjusted in cannot be initially 0xFFFF, as on the return it will 1261 be forced to 0. If 0xFFFF results are not forced to the value 0x0000 1262 as is recommended in [RFC1071], the word the checksum is adjusted in 1263 cannot be initially 0, as on the return it will be calculated as 1264 0+(~0) = 0xFFFF. We chose to follow [RFC1071]'s recommendations, 1265 which implies a requirement to not use 0xFFFF as a subnet number in 1266 networks with a /48 external prefix. 1268 /* 1269 * Copyright (c) 2010 IETF Trust and the persons identified as 1270 * authors of the code. All rights reserved. Redistribution 1271 * and use in source and binary forms, with or without 1272 * modification, are permitted provided that the following 1273 * conditions are met: 1274 * 1275 * o Redistributions of source code must retain the above 1276 * copyright notice, this list of conditions and the 1277 * following disclaimer. 
1278 * 1279 * o Redistributions in binary form must reproduce the above 1280 * copyright notice, this list of conditions and the 1281 * following disclaimer in the documentation and/or other 1282 * materials provided with the distribution. 1283 * 1284 * o Neither the name of Internet Society, IETF or IETF Trust, 1285 * nor the names of specific contributors, may be used to 1286 * endorse or promote products derived from this software 1287 * without specific prior written permission. 1288 * 1289 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND 1290 * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, 1291 * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 1292 * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 1293 * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR 1294 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 1295 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT 1296 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 1297 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 1298 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 1299 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR 1300 * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 1301 * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 1302 */ 1303 #include "stdio.h" 1304 #include "assert.h" 1305 /* 1306 * program to verify the NPTv6 algorithm 1307 * 1308 * argument: 1309 * perform negative zero suppression: boolean 1310 * 1311 * method: 1312 * We specify an internal and an external prefix. The prefix 1313 * length is presumed to be the common length of both, and for 1314 * this is a /48. We perform the three algorithms specified. 1315 * the "packet" address is in effect the source address 1316 * internal->external and the destination address 1317 * external->internal. 
    */
   unsigned short inner_init[] = {
       0xFD01, 0x0203, 0x0405, 1, 2, 3, 4, 5};
   unsigned short outer_init[] = {
       0x2001, 0x0db8, 0x0001, 1, 2, 3, 4, 5};
   unsigned short inner[8];
   unsigned short packet[8];
   unsigned char checksum[65536] = {0};
   unsigned short outer[8];
   unsigned short adjustment;
   unsigned short suppress;   /* nonzero: force 0xFFFF results to 0 */

   /*
    * One's complement sum.
    * return number1 + number2
    */
   unsigned short
   add1(unsigned short number1, unsigned short number2)
   {
       unsigned int result;

       result = number1;
       result += number2;
       /* fold the carry back in; with suppression, 0xFFFF becomes 0 */
       if (suppress) {
           while (0xFFFF <= result) {
               result = result + 1 - 0x10000;
           }
       } else {
           while (0xFFFF < result) {
               result = result + 1 - 0x10000;
           }
       }
       return (unsigned short)result;
   }

   /*
    * One's complement difference
    * return number1 - number2
    */
   unsigned short
   sub1(unsigned short number1, unsigned short number2)
   {
       /* subtracting is adding the one's complement (bitwise NOT) */
       return add1(number1, (unsigned short)~number2);
   }

   /*
    * return one's complement sum of an array of numbers
    */
   unsigned short
   sum1(unsigned short *numbers, int count)
   {
       unsigned int result;

       result = *numbers++;
       while (--count > 0) {
           result += *numbers++;
       }

       /* fold the carries back in; with suppression, 0xFFFF becomes 0 */
       if (suppress) {
           while (0xFFFF <= result) {
               result = result + 1 - 0x10000;
           }
       } else {
           while (0xFFFF < result) {
               result = result + 1 - 0x10000;
           }
       }
       return (unsigned short)result;
   }

   /*
    * NPTv6 initialization: section 3.1 assuming section 3.4
    *
    * Create the /48, a source address in internal format, and a
    * source address in external format.  Calculate the adjustment
    * if one /48 is overwritten with the other.
    */
   void
   nptv6_initialization(unsigned short subnet)
   {
       int i;
       unsigned short inner48;
       unsigned short outer48;

       /* initialize the internal and external prefixes. */
       for (i = 0; i < 8; i++) {
           inner[i] = inner_init[i];
           outer[i] = outer_init[i];
       }
       inner[3] = subnet;
       outer[3] = subnet;
       /* calculate the checksum adjustment */
       inner48 = sum1(inner, 3);
       outer48 = sum1(outer, 3);
       adjustment = sub1(inner48, outer48);
   }

   /*
    * NPTv6 packet from edge to transit: section 3.2 assuming
    * section 3.4
    *
    * overwrite the prefix in the source address with the outer
    * prefix, and adjust the checksum
    */
   void
   nptv6_inner_to_outer(void)
   {
       int i;

       /* let's get the source address into the packet */
       for (i = 0; i < 8; i++) {
           packet[i] = inner[i];
       }
       /* overwrite the prefix with the outer prefix */
       for (i = 0; i < 3; i++) {
           packet[i] = outer[i];
       }

       /* adjust the checksum */
       packet[3] = add1(packet[3], adjustment);
   }

   /*
    * NPTv6 packet from transit to edge: section 3.3 assuming
    * section 3.4
    *
    * overwrite the prefix in the destination address with the
    * inner prefix, and adjust the checksum
    */
   void
   nptv6_outer_to_inner(void)
   {
       int i;

       /* overwrite the prefix with the inner prefix */
       for (i = 0; i < 3; i++) {
           packet[i] = inner[i];
       }

       /* adjust the checksum */
       packet[3] = sub1(packet[3], adjustment);
   }

   /*
    * main program
    */
   int
   main(int argc, char **argv)
   {
       unsigned subnet;
       int i;

       if (argc < 2) {
           fprintf(stderr, "usage: nptv6 suppression\n");
           return 1;
       }
       suppress = atoi(argv[1]);
       assert(suppress <= 1);

       for (subnet = 0; subnet < 0x10000; subnet++) {
           /* section 3.1: initialize the system */
           nptv6_initialization(subnet);

           /* section 3.2: take a packet from inside to outside */
           nptv6_inner_to_outer();

           /* the resulting checksum value should be unique */
           if (checksum[subnet]) {
               printf("inner->outer duplicated checksum: "
                      "inner: %x:%x:%x:%x:%x:%x:%x:%x(%x) "
                      "calculated: %x:%x:%x:%x:%x:%x:%x:%x(%x)\n",
                      inner[0], inner[1], inner[2], inner[3],
                      inner[4], inner[5], inner[6], inner[7],
                      sum1(inner, 8),
                      packet[0], packet[1], packet[2], packet[3],
                      packet[4], packet[5], packet[6], packet[7],
                      sum1(packet, 8));
           }

           checksum[subnet] = 1;

           /*
            * the resulting checksum should be the same as the inner
            * address's checksum
            */
           if (sum1(packet, 8) != sum1(inner, 8)) {
               printf("inner->outer incorrect: "
                      "inner: %x:%x:%x:%x:%x:%x:%x:%x(%x) "
                      "calculated: %x:%x:%x:%x:%x:%x:%x:%x(%x)\n",
                      inner[0], inner[1], inner[2], inner[3],
                      inner[4], inner[5], inner[6], inner[7],
                      sum1(inner, 8),
                      packet[0], packet[1], packet[2], packet[3],
                      packet[4], packet[5], packet[6], packet[7],
                      sum1(packet, 8));
           }

           /* section 3.3: take a packet from outside to inside */
           nptv6_outer_to_inner();

           /*
            * the returning packet should have the same checksum it
            * left with
            */
           if (sum1(packet, 8) != sum1(inner, 8)) {
               printf("outer->inner checksum incorrect: "
                      "calculated: %x:%x:%x:%x:%x:%x:%x:%x(%x) "
                      "inner: %x:%x:%x:%x:%x:%x:%x:%x(%x)\n",
                      packet[0], packet[1], packet[2], packet[3],
                      packet[4], packet[5], packet[6], packet[7],
                      sum1(packet, 8), inner[0], inner[1], inner[2],
                      inner[3], inner[4], inner[5], inner[6],
                      inner[7], sum1(inner, 8));
           }

           /*
            * and every 16-bit word should calculate back to the same
            * inner value
            */
           for (i = 0; i < 8; i++) {
               if (inner[i] != packet[i]) {
                   printf("outer->inner different: "
                          "calculated: %x:%x:%x:%x:%x:%x:%x:%x "
                          "inner: %x:%x:%x:%x:%x:%x:%x:%x\n",
                          packet[0], packet[1], packet[2], packet[3],
                          packet[4], packet[5], packet[6], packet[7],
                          inner[0], inner[1], inner[2], inner[3],
                          inner[4], inner[5], inner[6], inner[7]);
                   break;
               }
           }
       }
       return 0;
   }

Authors' Addresses

   Margaret Wasserman
   Painless Security
   North Andover, MA  01845
   USA

   Phone: +1 781 405 7464
   Email: mrw@painless-security.com
   URI:   http://www.painless-security.com


   Fred Baker
   Cisco Systems
   Santa Barbara, California  93117
   USA

   Phone: +1-408-526-4257
   Email: fred@cisco.com