Network Working Group                                       M. Wasserman
Internet-Draft                                         Painless Security
Intended status: Experimental                                   F. Baker
Expires: September 11, 2011                                Cisco Systems
                                                          March 10, 2011


                IPv6-to-IPv6 Network Prefix Translation
                           draft-mrw-nat66-10

Abstract

   This document describes a stateless, transport-agnostic IPv6-to-IPv6
   Network Prefix Translation (NPTv6) function that provides the address
   independence benefit associated with IPv4-to-IPv4 NAT (NAPT44), and
   in addition provides a 1:1 relationship between addresses in the
   "inside" and "outside" prefixes, preserving end to end reachability
   at the network layer.

Requirements Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 11, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  What is Address Independence?
     1.2.  NPTv6 Applicability
   2.  NPTv6 Overview
     2.1.  NPTv6: the simplest case
     2.2.  NPTv6 between peer networks
     2.3.  NPTv6 redundancy and load-sharing
     2.4.  NPTv6 multihoming
     2.5.  Mapping with No Per-Flow State
     2.6.  Checksum-Neutral Mapping
   3.  NPTv6 Algorithmic Specification
     3.1.  NPTv6 configuration calculations
     3.2.  NPTv6 translation, internal network to external network
     3.3.  NPTv6 translation, external network to internal network
     3.4.  NPTv6 with a /48 or shorter prefix
     3.5.  NPTv6 with a /49 or longer prefix
     3.6.  /48 Prefix Mapping Example
     3.7.  Address Mapping for Longer Prefixes
   4.  Implications of Network Address Translator Behavioral
       Requirements
     4.1.  Prefix configuration and generation
     4.2.  Subnet numbering
     4.3.  NAT Behavioral Requirements
   5.  Implications for Applications
     5.1.  Recommendation for network planners considering use of
           NPTv6 Translator
     5.2.  Recommendations for application writers
     5.3.  Recommendation for future work
   6.  A Note on Port Mapping
   7.  Security Considerations
   8.  IANA Considerations
   9.  Acknowledgements
   10. Change Log
     10.1.  Changes Between draft-mrw-behave-nat66-00 and -01
     10.2.  Changes between *behave-nat66-01 and -02
     10.3.  Changes between *nat66-00 and *nat66-01
     10.4.  Changes between *nat66-01 and *nat66-02
     10.5.  Changes between *nat66-02 and *nat66-03
     10.6.  Changes between *nat66-03 and *nat66-04
     10.7.  Changes between *nat66-04 and *nat66-05
     10.8.  Changes between *nat66-05 and *nat66-06
     10.9.  Changes between *nat66-06 and *nat66-07
     10.10. Changes between *nat66-07 and *nat66-08
     10.11. Changes up to *nat66-10
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Why GSE?
   Appendix B.  Verification code
   Authors' Addresses

1.  Introduction

   This document describes a stateless IPv6-to-IPv6 Network Prefix
   Translation (NPTv6) function, designed to provide address
   independence to the edge network. It is transport-agnostic with
   respect to transports that don't checksum the IP header, such as
   SCTP, and to transports that use the TCP/UDP/DCCP pseudo-header and
   checksum [RFC1071].

   This has several ramifications:

   o  Any security benefit that NAPT44 might offer is not present in
      NPTv6, necessitating the use of a firewall to obtain those
      benefits if desired. An example of such a firewall is described
      in [RFC6092].

   o  End to end reachability is preserved, although the address used
      "inside" the edge network differs from the address used "outside"
      the edge network. This has implications for application referrals
      and other uses of Internet layer addresses.

   o  If there are multiple identically-configured prefix translators
      between two networks, there is no need for them to exchange
      dynamic state, as there is no dynamic state - the algorithmic
      translation will be identical across each of them. The network
      can therefore asymmetrically route, load-share, and fail-over
      among them without issue.

   o  Since translation is 1:1 at the network layer, there is no need to
      modify port numbers or other transport parameters.

   o  TCP sessions that authenticate peers using the TCP
      Authentication Option [RFC5925] cannot have their addresses
      translated, as the addresses are used in the calculation of the
      Message Authentication Code.

   o  Applications using the Internet Key Exchange Protocol Version 2
      (IKEv2) [RFC5996] should, at least in theory, detect the presence
      of the translator; while no NAT traversal solution is required,
      [RFC5996] would require such sessions to use UDP.

1.1.  What is Address Independence?

   For the purposes of this document, IPv6 Address Independence consists
   of the following set of properties:

   From the perspective of the edge network:

      *  The IPv6 addresses used inside the local network (for
         interfaces, access lists, and logs) do not need to be
         renumbered if the global prefix(es) assigned for use by the
         edge network are changed.

      *  The IPv6 addresses used inside the edge network (for
         interfaces, access lists, and logs) or within other upstream
         networks (such as when multihoming) do not need to be
         renumbered when a site adds, drops, or changes upstream
         networks.

      *  It is not necessary for an administration to convince an
         upstream network to route its internal IPv6 prefixes, or for it
         to advertise prefixes derived from other upstream networks into
         it.

      *  Unless it wants to optimize routing between multiple upstream
         networks in the process of multihoming, there is therefore no
         need for a BGP exchange with the upstream network.

   From the perspective of the upstream network:

      *  IPv6 addresses used by the edge network are guaranteed to have
         a provider-allocated prefix, eliminating the need and concern
         for BCP 38 [RFC2827] ingress filtering and the advertisement of
         customer-specific prefixes.

   Thus, address independence has ramifications for the edge network,
   networks it directly connects with (especially its upstream
   networks), and for the Internet as a whole. The desire for address
   independence has been a primary driver for IPv4 NAT deployment in
   medium to large-sized enterprise networks, including NAT deployments
   in enterprises that have plenty of IPv4 provider-independent address
   space (from IPv4 "swamp space"). It has also been a driver for edge
   networks to become members of Regional Internet Registry (RIR)
   communities, seeking to obtain BGP Autonomous System Numbers and
   provider-independent prefixes, and as a result has been one of the
   drivers of the explosion of the IPv4 route table. Service providers
   have stated that the lack of address independence from their
   customers has been a negative incentive to deployment, due to the
   impact of customer routing expected in their networks.

   The Local Network Protection [RFC4864] document discusses a related
   concept called "Address Autonomy" as a benefit of NAPT44. [RFC4864]
   indicates that address autonomy can be achieved by the simultaneous
   use of global addresses on all nodes within a site that need external
   connectivity, and Unique Local Addresses (ULAs) [RFC4193] for all
   internal communication. However, this solution fails to meet the
   requirement for address independence, because if an ISP renumbering
   event occurs, all of the hosts, routers, DHCP servers, ACLs,
   firewalls and other internal systems that are configured with global
   addresses from the ISP will need to be renumbered before global
   connectivity is fully restored.

   The use of IPv6 Provider Independent (PI) addresses has also been
   suggested as a means to fulfill the address independence requirement.
   However, this solution requires that an enterprise qualify to receive
   a PI assignment and persuade their ISP to install specific routes for
   the enterprise's PI addresses. There are a number of practical
   issues with this approach, especially if there is a desire to route
   to a number of geographically and topologically diverse sites,
   which can sometimes involve coordinating with several ISPs to route
   portions of a single PI prefix. These problems have caused numerous
   enterprises with plenty of IPv4 swamp space to choose to use IPv4 NAT
   for part, or substantially all, of their internal network instead of
   using their provider-independent address space.

1.2.  NPTv6 Applicability

   NPTv6 provides a simple and compelling solution to meet the Address
   Independence requirement in IPv6. The address independence benefit
   stems directly from the translation function of the network prefix
   translator. To avoid as many of the issues associated with NAPT44 as
   possible, NPTv6 is defined to include a two-way, checksum-neutral,
   algorithmic translation function, and nothing else.
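
   As an informal illustration of what "checksum-neutral" means, the
   sketch below compares the one's complement sum of the eight 16-bit
   words of an internal address and of its translated external form.
   The two addresses are the ones used in the worked example of
   Section 3.6; the function name ones_complement_sum is an assumption
   made for this sketch and is not part of the specification.

```python
import ipaddress


def ones_complement_sum(addr):
    """One's complement sum of the eight 16-bit words of an IPv6 address.

    Hypothetical helper for illustration only.
    """
    packed = ipaddress.IPv6Address(addr).packed
    total = 0
    for i in range(0, 16, 2):
        total += int.from_bytes(packed[i:i + 2], "big")
    # Fold any carries back into the low 16 bits (end-around carry).
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)
    return total


# Internal and external forms of the same node, taken from Section 3.6.
internal = "FD01:0203:0405:0001::1234"
external = "2001:0DB8:0001:D550::1234"

# Both addresses contribute the same amount to a pseudo-header checksum,
# so a transport checksum computed over one remains valid for the other.
assert ones_complement_sum(internal) == ones_complement_sum(external)
```

   Because the two sums agree, a translator that rewrites one address
   into the other has no reason to touch the transport header.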

   The fact that NPTv6 does not map ports and is checksum-neutral avoids
   the need for an NPTv6 Translator to re-write transport layer headers.
   This makes it feasible to deploy new or improved transport layer
   protocols without upgrading NPTv6 Translators. Similarly, since
   NPTv6 does not re-write transport-layer headers, NPTv6 will not
   interfere with encryption of the full IP payload in many cases.

   The default NPTv6 address mapping mechanism is purely algorithmic, so
   NPTv6 translators do not need to maintain per-node or per-connection
   state, allowing deployment of more robust and adaptive networks than
   can be deployed using NAPT44. Since the default NPTv6 mapping can be
   performed in either direction, it does not interfere with inbound
   connection establishment, thus allowing internal nodes to participate
   in direct Peer-to-Peer applications without the application layer
   overhead one finds in many IPv4 Peer-to-Peer applications.

   Although NPTv6 compares favorably to NAPT44 in several ways, it does
   not eliminate all of the architectural problems associated with IPv4
   NAT, as described in [RFC2993]. NPTv6 involves modifying IP headers
   in transit, so it is not compatible with security mechanisms, such as
   the IPsec Authentication Header, that provide integrity protection
   for the IP header. NPTv6 may interfere with the use of application
   protocols that transmit IP addresses in the application-specific
   portion of the IP packet. These applications currently require
   application layer gateways (ALGs) to work correctly through NAPT44
   devices, and similar ALGs may be required for these applications to
   work through NPTv6 Translators. The use of separate internal and
   external prefixes creates complexity for DNS deployment, due to the
   desire for internal nodes to communicate with other internal nodes
   using internal addresses, while external nodes need to obtain
   external addresses to communicate with the same nodes. This
   frequently results in the deployment of "split DNS", which may add
   complexity to network configuration.

   The choice of address within the edge network bears consideration.
   One could use a ULA, which maximizes address independence. That
   could also be considered a misuse of the ULA; if the expectation is
   that a ULA prevents access to a system from outside the range of the
   ULA, NPTv6 overrides that. On the other hand, the administration is
   aware that it has made that choice, and could if it desired deploy a
   second ULA for the purpose of privacy; the only prefix that will be
   translated is one that has an NPTv6 Translator configured to
   translate to or from it. Also, using any other global scope address
   format makes one either obtain a PI prefix or be at the mercy of the
   agency from which it was allocated.

   There are significant technical impacts associated with the
   deployment of any prefix translation mechanism, including NPTv6, and
   we strongly encourage anyone who is considering the implementation or
   deployment of NPTv6 to read [RFC4864], and to carefully consider the
   alternatives described in that document, some of which may cause
   fewer problems than NPTv6.

2.  NPTv6 Overview

   NPTv6 may be implemented in an IPv6 router to map one IPv6 address
   prefix to another IPv6 prefix as each IPv6 packet transits the
   router. A router that implements an NPTv6 prefix translation
   function is referred to as an NPTv6 Translator.

2.1.  NPTv6: the simplest case

   In its simplest form, an NPTv6 Translator interconnects two network
   links, one of which is an "internal" network link attached to a leaf
   network within a single administrative domain, and the other of which
   is an "external" network with connectivity to the global Internet.
   All of the hosts on the internal network will use addresses from a
   single, locally-routed prefix, and those addresses will be translated
   to/from addresses in a globally-routable prefix as IP packets transit
   the NPTv6 Translator. The lengths of these two prefixes will be
   functionally the same; if they differ, the longer of the two will
   limit the ability to use subnets in the shorter.

        External Network:  Prefix = 2001:0DB8:0001:/48
            --------------------------------------
                              |
                              |
                       +-------------+
                       |    NPTv6    |
                       |  Translator |
                       +-------------+
                              |
                              |
            --------------------------------------
        Internal Network:  Prefix = FD01:0203:0405:/48

                     Figure 1: A simple translator

   Figure 1 shows an NPTv6 Translator attached to two networks. In this
   example, the internal network uses IPv6 Unique Local Addresses (ULAs)
   [RFC4193] to represent the internal IPv6 nodes, and the external
   network uses globally routable IPv6 addresses to represent the same
   nodes.

   When an NPTv6 Translator forwards packets in the "outbound"
   direction, from the internal network to the external network, NPTv6
   overwrites the IPv6 source prefix (in the IPv6 header) with a
   corresponding external prefix. When packets are forwarded in the
   "inbound" direction, from the external network to the internal
   network, the IPv6 destination prefix is overwritten with a
   corresponding internal prefix. Using the prefixes shown in the
   diagram above, as an IP packet passes through the NPTv6 Translator in
   the outbound direction, the source prefix (FD01:0203:0405:/48) will
   be overwritten with the external prefix (2001:0DB8:0001:/48). In an
   inbound packet, the destination prefix (2001:0DB8:0001:/48) will be
   overwritten with the internal prefix (FD01:0203:0405:/48). In both
   cases, it is the local IPv6 prefix that is overwritten; the remote
   IPv6 prefix remains unchanged. Nodes on the internal network are
   said to be "behind" the NPTv6 Translator.

2.2.  NPTv6 between peer networks

   NPTv6 can also be used between two private networks. In these cases,
   both networks may use ULA prefixes, with each subnet in one network
   mapped into a corresponding subnet in the other network, and vice
   versa. Or, each network may use ULA prefixes for internal
   addressing, and global unicast addresses on the other network.

             Internal Prefix = FD01:4444:5555:/48
             --------------------------------------
                  V            |      External Prefix
                  V            |      2001:0DB8:6666:/48
                  V        +---------+      ^
                  V        |  NPTv6  |      ^
                  V        |  Device |      ^
                  V        +---------+      ^
          External Prefix      |            ^
          2001:0DB8:0001:/48   |            ^
             --------------------------------------
             Internal Prefix = FD01:0203:0405:/48

              Figure 2: Flow of Information in Translation

2.3.  NPTv6 redundancy and load-sharing

   In some cases, more than one NPTv6 Translator may be attached to a
   network, as shown in Figure 3. In such cases, NPTv6 Translators are
   configured with the same internal and external prefixes. Since there
   is only one translation, even though there are multiple translators,
   they map only one external address (prefix and IID) to the internal
   address.
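
   The property described above can be sketched in a few lines of
   Python. The sketch models only the /48 case of the algorithm
   specified in Section 3, using the prefixes of Figure 1 and the
   addresses of Section 3.6. The class name Nptv6Translator and its
   methods are assumptions made for this illustration, not a reference
   implementation; the sketch omits the prefix-match check, the discard
   of unmapped subnets, and prefixes longer than /48.

```python
import ipaddress


def to_words(addr):
    """Split an IPv6 address into its eight 16-bit words."""
    packed = ipaddress.IPv6Address(str(addr)).packed
    return [int.from_bytes(packed[i:i + 2], "big") for i in range(0, 16, 2)]


def oc_add(a, b):
    """Add two 16-bit values in one's complement (end-around carry)."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def oc_sum(words):
    """One's complement sum of a list of 16-bit words."""
    total = 0
    for word in words:
        total = oc_add(total, word)
    return total


class Nptv6Translator:
    """Hypothetical /48-only translator: configuration, but no flow state."""

    def __init__(self, internal_prefix, external_prefix):
        self.internal = to_words(internal_prefix)[:3]
        self.external = to_words(external_prefix)[:3]
        # Adjustment = sum(internal) minus sum(external), as in Section 3.1.
        self.adjustment = oc_add(oc_sum(self.internal),
                                 ~oc_sum(self.external) & 0xFFFF)

    def _rewrite(self, addr, new_prefix, delta):
        words = to_words(addr)
        subnet = oc_add(words[3], delta)   # bits 48..63 carry the adjustment
        if subnet == 0xFFFF:               # 0xFFFF is overwritten as zero
            subnet = 0
        words = new_prefix + [subnet] + words[4:]
        return ipaddress.IPv6Address(
            b"".join(w.to_bytes(2, "big") for w in words))

    def outbound(self, source):
        """Internal to external: add the adjustment."""
        return self._rewrite(source, self.external, self.adjustment)

    def inbound(self, destination):
        """External to internal: subtract (add the inverse of) the adjustment."""
        return self._rewrite(destination, self.internal,
                             ~self.adjustment & 0xFFFF)


# Two independently created translators with identical configuration.
first = Nptv6Translator("FD01:0203:0405::", "2001:0DB8:0001::")
second = Nptv6Translator("FD01:0203:0405::", "2001:0DB8:0001::")

inside = "FD01:0203:0405:0001::1234"
outside = first.outbound(inside)

# Either translator produces the same mapping, and either can reverse it.
assert outside == second.outbound(inside)
assert second.inbound(outside) == ipaddress.IPv6Address(inside)
print(outside)  # 2001:db8:1:d550::1234
```

   A packet may therefore leave through one translator and its reply
   return through the other without either device having learned
   anything from the other.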

        External Network:  Prefix = 2001:0DB8:0001:/48
            --------------------------------------
                 |                      |
                 |                      |
          +-------------+        +-------------+
          |  NPTv6      |        |  NPTv6      |
          |  Translator |        |  Translator |
          |  #1         |        |  #2         |
          +-------------+        +-------------+
                 |                      |
                 |                      |
            --------------------------------------
        Internal Network:  Prefix = FD01:0203:0405:/48

                     Figure 3: Parallel Translators

2.4.  NPTv6 multihoming

     External Network #1:          External Network #2:
    Prefix = 2001:0DB8:0001:/48    Prefix = 2001:0DB8:5555:/48
    ---------------------------    --------------------------
                 |                      |
                 |                      |
          +-------------+        +-------------+
          |  NPTv6      |        |  NPTv6      |
          |  Translator |        |  Translator |
          |  #1         |        |  #2         |
          +-------------+        +-------------+
                 |                      |
                 |                      |
            --------------------------------------
        Internal Network:  Prefix = FD01:0203:0405:/48

    Figure 4: Parallel Translators with different upstream networks

   When multihoming, NPTv6 Translators are attached to an internal
   network, as shown in Figure 4, but connected to different external
   networks. In such cases, NPTv6 Translators are configured with the
   same internal prefix, but different external prefixes. Since there
   are multiple translations, they map multiple external addresses
   (prefix and IID) to the common internal address. A system within the
   edge network is unable to determine which external address it is
   using without the help of services such as STUN.

   Multihoming in this sense has one negative feature as compared with
   multihoming with a provider-independent address; when routes change
   between NPTv6 Translators, since the upstream network changes, the
   translated prefix can change. This would cause sessions and
   referrals dependent on it to fail as well. This is not expected to
   be a major issue in practice, however, in networks where routing is
   generally stable.

2.5.  Mapping with No Per-Flow State

   When NPTv6 is used as described in this document, no per-node or
   per-flow state is maintained in the NPTv6 Translator. Both inbound
   and outbound packets are translated algorithmically, using only
   information found in the IPv6 header. Due to this property, NPTv6's
   two-way, algorithmic address mapping can support both outbound and
   inbound connection establishment without the need for state-priming
   or rendezvous mechanisms, or the maintenance of mapping state. This
   is a significant improvement over NAPT44 devices, but it also has
   significant security implications which are described in Section 7.

2.6.  Checksum-Neutral Mapping

   When a change is made to one of the IP header fields in the IPv6
   pseudo-header checksum (such as one of the IP addresses), the
   checksum field in the transport layer header may become invalid.
   Fortunately, an incremental change in the area covered by the
   Internet standard checksum [RFC1071] will result in a well-defined
   change to the checksum value [RFC1624]. So, a checksum change caused
   by modifying part of the area covered by the checksum can be
   corrected by making a complementary change to a different 16-bit
   field covered by the same checksum.

   The NPTv6 mapping mechanisms described in this document are
   checksum-neutral, which means that they result in IP headers that
   will generate the same IPv6 pseudo-header checksum when the checksum
   is calculated using the standard Internet checksum algorithm
   [RFC1071]. Any changes that are made during translation of the IPv6
   prefix are offset by changes to other parts of the IPv6 address.
   This results in transport layers that use the Internet checksum
   (such as TCP and UDP) calculating the same IPv6 pseudo-header
   checksum for both the internal and external forms of the same
   packet, which avoids the need for the NPTv6 Translator to modify
   those transport layer headers to correct the checksum value.

   As noted in Section 4.2, this mapping leaves an edge network that
   uses a /48 external prefix unable to use subnet 0xFFFF.

3.  NPTv6 Algorithmic Specification

   The [RFC4291] IPv6 Address is reproduced for clarity in Figure 5.

    0    15 16   31 32   47 48   63 64   79 80   95 96  111 112  127
   +-------+-------+-------+-------+-------+-------+-------+-------+
   |     Routing Prefix    | Subnet|   Interface Identifier (IID)  |
   +-------+-------+-------+-------+-------+-------+-------+-------+

          Figure 5: Enumeration of the IPv6 Address [RFC4291]

3.1.  NPTv6 configuration calculations

   When an NPTv6 Translation function is configured, it is configured
   with

   o  one or more "internal" interfaces with their "internal" routing
      domain prefixes, and

   o  one or more "external" interfaces with their "external" routing
      domain prefixes.

   In the simple case, there is one of each. If a single router
   provides NPTv6 translation services between a multiplicity of domains
   (as might be true when multihoming), each internal/external pair must
   be thought of as a separate NPTv6 Translator from the perspective of
   this specification.

   When an NPTv6 Translator is configured, the translation function
   first ensures that the internal and external prefixes are the same
   length, if necessary by extending the shorter of the two with zeroes.
   These two prefixes will be used in the prefix translation function
   described in Section 3.2 and Section 3.3.

   They are then zero-extended to /64, for the purposes of a
   calculation.
   The translation function calculates the ones-complement sum of the
   16-bit words of the /64 external prefix and the /64 internal prefix.
   It then calculates the difference between these values: internal
   minus external. This value, called the "adjustment", is effectively
   constant for the lifetime of the NPTv6 Translator configuration, and
   is used in per-packet processing.

3.2.  NPTv6 translation, internal network to external network

   When a datagram passes through the NPTv6 Translator from an internal
   to an external network, it is either discarded or its IPv6 Source
   Address is changed in two ways:

   o  If the internal subnet number has no mapping, such as being 0xFFFF
      or simply not mapped, discard the datagram. This SHOULD result in
      an ICMP Destination Unreachable.

   o  The internal prefix is overwritten with the external prefix, in
      effect subtracting the difference between the two checksums (the
      adjustment) from the pseudo-header's checksum, and

   o  A 16-bit word of the address has the adjustment added to it using
      one's complement arithmetic. If the result is 0xFFFF, it is
      overwritten as zero. The choice of word is as specified in
      Section 3.4 or Section 3.5 as appropriate.

3.3.  NPTv6 translation, external network to internal network

   When a datagram passes through the NPTv6 Translator from an external
   to an internal network, its IPv6 Destination Address is changed in
   two ways:

   o  The external prefix is overwritten with the internal prefix, in
      effect adding the difference between the two checksums (the
      adjustment) to the pseudo-header's checksum, and

   o  A 16-bit word of the address has the adjustment subtracted from it
      (bitwise inverted and added to it) using one's complement
      arithmetic. If the result is 0xFFFF, it is overwritten as zero.
      The choice of word is as specified in Section 3.4 or Section 3.5
      as appropriate.

3.4.  NPTv6 with a /48 or shorter prefix

   When an NPTv6 Translator is configured with internal and external
   prefixes that are 48 bits in length (a /48) or shorter, the
   adjustment MUST be added to or subtracted from bits 48..63 of the
   address.

   This mapping results in no modification of the Interface Identifier
   (IID), which is held in the lower half of the IPv6 address, so it
   will not interfere with future protocols that may use unique IIDs for
   node identification.

   NPTv6 Translator implementations MUST implement the /48 mapping.

3.5.  NPTv6 with a /49 or longer prefix

   When an NPTv6 Translator is configured with internal and external
   prefixes that are longer than 48 bits in length (such as a /52, /56,
   or /60), the adjustment must be added to or subtracted from one of
   the words in bits 64..79, 80..95, 96..111, or 112..127 of the
   address. While the choice of word is immaterial as long as it is
   consistent, these words MUST be inspected in that sequence, and the
   first that is not initially 0xFFFF is chosen.

   NPTv6 Translator implementations SHOULD implement the mapping for
   longer prefixes.

3.6.  /48 Prefix Mapping Example

   For the network shown in Figure 1, the Internal Prefix is
   FD01:0203:0405:/48, and the External Prefix is 2001:0DB8:0001:/48.

   If a node with internal address FD01:0203:0405:0001::1234 sends an
   outbound packet through the NPTv6 Translator, the resulting external
   address will be 2001:0DB8:0001:D550::1234. The resulting address is
   obtained by calculating the checksum of both the internal and
   external 48-bit prefixes, subtracting the checksum of the internal
   prefix from that of the external prefix using one's complement
   arithmetic to calculate the "adjustment", and adding the adjustment
   to the 16-bit subnet field (in this case 0x0001).

   To show the work:

   The one's complement checksum of FD01:0203:0405 is 0xFCF5. The one's
   complement checksum of 2001:0DB8:0001 is 0xD245. Using one's
   complement arithmetic, 0xD245 - 0xFCF5 = 0xD54F. The subnet in the
   original packet is 0x0001. Using one's complement arithmetic, 0x0001
   + 0xD54F = 0xD550. Since 0xD550 != 0xFFFF, it is not changed to
   0x0000.

   So, the value 0xD550 is written in the 16-bit subnet area, resulting
   in a mapped external address of 2001:0DB8:0001:D550::1234.

   When a response packet is received, it will contain the destination
   address 2001:0DB8:0001:D550::1234, which will be mapped using the
   inverse mapping algorithm, back to FD01:0203:0405:0001::1234.

   In this case, the difference between the two prefixes will be
   calculated as follows:

   Using one's complement arithmetic, 0xFCF5 - 0xD245 = 0x2AB0. The
   subnet in the original packet = 0xD550. Using one's complement
   arithmetic, 0xD550 + 0x2AB0 = 0x0001. Since 0x0001 != 0xFFFF, it is
   not changed to 0x0000.

   So the value 0x0001 is written into the subnet field, and the
   internal value of the subnet field is properly restored.

3.7.  Address Mapping for Longer Prefixes

   If the prefix being mapped is longer than 48 bits, the algorithm is
   slightly more complex. A common case will be that the internal and
   external prefixes are of different length. In such a case, the
   shorter prefix is zero-extended to the length of the longer as
   described in Section 3.1 for the purposes of overwriting the prefix.
   Then, they are both zero-extended to 64 bits to facilitate one's
   complement arithmetic. The "adjustment" is calculated using those
   64-bit prefixes.

   For example, if the internal prefix is a /48 ULA and the external
   prefix is a /56 provider-allocated prefix, the ULA becomes a /56 with
   zeros in bits 48..55. For purposes of one's complement arithmetic,
   they are then both zero-extended to 64 bits.
A side-effect of this 629 is that a subset of the subnets possible in the shorter prefix are 630 untranslatable. While the security value of this is debatable, the 631 administration may choose to use them for subnets that it knows need 632 no external accessibility. 634 We then find the first word in the IID that does not have the value 635 0xFFFF, trying bits 64..79, and then 80..95, 96..111, and finally 636 112..127. We perform the same calculation (with the same proof of 637 correctness) as in Section 3.6, but applying it to that word. 639 Although any 16-bit portion of an IPv6 IID could contain 0xFFFF, an 640 IID of all-ones is a reserved anycast identifier that should not be 641 used on the network [RFC2526]. If a NPTv6 Translator discovers a 642 packet with an IID of all-zeros while performing address mapping, 643 that packet MUST be dropped, and an ICMPv6 Parameter Problem error 644 SHOULD be generated [RFC4443]. 646 Note: this mechanism does involve modification of the IID; it may not 647 be compatible with future mechanisms that use unique IIDs for node 648 identification. 650 4. Implications of Network Address Translator Behavioral Requirements 652 4.1. Prefix configuration and generation 654 NPTv6 Translators MUST support manual configuration of internal and 655 external prefixes, and MUST NOT place any restrictions on those 656 prefixes except that they be valid IPv6 unicast prefixes as described 657 in [RFC4291]. They MAY also support random generation of ULA 658 addresses on command. Since the most common place anticipated for 659 the implementation of an NPTv6 Translator is a CPE router, the reader 660 is urged to consider the requirements of 661 [I-D.ietf-v6ops-ipv6-cpe-router]. 663 4.2. Subnet numbering 665 For reasons detailed in Appendix B, a network using NPTv6 Translation 666 and a /48 external prefix MUST NOT use the value 0xFFFF to designate 667 a subnet that it expects to be translated. 669 4.3. 
NAT Behavioral Requirements 671 NPTv6 Translators MUST support hairpinning behavior, as defined in 672 the NAT Behavioral Requirements for UDP document [RFC4787]. This 673 means that when a NPTv6 Translator receives a packet on the internal 674 interface that has a destination address that matches the site's 675 external prefix, it will translate the packet and forward it 676 internally. This allows internal nodes to reach other internal nodes 677 using their external, global addresses when necessary. 679 Conceptually, the datagram leaves the domain (is translated as 680 described in Section 3.2), and returns (is again translated as 681 described in Section 3.3). As a result, the datagram exchange will 682 be through the NPTv6 Translator in both directions for the lifetime 683 of the session. The alternative would be to require the NPTv6 684 Translator to drop the datagram, forcing the sender to use the 685 correct internal prefix for its peer. Performing only the external- 686 to-internal translation results in the datagram being sent from the 687 untranslated internal address of the source to the translated and 688 therefore internal address of its peer, which would enable the 689 session to bypass the NPTv6 Translator for future datagrams. It 690 would also mean that the original sender would be unlikely to 691 recognize the response when it arrived. 693 Because NPTv6 does not perform port mapping and uses a one-to-one, 694 reversible mapping algorithm, none of the other NAT behavioral 695 requirements apply to NPTv6. 697 5. Implications for Applications 699 NPTv6 Translation does not create several of the problems known to 700 exist with other kinds of NATs and discussed in [RFC2993]. 
In 701 particular: NPTv6 Translation is stateless, so a "reset" or brief 702 outage of an NPTv6 Translator does not break connections that 703 traverse the translation function, and if multiple NPTv6 Translators 704 exist between the same two networks, load can shift or be dynamically 705 load-shared among them. Also, an NPTv6 Translator does not 706 aggregate traffic for several hosts/interfaces behind a smaller number 707 of external addresses, so there is no inherent expectation for an 708 NPTv6 Translator to block new inbound flows from external hosts, and 709 no issue with a filter or blacklist associated with one prefix within 710 the domain affecting another. A firewall can of course be used in 711 conjunction with an NPTv6 Translator; this would allow the network 712 administrator more flexibility to specify security policy than would 713 be possible with a traditional NAT. 715 However, NPTv6 Translation does create difficulties for some kinds of 716 applications, for example: 718 o An application instance "behind" an NPTv6 Translator will see a 719 different address for its connections than its peers "outside" the 720 NPTv6 Translator. 722 o An application instance "outside" an NPTv6 Translator will see a 723 different address for its connections than any peers which are 724 "behind" an NPTv6 Translator. 726 o An application instance wishing to establish communication with a 727 peer "behind" an NPTv6 Translator may need to use a different 728 address to reach that peer depending on whether the instance is 729 behind the same NPTv6 Translator or external to it. If the NPTv6 730 Translator implements hairpinning (Section 4.3), it suffices for 731 applications to always use their external addresses. However, 732 this creates inefficiencies in the local network and may also 733 complicate implementation of the NPTv6 Translator. [RFC3484] also 734 would prefer the private address in such a case in order to reduce 735 those inefficiencies.
737 o An application instance which moves from a realm "behind" an NPTv6 738 Translator to a realm that is "outside" the network, or vice 739 versa, may find that it is no longer able to reach its peers at 740 the same addresses it was previously able to use. 742 o An application instance which is intermittently communicating with 743 a peer that moves from behind an NPTv6 Translator to "outside" of 744 it, or vice versa, may find that it is no longer able to reach 745 that peer at the same address that it had previously used. 747 Many, but not all, of the applications which are adversely affected 748 by NPTv6 Translation are those that do "referrals" - where an 749 application instance passes its own addresses, and/or addresses of 750 its peers, to other peers. (Some believe referrals are inherently 751 undesirable; others believe that they are necessary in some 752 circumstances. A discussion of the merits of referrals, or lack 753 thereof, is beyond the scope of this document.) 755 To some extent, the incidence of these difficulties can be reduced by 756 DNS hacks that attempt to expose addresses "behind" an NPTv6 757 Translator only to hosts which are also behind the same NPTv6 758 Translator; and perhaps also, to expose only the "internal" addresses 759 of hosts behind the NPTv6 Translator to other hosts behind the same 760 NPTv6 Translator. However, this cannot be a complete solution. 
A 761 full discussion of these issues is out of scope for this document, 762 but briefly: (a) reliance on DNS to solve this problem depends on 763 hosts always making queries from DNS servers in the same realm as 764 they are (or on DNS interception proxies, which create their own 765 problems), and on mobile hosts/applications not caching those 766 results; (b) reliance on DNS to solve this problem depends on network 767 administrators on all networks using such applications to reliably 768 and accurately maintain current DNS entries for every host using 769 those applications; and (c) reliance on DNS to solve this problem 770 depends on applications always using DNS names, even though they 771 often must run in environments where DNS names are not reliably 772 maintained for every host. Other issues are that there is often no 773 single distinguished name for a host, no reliable way for a host to 774 determine what DNS names are associated with it, and which names are 775 appropriate to use in which contexts. 777 5.1. Recommendation for network planners considering use of NPTv6 778 Translator 780 In light of the above, network planners considering the use of NPTv6 781 translation should carefully consider the kinds of applications that 782 they will need to run in the future, and determine whether the 783 address stability and provider independence benefits are consistent 784 with their application requirements. 786 5.2. Recommendations for application writers 788 Several mechanisms (e.g. STUN, TURN, ICE) have been used with 789 traditional IPv4 NAT to circumvent some of the limitations of such 790 devices. Similar mechanisms could also be applied to circumvent some 791 of the issues with NPTv6 Translator. However, all of these require 792 the assistance of an external server or a function co-located with 793 the translator that can tell an "internal" host what its "external" 794 addresses are. 796 5.3. 
Recommendation for future work 798 It might be desirable to define a general mechanism which would allow 799 hosts within a translation domain to determine their external 800 addresses and/or request that inbound traffic be permitted. If such 801 a mechanism were to be defined, it would ideally be general enough to 802 also accommodate other types of NAT likely to be encountered by IPv6 803 applications - in particular, IPv4/IPv6 Translation 804 [I-D.ietf-behave-v6v4-framework] [I-D.ietf-behave-dns64] 805 [I-D.ietf-behave-v6v4-xlate] [I-D.ietf-behave-v6v4-xlate-stateful] 806 [RFC6052]. For this and other reasons, such a mechanism is beyond 807 the scope of this document. 809 6. A Note on Port Mapping 811 In addition to overwriting IP addresses when packets are forwarded, 812 NAPT44 devices overwrite the source port number in outbound traffic, 813 and the destination port number in inbound traffic. This mechanism 814 is called "port mapping". 816 The major benefit of port mapping is that it allows multiple 817 computers to share a single IPv4 address. A large number of internal 818 IPv4 addresses (typically from one of the [RFC1918] private address 819 spaces) can be mapped into a single external, globally routable IPv4 820 address, with the local port number used to identify which internal 821 node should receive each inbound packet. This address amplification 822 feature is not generally foreseen as a necessity at this time. 824 Since port mapping requires re-writing a portion of the transport 825 layer header, it requires NAPT44 devices to be aware of all of the 826 transport protocols that they forward, thus stifling the development 827 of new and improved transport protocols and preventing the use of 828 IPsec encryption.
Modifying the transport layer header is 829 incompatible with security mechanisms that encrypt the full IP 830 payload, and restricts the NAPT44 to forwarding transport layers that 831 use weak checksum algorithms that are easily recalculated in routers. 833 Since there is significant detriment caused by modifying transport 834 layer headers and very little, if any, benefit to the use of port 835 mapping in IPv6, NPTv6 Translators that comply with this 836 specification MUST NOT perform port mapping. 838 7. Security Considerations 840 When NPTv6 is deployed using either of the two-way, algorithmic 841 mappings defined in this document, it allows direct inbound 842 connections to internal nodes. While this can be viewed as a benefit 843 of NPTv6 vs. NAPT44, it does open internal nodes to attacks that 844 would be more difficult in a NAPT44 network. Although this situation 845 is not substantially worse, from a security standpoint, than running 846 IPv6 with no NAT, some enterprises may assume that an NPTv6 Translator 847 will offer similar protection to a NAPT44 device. 849 The port mapping mechanism in NAPT44 implementations requires that 850 state be created in both directions. This has led to an industry-wide 851 perception that NAT functionality is the same as a stateful 852 firewall. It is not. The translation function of the NAT only 853 creates dynamic state in one direction and has no policy. For this 854 reason, it is RECOMMENDED that NPTv6 Translators also implement 855 firewall functionality such as described in [RFC6092], with 856 appropriate configuration options including turning it on or off. 858 When [RFC4864] talks about randomizing the subnet identifier, the 859 idea is to make it harder for worms to guess a valid subnet 860 identifier at an advertised network prefix. This should not be 861 interpreted as endorsing concealing the subnet identifier behind the 862 obfuscating function of a translator such as NPTv6.
[RFC4864] 863 specifically talks about how to obtain the desired properties of 864 concealment without using a translator. Topology hiding when using 865 NAT is often ineffective in environments where the topology is 866 visible in application layer messaging protocols such as DNS, SIP, 867 SMTP, etc. If the information were not available through the 868 application layer, [RFC2993] would not be valid. 870 Due to the potential interactions with IKEv2/IPsec NAT traversal, it 871 would be valuable to test interactions of NPTv6 with various aspects 872 of current-day IKEv2/IPsec NAT traversal. 874 8. IANA Considerations 876 This document has no IANA considerations. 878 9. Acknowledgements 880 The checksum-neutral algorithmic address mapping described in this 881 document is based on e-mail written by Iljtsch Van Beijnum. 883 The following people provided advice or review comments that 884 substantially improved this document: Allison Mankin, Christian 885 Huitema, Dave Thaler, Ed Jankiewicz, Eric Kline, Iljtsch Van Beijnum, 886 Jari Arkko, Keith Moore, Mark Townsley, Merike Kaeo, Ralph Droms, 887 Remi Depres, Steve Blake, and Tony Hain. 889 This document was written using the xml2rfc tool described in RFC 890 2629 [RFC2629]. 892 10. Change Log 894 This section should be removed by the RFC Editor. 896 10.1. Changes Between draft-mrw-behave-nat66-00 and -01 898 There were several minor changes made between the *behave-nat66-00 899 and -01 versions of this draft: 901 o Added Fred Baker as a co-author. 903 o Minor arithmetic corrections. 905 o Added AH to paragraph on NAT security issues. 907 o Added additional NAT topologies to overview (diagrams TBD). 909 10.2. Changes between *behave-nat66-01 and -02 911 There were further changes made between *behave-nat66-01 and -02: 913 o Removed topology hiding mechanism. 915 o Added diagrams. 917 o Made minor updates based on mailing list feedback. 919 o Added discussion of IPv6 SAF document. 921 o Added applicability section. 
923 o Added discussion of Address Independence requirement. 925 o Added hairpinning requirement and discussion of applicability of 926 other NAT behavioral requirements. 928 10.3. Changes between *nat66-00 and *nat66-01 930 There were further changes made between nat66-01 and nat66-02: 932 o Added mapping for prefixes longer than /48. 934 o Change draft name to remove reference to the behave WG. 936 o Resolved various open issues and fixed typos. 938 10.4. Changes between *nat66-01 and *nat66-02 940 o Change the acronym "NAT66" to "NPTv6", so people don't read "NAT" 941 and MEGO. 943 o Change the term used to refer to the function from "NAT66 device" 944 to "NPTv6 Translator". It's not a "device" function, it's a 945 function that is applied between two interfaces. Consider a 946 router with two upstreams and two legs in the local network; it 947 will not translate between the local legs, but will translate to 948 and from each upstream, and be configured differently for each of 949 the two ISPs. 951 o Comment specifically on the security aspects. 953 o Comment specifically on the application issues raised on this 954 list. 956 o Comment specifically on multihoming, load-sharing, and asymmetric 957 routing. 959 o Spell out the hairpinning requirement and its implications. 961 o Spell out the service provider side of Address Independence. 963 o 00 focuses on the edge's view 965 o Detail the algorithm in a manner clearer to the implementor (I 966 think) 968 o Spell out the case for GSE-style DMZs between the edge and the 969 transit network, which is about the implications for the global 970 routing table. 972 o Refer to [RFC6092] as a CPE firewall description. 974 10.5. Changes between *nat66-02 and *nat66-03 976 o Added an appendix on Verification code 978 o Various minor markups in response to Ralph Droms 980 10.6. Changes between *nat66-03 and *nat66-04 982 o Markups in response to Christian Huitema, mostly surrounding the 983 issue of subnet 0xFFFF. 
985 o Refer to [I-D.ietf-v6ops-ipv6-cpe-router] for CPE router 986 requirements. 988 10.7. Changes between *nat66-04 and *nat66-05 990 o Update statistics in appendix A per BGP report of 17 December 2010 992 o Update security considerations using text supplied by Merike Kaeo. 994 10.8. Changes between *nat66-05 and *nat66-06 996 o restore a code snippet inadvertently removed in version -05 998 10.9. Changes between *nat66-06 and *nat66-07 1000 o Changed requested status to experimental 1002 o Incorporated comments from Eric Kline 1004 10.10. Changes between *nat66-07 and *nat66-08 1006 The section on Application Considerations was expanded after 1007 discussion with Keith Moore. 1009 10.11. Changes up to *nat66-10 1011 Address review comments during IETF Last Call and the Transport 1012 Directorate Review. 1014 11. References 1016 11.1. Normative References 1018 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1019 Requirement Levels", BCP 14, RFC 2119, March 1997. 1021 [RFC2526] Johnson, D. and S. Deering, "Reserved IPv6 Subnet Anycast 1022 Addresses", RFC 2526, March 1999. 1024 [RFC4193] Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast 1025 Addresses", RFC 4193, October 2005. 1027 [RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing 1028 Architecture", RFC 4291, February 2006. 1030 [RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control 1031 Message Protocol (ICMPv6) for the Internet Protocol 1032 Version 6 (IPv6) Specification", RFC 4443, March 2006. 1034 [RFC4787] Audet, F. and C. Jennings, "Network Address Translation 1035 (NAT) Behavioral Requirements for Unicast UDP", BCP 127, 1036 RFC 4787, January 2007. 1038 11.2. Informative References 1040 [GSE] O'Dell, M., "GSE - An Alternate Addressing Architecture 1041 for IPv6", February 1997, 1042 . 1044 [I-D.ietf-behave-dns64] 1045 Bagnulo, M., Sullivan, A., Matthews, P., and I. 
Beijnum, 1046 "DNS64: DNS extensions for Network Address Translation 1047 from IPv6 Clients to IPv4 Servers", 1048 draft-ietf-behave-dns64-11 (work in progress), 1049 October 2010. 1051 [I-D.ietf-behave-v6v4-framework] 1052 Baker, F., Li, X., Bao, C., and K. Yin, "Framework for 1053 IPv4/IPv6 Translation", 1054 draft-ietf-behave-v6v4-framework-10 (work in progress), 1055 August 2010. 1057 [I-D.ietf-behave-v6v4-xlate] 1058 Li, X., Bao, C., and F. Baker, "IP/ICMP Translation 1059 Algorithm", draft-ietf-behave-v6v4-xlate-23 (work in 1060 progress), September 2010. 1062 [I-D.ietf-behave-v6v4-xlate-stateful] 1063 Bagnulo, M., Matthews, P., and I. Beijnum, "Stateful 1064 NAT64: Network Address and Protocol Translation from IPv6 1065 Clients to IPv4 Servers", 1066 draft-ietf-behave-v6v4-xlate-stateful-12 (work in 1067 progress), July 2010. 1069 [I-D.ietf-v6ops-ipv6-cpe-router] 1070 Singh, H., Beebee, W., Donley, C., Stark, B., and O. 1071 Troan, "Basic Requirements for IPv6 Customer Edge 1072 Routers", draft-ietf-v6ops-ipv6-cpe-router-09 (work in 1073 progress), December 2010. 1075 [RFC1071] Braden, R., Borman, D., Partridge, C., and W. Plummer, 1076 "Computing the Internet checksum", RFC 1071, 1077 September 1988. 1079 [RFC1624] Rijsinghani, A., "Computation of the Internet Checksum via 1080 Incremental Update", RFC 1624, May 1994. 1082 [RFC1918] Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and 1083 E. Lear, "Address Allocation for Private Internets", 1084 BCP 5, RFC 1918, February 1996. 1086 [RFC2629] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629, 1087 June 1999. 1089 [RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: 1090 Defeating Denial of Service Attacks which employ IP Source 1091 Address Spoofing", BCP 38, RFC 2827, May 2000. 1093 [RFC2993] Hain, T., "Architectural Implications of NAT", RFC 2993, 1094 November 2000. 
1096 [RFC3484] Draves, R., "Default Address Selection for Internet 1097 Protocol version 6 (IPv6)", RFC 3484, February 2003. 1099 [RFC4864] Van de Velde, G., Hain, T., Droms, R., Carpenter, B., and 1100 E. Klein, "Local Network Protection for IPv6", RFC 4864, 1101 May 2007. 1103 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 1104 Authentication Option", RFC 5925, June 2010. 1106 [RFC5996] Kaufman, C., Hoffman, P., Nir, Y., and P. Eronen, 1107 "Internet Key Exchange Protocol Version 2 (IKEv2)", 1108 RFC 5996, September 2010. 1110 [RFC6052] Bao, C., Huitema, C., Bagnulo, M., Boucadair, M., and X. 1111 Li, "IPv6 Addressing of IPv4/IPv6 Translators", RFC 6052, 1112 October 2010. 1114 [RFC6092] Woodyatt, J., "Recommended Simple Security Capabilities in 1115 Customer Premises Equipment (CPE) for Providing 1116 Residential IPv6 Internet Service", RFC 6092, 1117 January 2011. 1119 Appendix A. Why GSE? 1121 For the purpose of this discussion, let us over-simplify the 1122 Internet's structure by distinguishing between two broad classes of 1123 networks: transit and edge. A "transit network", in this context, is 1124 a network that provides connectivity services to other networks. Its 1125 AS number may show up in a non-final position in BGP AS paths, or in 1126 the case of mobile and residential broadband networks, it may offer 1127 network services to smaller networks that can't justify RIR 1128 membership. An "edge network", in contrast, is any network that is 1129 not a transit network; it is the ultimate customer, and while it 1130 provides internal connectivity for its own use, it is in other 1131 respects a consumer of transit services. In terms of routing, a 1132 network in the transit domain generally needs some way to make 1133 choices about how it routes to other networks; an edge network is 1134 generally quite satisfied with a simple default route.
1136 The [GSE] proposal, and as a result this proposal (which is similar 1137 to GSE in most respects and inspired by it), responds directly to 1138 current concerns in the RIR communities. Edge networks are used to 1139 an environment in IPv4 in which their addressing is disjoint from 1140 that of their upstream transit networks; it is either provider 1141 independent, or a network prefix translator makes their external 1142 address distinct from their internal address, and they like the 1143 distinction. In IPv6, there is a mantra that edge network addresses 1144 should be derived from their upstream, and if they have multiple 1145 upstreams, edge networks are expected to design their networks to use 1146 all of those prefixes equivalently. They see this as unnecessary and 1147 unwanted operational complexity, and are as a result pushing very 1148 hard in the RIR communities for provider independent addressing. 1150 Widespread use of provider independent addressing has a natural and 1151 perhaps unavoidable side-effect that is likely to be very expensive 1152 in the long term. It means that the routing table will enumerate the 1153 networks at the edge of the transit domain, the edge networks, rather 1154 than enumerating the transit domain. Per the BGP Update Report of 17 1155 December 2010, there are currently over 36,000 Autonomous Systems 1156 being advertised in BGP, of which over 15,000 advertise only one 1157 prefix. There are in the neighborhood of 5000 AS's that show up in a 1158 non-final position in AS paths, and perhaps another 5000 networks 1159 whose AS numbers are terminal in more than one AS path.
1164 In detail, there are 36,264 Autonomous Systems being 1165 advertised in BGP, of which 31,137 provide no visible transit service 1166 to another AS, and 23,595 of those are visible in only one AS path 1167 (have only one upstream network). In addition, of the 36,264 AS's in 1168 the world, 15,439 advertise only a single prefix. In other words, we 1169 have prefixes for some 36,000 transit and edge networks in the route 1170 table now, many of which arguably need an Autonomous System number 1171 only for multihoming. However, the vast majority of networks (2/3) 1172 having the tools necessary to multihome are not visibly doing so, and 1173 would be well served by any solution that gives them address 1174 independence without the overhead of RIR membership and BGP routing. 1176 Current growth estimates suggest that we could easily see that number be on 1177 the order of 10,000,000 within fifteen years. Tens of thousands of 1178 entries in the route table are very survivable; while our protocols 1179 and computers will likely do quite well with tens of millions of 1180 routes, the heat produced and power consumed by those routers, and 1181 the inevitable impact on the cost of those routers, is not a good 1182 outcome. To avoid having a massive and unscalable route table, we 1183 need to find a way that is politically acceptable and returns us to 1184 enumerating the transit domain, not the edge. 1186 There have been a number of proposals. As described, shim6 moves the 1187 complexity to the edge, and the edge is rebelling. Geographic 1188 addressing in essence forces ISPs to "own" geographic territory from 1189 a routing perspective, as otherwise there is no clue in the address 1190 as to what network a datagram should be delivered to in order to 1191 reach it.
Metropolitan Addressing can imply regulatory authority, 1192 and even if it is implemented using internet exchange consortia, 1193 visits a great deal of complexity on the transit networks that 1194 directly serve the edge. The one that is likely to be most 1195 acceptable is any proposal that enables an edge network to be 1196 operationally independent of its upstreams, with no obligation to 1197 renumber when it adds, drops, or changes ISPs, and with no additional 1198 burden placed either on the ISP or the edge network as a result. 1199 From an application perspective, an additional operational 1200 requirement in the words of US NIST's Roadmap for the Smart Grid, is 1201 that 1203 "...the Network should enable an application in a particular 1204 domain to communicate with an application in any other domain in 1205 the information network, with proper management control over who 1206 and where applications can be interconnected." 1208 In other words, the structure of the network should allow for and 1209 enable appropriate access control, but the structure of the network 1210 should not inherently limit access. 1212 The GSE model, by statelessly translating the prefix between an edge 1213 network and its upstream transit network, accomplishes that with a 1214 minimum of fuss and bother. Stated in the simplest terms, it enables 1215 the edge network to behave as if it has a provider-independent prefix 1216 from a multihoming and renumbering perspective without the overhead 1217 of RIR membership or maintaining BGP connectivity, and it enables the 1218 transit networks to aggressively aggregate what are from their 1219 perspective provider-allocated customer prefixes, to maintain a 1220 rational-sized routing table. 1222 Appendix B. Verification code 1224 This non-normative appendix is presented as a proof of concept. 
It 1225 is in no sense optimized; for example, one's complement arithmetic is 1226 implemented in portable subroutines, where operational 1227 implementations might use one's complement arithmetic instructions 1228 through a pragma; such implementations probably need to explicitly 1229 force 0xFFFF to 0x0000, as the instruction will not. The original 1230 purpose of the code was to verify whether or not it was necessary to 1231 suppress 0xFFFF by overwriting with zero, and whether predicted 1232 issues with subnet numbering were real. 1234 The point is to 1236 o demonstrate that if one or the other representation of zero is not 1237 used in the word the checksum is updated in, the program maps 1238 inner and outer addresses in a manner that is, mathematically, 1:1 1239 and onto (each inner address maps to a unique outer address, and 1240 that outer address maps back to exactly the same inner address), 1241 and 1243 o give guidance on the suppression of 0xFFFF checksums. 1245 In short, in one's complement arithmetic, x-x=0, but will take the 1246 negative representation of zero. If 0xFFFF results are forced to the 1247 value 0x0000, as is recommended in [RFC1071], the word the checksum 1248 is adjusted in cannot be initially 0xFFFF, as on the return it will 1249 be forced to 0. If 0xFFFF results are not forced to the value 0x0000 1250 as is recommended in [RFC1071], the word the checksum is adjusted in 1251 cannot be initially 0, as on the return it will be calculated as 1252 0+(~0) = 0xFFFF. We chose to follow [RFC1071]'s recommendations, 1253 which implies a requirement to not use 0xFFFF as a subnet number in 1254 networks with a /48 external prefix. 1256 /* 1257 * Copyright (c) 2010 IETF Trust and the persons identified as 1258 * authors of the code. All rights reserved. 
Redistribution 1259 * and use in source and binary forms, with or without 1260 * modification, are permitted provided that the following 1261 * conditions are met: 1262 * 1263 * o Redistributions of source code must retain the above 1264 * copyright notice, this list of conditions and the 1265 * following disclaimer. 1266 * 1267 * o Redistributions in binary form must reproduce the above 1268 * copyright notice, this list of conditions and the 1269 * following disclaimer in the documentation and/or other 1270 * materials provided with the distribution. 1271 * 1272 * o Neither the name of Internet Society, IETF or IETF Trust, 1273 * nor the names of specific contributors, may be used to 1274 * endorse or promote products derived from this software 1275 * without specific prior written permission. 1276 * 1277 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND 1278 * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, 1279 * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 1280 * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 1281 * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR 1282 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 1283 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT 1284 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 1285 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 1286 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 1287 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR 1288 * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 1289 * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 1290 */ 1291 #include <stdio.h> 1292 #include <assert.h> #include <stdlib.h> /* atoi() */ 1293 /* 1294 * program to verify the NPTv6 algorithm 1295 * 1296 * argument: 1297 * perform negative zero suppression: boolean 1298 * 1299 * method: 1300 * We specify an internal and an external prefix.
The prefix 1301 * length is presumed to be the common length of both, and for 1302 * this is a /48. We perform the three algorithms specified. 1303 * the "packet" address is in effect the source address 1304 * internal->external and the destination address 1305 * external->internal. 1306 */ 1307 unsigned short inner_init[] = { 1308 0xFD01, 0x0203, 0x0405, 1, 2, 3, 4, 5}; 1309 unsigned short outer_init[] = { 1310 0x2001, 0x0db8, 0x0001, 1, 2, 3, 4, 5}; 1311 unsigned short inner[8]; 1312 unsigned short packet[8]; 1313 unsigned char checksum[65536] = {0}; 1314 unsigned short outer[8]; 1315 unsigned short adjustment; 1316 unsigned short suppress; 1317 /* 1318 * One's complement sum. 1319 * return number1 + number2 1320 */ 1321 unsigned short 1322 add1(number1, number2) 1323 unsigned short number1; 1324 unsigned short number2; 1325 { 1326 unsigned int result; 1328 result = number1; 1329 result += number2; 1330 if (suppress) { 1331 while (0xFFFF <= result) { 1332 result = result + 1 - 0x10000; 1333 } 1334 } else { 1335 while (0xFFFF < result) { 1336 result = result + 1 - 0x10000; 1338 } 1339 } 1340 return result; 1341 } 1343 /* 1344 * One's complement difference 1345 * return number1 - number2 1346 */ 1347 unsigned short 1348 sub1(number1, number2) 1349 unsigned short number1; 1350 unsigned short number2; 1351 { 1352 return add1(number1, ~number2); 1353 } 1355 /* 1356 * return one's complement sum of an array of numbers 1357 */ 1358 unsigned short 1359 sum1(numbers, count) 1360 unsigned short *numbers; 1361 int count; 1362 { 1363 unsigned int result; 1365 result = *numbers++; 1366 while (--count > 0) { 1367 result += *numbers++; 1368 } 1370 if (suppress) { 1371 while (0xFFFF <= result) { 1372 result = result + 1 - 0x10000; 1373 } 1374 } else { 1375 while (0xFFFF < result) { 1376 result = result + 1 - 0x10000; 1377 } 1378 } 1379 return result; 1380 } 1382 /* 1383 * NPTv6 initialization: section 3.1 assuming section 3.4 1384 * 1385 * create the /48, a source address in 
    * internal format, and a source address in external format.
    * Calculate the adjustment if one /48 is overwritten with the
    * other.
    */
   void
   nptv6_initialization(unsigned short subnet)
   {
       int i;
       unsigned short inner48;
       unsigned short outer48;

       /* initialize the internal and external prefixes. */
       for (i = 0; i < 8; i++) {
           inner[i] = inner_init[i];
           outer[i] = outer_init[i];
       }
       inner[3] = subnet;
       outer[3] = subnet;
       /* calculate the checksum adjustment */
       inner48 = sum1(inner, 3);
       outer48 = sum1(outer, 3);
       adjustment = sub1(inner48, outer48);
   }

   /*
    * NPTv6 packet from edge to transit: section 3.2 assuming
    * section 3.4
    *
    * overwrite the prefix in the source address with the outer
    * prefix, and adjust the checksum
    */
   void
   nptv6_inner_to_outer(void)
   {
       int i;

       /* let's get the source address into the packet */
       for (i = 0; i < 8; i++) {
           packet[i] = inner[i];
       }

       /* overwrite the prefix with the outer prefix */
       for (i = 0; i < 3; i++) {
           packet[i] = outer[i];
       }

       /* adjust the checksum */
       packet[3] = add1(packet[3], adjustment);
   }

   /*
    * NPTv6 packet from transit to edge: section 3.3 assuming
    * section 3.4
    *
    * overwrite the prefix in the destination address with the
    * inner prefix, and adjust the checksum
    */
   void
   nptv6_outer_to_inner(void)
   {
       int i;

       /* overwrite the prefix with the inner prefix */
       for (i = 0; i < 3; i++) {
           packet[i] = inner[i];
       }

       /* adjust the checksum */
       packet[3] = sub1(packet[3], adjustment);
   }

   /*
    * main program
    */
   int
   main(int argc, char **argv)
   {
       unsigned subnet;
       int i;

       if (argc < 2) {
           fprintf(stderr, "usage: nptv6 suppression\n");
           return 1;
       }
       suppress = (unsigned short)atoi(argv[1]);
       assert(suppress <= 1);

       for (subnet = 0; subnet < 0x10000; subnet++) {
           /* section 3.1: initialize the system */
           nptv6_initialization((unsigned short)subnet);

           /* section 3.2: take a packet from inside to outside */
           nptv6_inner_to_outer();

           /*
            * the resulting checksum value should be unique
            *
            * note: checksum[] is indexed by subnet, which this loop
            * visits exactly once, so this test cannot fire as
            * written.
            */
           if (checksum[subnet]) {
               printf("inner->outer duplicated checksum: "
                      "inner: %x:%x:%x:%x:%x:%x:%x:%x(%x) "
                      "calculated: %x:%x:%x:%x:%x:%x:%x:%x(%x)\n",
                      inner[0], inner[1], inner[2], inner[3],
                      inner[4], inner[5], inner[6], inner[7],
                      sum1(inner, 8),
                      packet[0], packet[1], packet[2], packet[3],
                      packet[4], packet[5], packet[6], packet[7],
                      sum1(packet, 8));
           }

           checksum[subnet] = 1;

           /*
            * the resulting checksum should be the same as the inner
            * address's checksum
            */
           if (sum1(packet, 8) != sum1(inner, 8)) {
               printf("inner->outer incorrect: "
                      "inner: %x:%x:%x:%x:%x:%x:%x:%x(%x) "
                      "calculated: %x:%x:%x:%x:%x:%x:%x:%x(%x)\n",
                      inner[0], inner[1], inner[2], inner[3],
                      inner[4], inner[5], inner[6], inner[7],
                      sum1(inner, 8),
                      packet[0], packet[1], packet[2], packet[3],
                      packet[4], packet[5], packet[6], packet[7],
                      sum1(packet, 8));
           }

           /* section 3.3: take a packet from outside to inside */
           nptv6_outer_to_inner();

           /*
            * the returning packet should have the same checksum it
            * left with
            */
           if (sum1(packet, 8) != sum1(inner, 8)) {
               printf("outer->inner checksum incorrect: "
                      "calculated: %x:%x:%x:%x:%x:%x:%x:%x(%x) "
                      "inner: %x:%x:%x:%x:%x:%x:%x:%x(%x)\n",
                      packet[0], packet[1], packet[2], packet[3],
                      packet[4], packet[5], packet[6], packet[7],
                      sum1(packet, 8), inner[0], inner[1], inner[2],
                      inner[3], inner[4], inner[5], inner[6],
                      inner[7], sum1(inner, 8));
           }

           /*
            * and every 16-bit word should calculate back to the
            * same inner value
            */
           for (i = 0; i < 8; i++) {
               if (inner[i] != packet[i]) {
                   printf("outer->inner different: "
                          "calculated: %x:%x:%x:%x:%x:%x:%x:%x "
                          "inner: %x:%x:%x:%x:%x:%x:%x:%x\n",
                          packet[0], packet[1], packet[2], packet[3],
                          packet[4], packet[5], packet[6], packet[7],
                          inner[0], inner[1], inner[2], inner[3],
                          inner[4], inner[5], inner[6], inner[7]);
                   break;
               }
           }
       }
       return 0;
   }

Authors' Addresses

   Margaret Wasserman
   Painless Security
   North Andover, MA  01845
   USA

   Phone: +1 781 405 7464
   Email: mrw@painless-security.com
   URI:   http://www.painless-secuirty.com


   Fred Baker
   Cisco Systems
   Santa Barbara, California  93117
   USA

   Phone: +1-408-526-4257
   Email: fred@cisco.com