idnits 2.17.1

draft-ietf-softwire-dual-stack-lite-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Sep 2009 rather than the newer Notice from 28 Dec 2009. (See
     https://trustee.ietf.org/license-info/)

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 35 instances of lines with non-RFC6890-compliant IPv4 addresses
     in the document. If these are example addresses, they should be changed.

  == There are 8 instances of lines with private range IPv4 addresses in the
     document. If these are generic example addresses, they should be changed
     to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
     198.51.100.x or 203.0.113.x.

  == There are 6 instances of lines with non-RFC3849-compliant IPv6 addresses
     in the document. If these are example addresses, they should be changed.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'SHOULD not' in this paragraph:

     Dual-stack lite implementations SHOULD not interfere with the
     functioning of IPv4 or IPv6 VPNs.

  -- The document date (March 8, 2010) is 5155 days in the past. Is this
     intentional?
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'UPnP-IGD' is defined on line 1491, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-10) exists of
     draft-ietf-softwire-ds-lite-tunnel-option-02

  == Outdated reference: A later version (-07) exists of
     draft-cheshire-nat-pmp-03

  == Outdated reference: A later version (-05) exists of
     draft-nishitani-cgn-03

  == Outdated reference: A later version (-10) exists of
     draft-ymbk-aplusp-05

     Summary: 1 error (**), 0 flaws (~~), 10 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2 Internet Engineering Task Force                          A. Durand, Ed.
3 Internet-Draft                                                  Comcast
4 Intended status: Standards Track                          March 8, 2010
5 Expires: September 9, 2010

7 Dual-Stack Lite Broadband Deployments Following IPv4 Exhaustion
8 draft-ietf-softwire-dual-stack-lite-04

10 Abstract

12 This document revisits the dual-stack model and introduces the dual-
13 stack lite technology aimed at better aligning the costs and benefits
14 of deploying IPv6. Dual-stack lite enables a broadband service
15 provider to share IPv4 addresses among customers by combining two
16 well-known technologies: IP in IP (IPv4-in-IPv6) and Network Address
17 Translation (NAT).

19 Status of this Memo

21 This Internet-Draft is submitted to IETF in full conformance with the
22 provisions of BCP 78 and BCP 79.

24 Internet-Drafts are working documents of the Internet Engineering
25 Task Force (IETF), its areas, and its working groups. Note that
26 other groups may also distribute working documents as Internet-
27 Drafts.
29 Internet-Drafts are draft documents valid for a maximum of six months
30 and may be updated, replaced, or obsoleted by other documents at any
31 time. It is inappropriate to use Internet-Drafts as reference
32 material or to cite them other than as "work in progress."

34 The list of current Internet-Drafts can be accessed at
35 http://www.ietf.org/ietf/1id-abstracts.txt.

37 The list of Internet-Draft Shadow Directories can be accessed at
38 http://www.ietf.org/shadow.html.

40 This Internet-Draft will expire on September 9, 2010.

42 Copyright Notice

44 Copyright (c) 2010 IETF Trust and the persons identified as the
45 document authors. All rights reserved.

47 This document is subject to BCP 78 and the IETF Trust's Legal
48 Provisions Relating to IETF Documents
49 (http://trustee.ietf.org/license-info) in effect on the date of
50 publication of this document. Please review these documents
51 carefully, as they describe your rights and restrictions with respect
52 to this document. Code Components extracted from this document must
53 include Simplified BSD License text as described in Section 4.e of
54 the Trust Legal Provisions and are provided without warranty as
55 described in the BSD License.

57 Table of Contents

59 1. Introduction
60 2. Requirements language
61 3. Terminology
62 4. Deployment scenarios
63   4.1. Access model
64   4.2. Home gateway
65   4.3. Directly connected device
66 5. B4 element
67   5.1. Definition
68   5.2. Encapsulation
69   5.3. Fragmentation and Reassembly
70   5.4. AFTR discovery
71   5.5. DNS
72   5.6. Interface initialization
73   5.7. Well-known IPv4 address
74 6. AFTR element
75   6.1. Definition
76   6.2. Encapsulation
77   6.3. Fragmentation and Reassembly
78   6.4. DNS
79   6.5. Well-known IPv4 address
80   6.6. Extended binding table
81 7. Network Considerations
82   7.1. Tunneling
83   7.2. VPN
84   7.3. Multicast considerations
85 8. NAT considerations
86   8.1. NAT pool
87   8.2. NAT conformance
88   8.3. Application Level Gateways (ALG)
89   8.4. Port allocation
90     8.4.1. How many ports per customers?
91     8.4.2. Dynamic port assignment considerations
92     8.4.3. Subscriber controlled port assignment
93   8.5. Other considerations about sharing global IPv4
94        addresses
95 9. Acknowledgements
96 10. IANA Considerations
97 11. Security Considerations
98 12. Author's Addresses
99 13. Appendix A: future DS-Lite extensions
100   13.1. Static port reservation
101     13.1.1. Port forwarding model
102     13.1.2. A+P model
103   13.2. Dynamic port reservation
104     13.2.1. UPnP
105     13.2.2. NAT-PMP
106     13.2.3. DHCPv6
107 14. Appendix B: Examples
108   14.1. Gateway based architecture
109     14.1.1. Example message flow
110     14.1.2. Translation details
111   14.2. Host based architecture
112     14.2.1. Example message flow
113     14.2.2. Translation details
114 15. Appendix C: Deployment considerations
115   15.1. AFTR service distribution and horizontal scaling
116   15.2. Horizontal scaling
117   15.3. High availability
118   15.4. Logging
119 16. References
120   16.1. Normative references
121   16.2. Informative references
122 Author's Address

124 1. Introduction

126 The common thinking for more than 10 years has been that the
127 transition to IPv6 would be based on the dual-stack model and that
128 most things would be converted this way before we ran out of IPv4.

130 However, this has not happened.
The IANA free pool of IPv4 addresses
131 will be depleted soon, well before any significant IPv6 deployment
132 will have occurred.

134 This document revisits the dual-stack model and introduces the dual-
135 stack lite technology aimed at better aligning the costs and benefits
136 of deploying IPv6. Dual-stack lite will provide the necessary bridge
137 between the two protocols, offering an evolution path for the Internet
138 after IANA IPv4 depletion.

140 Dual-stack lite enables a broadband service provider to share IPv4
141 addresses among customers by combining two well-known technologies:
142 IP in IP (IPv4-in-IPv6) and NAT.

144 This document makes a distinction between a dual-stack capable and a
145 dual-stack provisioned device. The former is a device that has code
146 that implements both IPv4 and IPv6, from the network layer to the
147 applications. The latter is a similar device that has been
148 provisioned with both an IPv4 and an IPv6 address on its
149 interface(s). This document will also further refine this notion by
150 distinguishing interfaces provisioned directly by the service
151 provider from those provisioned by the customer.

153 Pure IPv6-only devices (i.e. devices that do not include an IPv4
154 stack) are outside of the scope of this document.

156 2. Requirements language

158 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
159 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
160 document are to be interpreted as described in RFC 2119 [RFC2119].

162 3. Terminology

164 The technology described in this document is known as dual-stack
165 lite. The abbreviation DS-Lite will be used throughout this text.

167 This document also introduces two new terms: the DS-Lite Basic
168 Bridging BroadBand element (B4) and the DS-Lite Address Family
169 Transition Router element (AFTR).

171 4. Deployment scenarios

173 4.1.
Access model

175 Instead of relying on a cascade of NATs, the dual-stack lite model is
176 built on IPv4-in-IPv6 tunnels to cross the network to reach a
177 carrier-grade IPv4-IPv4 NAT (the AFTR) where customers will share
178 IPv4 addresses. There are a number of benefits to this approach:

180 o This technology decouples the deployment of IPv6 in the service
181   provider network (up to the customer premises equipment or CPE)
182   from the deployment of IPv6 in the global Internet and in customer
183   applications & devices.

185 o The management of the service provider access networks is
186   simplified by leveraging the large IPv6 address space.
187   Overlapping private IPv4 address spaces are not required to
188   support very large customer bases.

190 o As tunnels can terminate anywhere in the service provider network,
191   this architecture lends itself to horizontal scaling and provides
192   great flexibility to adapt to changing traffic load.

194 o Tunnels provide a direct connection between B4 and the AFTR. This
195   can be leveraged to enable customers and their applications to
196   control how the NAT function of the AFTR is performed.

198 A key characteristic of this approach is that communications between
199 end-nodes stay within their address family. IPv6 sources only
200 communicate with IPv6 destinations; IPv4 sources only communicate
201 with IPv4 destinations. There is no protocol family translation
202 involved in this approach. This greatly simplifies the task of
203 applications that may carry literal IP addresses in their payload.

205 4.2. Home gateway

207 This section describes home local area networks characterized by the
208 presence of a home gateway provisioned only with IPv6 by the service
209 provider.

211 A DS-Lite home gateway is an IPv6 aware home gateway with a B4
212 interface implemented in the WAN interface.
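
The IPv4-in-IPv6 tunnelling performed by a B4 interface can be illustrated with a minimal Python sketch. It is not part of the specification: the function names, the default hop limit of 64 and the 2001:db8::/32 documentation addresses are assumptions made for illustration, and the sketch ignores fragmentation, traffic class copying and IPv6 extension headers. It only shows that the original IPv4 packet is carried unchanged behind a 40-byte IPv6 header whose Next Header field is 4 (IPv4).

```python
import ipaddress
import struct

IPPROTO_IPV4 = 4       # IPv6 Next Header value: the payload is an IPv4 packet
IPV6_HEADER_LEN = 40   # fixed IPv6 header size, the tunnel overhead in bytes


def encapsulate(ipv4_packet: bytes, b4_addr: str, aftr_addr: str,
                hop_limit: int = 64) -> bytes:
    """Prepend an IPv6 header to an IPv4 packet (hypothetical helper)."""
    version_tc_flow = 6 << 28  # version 6, traffic class 0, flow label 0
    header = struct.pack(
        "!IHBB16s16s",
        version_tc_flow,
        len(ipv4_packet),      # payload length excludes the IPv6 header
        IPPROTO_IPV4,
        hop_limit,
        ipaddress.IPv6Address(b4_addr).packed,    # tunnel source: the B4
        ipaddress.IPv6Address(aftr_addr).packed,  # tunnel destination: the AFTR
    )
    return header + ipv4_packet


def decapsulate(ipv6_packet: bytes) -> bytes:
    """Strip the IPv6 header and return the inner IPv4 packet unchanged."""
    if ipv6_packet[0] >> 4 != 6 or ipv6_packet[6] != IPPROTO_IPV4:
        raise ValueError("not an IPv4-in-IPv6 packet")
    payload_length = struct.unpack("!H", ipv6_packet[4:6])[0]
    return ipv6_packet[IPV6_HEADER_LEN:IPV6_HEADER_LEN + payload_length]
```

Because the inner packet is never rewritten by the tunnel, no address family translation takes place between the B4 and the AFTR; only the AFTR's IPv4-IPv4 NAT modifies the IPv4 header.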
214 A DS-Lite home gateway SHOULD NOT operate a NAT function on a B4
215 interface, as the NAT function will be performed by the AFTR in the
216 service provider's network. That will avoid accidentally operating
217 in a double NAT environment.

219 However, it SHOULD operate its own DHCP(v4) server handing out
220 [RFC1918] address space (e.g. 192.168.0.0/16) to hosts in the home.
221 It SHOULD advertise itself as the default IPv4 router to those home
222 hosts. It SHOULD also advertise itself as a DNS server in the DHCP
223 Option 6 (DNS Server). Additionally, it SHOULD operate a DNS proxy
224 to accept DNS IPv4 requests from home hosts and send them using IPv6
225 to the service provider DNS servers, as described in Section 5.5.

227 Note: if an IPv4 home host decides to use another IPv4 DNS server,
228 the DS-Lite home gateway will forward those DNS requests via the B4
229 interface, the same way it forwards any regular IPv4 packets.

231 IPv6 capable devices directly reach the IPv6 Internet. Packets
232 simply follow IPv6 routing: they do not go through the tunnel and
233 are not subject to any translation. It is expected that most IPv6
234 capable devices will also be IPv4 capable and will simply be
235 configured with an IPv4 RFC1918 style address within the home network
236 and access the IPv4 Internet the same way as the legacy IPv4-only
237 devices within the home.

239 Pure IPv6-only devices (i.e. devices that do not include an IPv4
240 stack) are outside of the scope of this document.

242 4.3. Directly connected device

244 In broadband home networks, devices are sometimes connected directly
245 to the broadband service provider. They are connected straight to a
246 modem, without a home gateway.

248 Under this scenario, the customer device is a dual-stack capable host
249 that is provisioned by the service provider only with IPv6.
The
250 device itself acts as a B4 element and the IPv4 service is provided
251 by an IPv4-in-IPv6 tunnel, just as in the home gateway case. That
252 device can run any combination of IPv4 and/or IPv6 applications.

254 A directly connected DS-Lite device SHOULD send its DNS requests over
255 IPv6 to the IPv6 DNS server it has been configured to use.

257 As in the previous section, IPv6 packets follow IPv6 routing:
258 they do not go through the tunnel and are not subject to any
259 translation.

261 The support of IPv4-only devices and IPv6-only devices in this
262 scenario is out of scope for this document.

264 5. B4 element

266 5.1. Definition

268 The B4 element is a function implemented on a dual-stack capable
269 node, either a directly connected device or a home gateway, that
270 creates a tunnel to an AFTR.

272 5.2. Encapsulation

274 The tunnel is a multipoint-to-point IPv4-in-IPv6 tunnel ending on a
275 service provider AFTR.

277 See Section 7.1 for additional tunneling considerations.

279 Note: at this point, DS-Lite only defines IPv4-in-IPv6 tunnels;
280 however, other types of encapsulation could be defined in the future.

282 5.3. Fragmentation and Reassembly

284 Using an encapsulation (IPv4-in-IPv6 or anything else) to carry IPv4
285 traffic over IPv6 will reduce the effective MTU of the datagram.
286 Unfortunately, path MTU discovery [RFC1191] is not a reliable method
287 to deal with this problem.

289 A solution to deal with this problem is for the service provider to
290 increase the MTU size of all the links between the B4 element and the
291 AFTR elements by at least 40 bytes to accommodate both the IPv6
292 encapsulation header and the IPv4 datagram without fragmenting the
293 IPv6 packet.

295 However, as not all service providers will be able to increase their
296 link MTU, the B4 element MUST perform fragmentation and reassembly if
297 the outgoing link MTU cannot accommodate the extra IPv6 header.
298 Fragmentation MUST happen after the encapsulation on the IPv6 packet.
299 Reassembly MUST happen before the decapsulation of the IPv6 header.
300 The detailed procedure is specified in [RFC2473] Section 7.2.

302 5.4. AFTR discovery

304 In order to configure the IPv4-in-IPv6 tunnel, the B4 element needs
305 the IPv6 address of the AFTR element. This IPv6 address can be
306 configured using a variety of methods, including an out-of-band
307 mechanism, manual configuration, or a variety of DHCPv6 options.

309 In order to guarantee interoperability, a B4 element SHOULD implement
310 the DHCPv6 option defined in
311 [I-D.ietf-softwire-ds-lite-tunnel-option].

313 5.5. DNS

315 A B4 element is only configured from the service provider with IPv6.
316 As such, it can only learn the address of a DNS recursive server
317 through DHCPv6 (or other similar method over IPv6). As DHCPv6 only
318 defines an option to get the IPv6 address of such a DNS recursive
319 server, the B4 element cannot easily discover the IPv4 address of
320 such a recursive DNS server, and as such will have to perform all DNS
321 resolution over IPv6.

323 The B4 element can pass this IPv6 address to downstream IPv6 nodes,
324 but not to downstream IPv4 nodes. As such, the B4 element SHOULD
325 implement a DNS proxy, following the recommendations of [RFC5625].

327 5.6. Interface initialization

329 Initialization of the interface including a B4 element is out of
330 scope for this specification.

332 5.7. Well-known IPv4 address

334 Any locally unique IPv4 address could be configured on the IPv4-in-
335 IPv6 tunnel to represent the B4 element. Configuring such an address
336 is often necessary when the B4 element is sourcing IPv4 datagrams
337 directly over the tunnel. In order to avoid conflicts with any other
338 address, IANA has defined a well-known range, 192.0.0.0/29.

340 192.0.0.0 is the reserved subnet address. 192.0.0.1 is reserved for
341 the AFTR element.
The B4 element MAY use any other address within
342 the 192.0.0.0/29 range.

344 Note: a range of addresses has been reserved for this purpose. The
345 intent is to accommodate nodes implementing multiple B4 elements.

347 6. AFTR element

349 6.1. Definition

351 An AFTR element is the combination of an IPv4-in-IPv6 tunnel end-
352 point and an IPv4-IPv4 NAT implemented on the same node.

354 6.2. Encapsulation

356 The tunnel is a point-to-multipoint IPv4-in-IPv6 tunnel ending at the
357 B4 elements.

359 See Section 7.1 for additional tunneling considerations.

361 Note: at this point, DS-Lite only defines IPv4-in-IPv6 tunnels;
362 however, other types of encapsulation could be defined in the future.

364 6.3. Fragmentation and Reassembly

366 As noted previously, fragmentation and reassembly need to be taken
367 care of by the tunnel end-points. As such, the AFTR MUST perform
368 fragmentation and reassembly if the underlying link MTU cannot
369 accommodate the extra IPv6 header of the tunnel. Fragmentation MUST
370 happen after the encapsulation on the IPv6 packet. Reassembly MUST
371 happen before the decapsulation of the IPv6 header. The detailed
372 procedure is specified in [RFC2473] Section 7.2.

374 Fragmentation at the Tunnel Entry-Point is a light-weight operation.
375 In contrast, reassembly at the Tunnel Exit-Point can be expensive.
376 When the Tunnel Exit-Point receives the first fragmented packet, it
377 must wait for the second fragmented packet to arrive in order to
378 reassemble the two fragmented IPv6 packets for decapsulation. This
379 requires the Tunnel Exit-Point to buffer and keep track of fragmented
380 packets. Consider that the AFTR is the Tunnel Exit-Point for many
381 tunnels. If many clients simultaneously source a large number of
382 fragmented packets to the AFTR, this will require the AFTR to buffer
383 and consume enormous resources to keep track of the flows. This
384 reassembly process will significantly impact the AFTR performance.
385 However, this impact only happens when many clients simultaneously
386 source large IPv4 packets. We believe that the majority of
387 clients will receive large IPv4 packets (for example, when watching
388 video streams) rather than source them (for example, when sourcing
389 video streams), so reassembly is only a fraction of the AFTR's
390 overall workload.

392 Methods to avoid fragmentation, such as rewriting the TCP MSS option
393 or using technologies such as Subnetwork Encapsulation and Adaptation
394 Layer defined in [I-D.templin-seal] are out of scope for this
395 document.

397 6.4. DNS

399 As noted previously, a DS-Lite node implementing a B4 element will
400 perform DNS resolution over IPv6. As such, very few, if any, DNS
401 packets will flow through the AFTR element.

403 6.5. Well-known IPv4 address

405 The AFTR MAY use the well-known IPv4 address 192.0.0.1 reserved by
406 IANA to configure the IPv4-in-IPv6 tunnel. That address can then be
407 used to report ICMP problems and will appear in traceroute outputs.

409 6.6. Extended binding table

411 The NAT binding table of the AFTR element is extended to include the
412 source IPv6 address of the incoming packets. This IPv6 address is
413 used to disambiguate between the overlapping IPv4 address space of
414 the service provider customers.

416 By doing a reverse look-up in the extended IPv4 NAT binding table,
417 the AFTR knows how to reconstruct the IPv6 encapsulation when the
418 packets come back from the Internet. That way, there is no need to
419 keep a static configuration for each tunnel.

421 7. Network Considerations

423 7.1. Tunneling

425 Tunneling MUST be done in accordance with [RFC2473] and [RFC4213].
426 Traffic classes ([RFC2474]) from the IPv4 headers SHOULD be carried
427 over to the IPv6 headers and vice versa.

429 7.2. VPN

431 Dual-stack lite implementations SHOULD not interfere with the
432 functioning of IPv4 or IPv6 VPNs.

434 7.3.
Multicast considerations

436 Multicast is out of scope for this document.

438 8. NAT considerations

440 8.1. NAT pool

442 AFTRs MAY operate distinct, non-overlapping NAT pools. Those NAT
443 pools do not have to be contiguous.

445 8.2. NAT conformance

447 A dual-stack lite AFTR SHOULD implement behavior conforming to the
448 best current practice, currently documented in [RFC4787], [RFC5382]
449 and [RFC5508]. Other requirements for AFTRs can be found in
450 [I-D.nishitani-cgn].

452 8.3. Application Level Gateways (ALG)

454 The AFTR should only perform a minimum number of ALGs for the classic
455 applications such as FTP, RTSP/RTP, IPsec and PPTP VPN pass-through
456 and enable the users to use their own ALGs on statically or
457 dynamically reserved ports instead.

459 8.4. Port allocation

461 8.4.1. How many ports per customers?

463 Because IPv4 addresses will be shared among customers, and a
464 potentially large address space reduction factor may be applied, on
465 average only a limited number N of TCP or UDP port numbers will be
466 available per customer. This means that applications opening a very
467 large number of TCP ports may have a harder time working. For example,
468 it has been reported that a very well-known web site was using AJAX
469 techniques and was opening up to 69 TCP ports per web page. If we
470 make the hypothesis of an address space reduction by a factor of 100
471 (one IPv4 address per 100 customers), and 65k ports per IPv4
472 address available, that makes an average of N = 650 ports available
473 simultaneously to be shared among the various devices behind the
474 dual-stack lite tunnel end-point.

476 There is an important operational difference if those N ports are
477 pre-allocated in a cookie-cutter fashion versus allocated on demand
478 by incoming connections. This is a difference between an average of
479 N ports and a maximum of N ports.
Several service providers have
480 reported an average number of connections per customer in the single
481 digits. At the opposite end, thousands or tens of thousands of ports
482 could be used at peak by any single customer browsing a number of
483 AJAX/Web 2.0 sites.

485 As such, service providers allocating a fixed number of ports per
486 user should dimension the system with a minimum of N = several
487 thousand ports for every user. This would bring the address
488 space reduction ratio to a single digit. Service providers using a
489 smaller number of ports per user (N in the hundreds) should expect
490 customers' applications to break in a more or less random way over
491 time.

493 In order to achieve higher address space reduction ratios, it is
494 recommended that service providers do not use this cookie-cutter
495 approach, and, on the contrary, allocate ports as dynamically as
496 possible, just like on a regular NAT. With an average number of
497 connections per customer in the single digits, having an address
498 space reduction by a factor of 100 is realistic. However, service
499 providers should exercise caution and make sure their pool of port
500 numbers does not go too low. The actual maximum address space
501 reduction factor is unknown at this time.

503 8.4.2. Dynamic port assignment considerations

505 When dynamic port assignment is used to maximize the number of
506 subscribers sharing the AFTR global IPv4 addresses, the AFTR should
507 implement checks to avoid DoS attacks through exhaustion of available
508 ports. It should also avoid mapping any one subscriber's "flows"
509 across more than one global IPv4 address.

511 8.4.3. Subscriber controlled port assignment

513 Dynamic port assignment precludes inbound access to subscriber
514 servers, just as in a home gateway NAT. Inbound access to subscriber
515 servers can be provided through pre-assigned and/or reserved port
516 mappings in the AFTR.
Specifying the mechanisms for managing and
517 signaling these reserved port mappings is out of scope for this
518 document; however, some techniques are mentioned in Appendix A as
519 examples.

521 8.5. Other considerations about sharing global IPv4 addresses

523 More considerations on sharing the port space of IPv4 addresses can
524 be found in [I-D.ford-shared-addressing-issues].

526 9. Acknowledgements

528 The authors would like to acknowledge the role of Mark Townsley for
529 his input on the overall architecture of this technology by pointing
530 this work in the direction of [I-D.droms-softwires-snat]. Note that
531 this document results from a merging of [I-D.durand-dual-stack-lite]
532 and [I-D.droms-softwires-snat]. Also to be acknowledged are the many
533 discussions with a number of people including Shin Miyakawa,
534 Katsuyasu Toyama, Akihide Hiura, Takashi Uematsu, Tetsutaro Hara,
535 Yasunori Matsubayashi, Ichiro Mizukoshi. The authors would also like
536 to thank David Ward, Jari Arkko, Thomas Narten and Geoff Huston for
537 their constructive feedback. Special thanks go to Dave Thaler and
538 Dan Wing for their reviews and comments.

540 10. IANA Considerations

542 This draft requests IANA to allocate a well-known IPv4 192.0.0.0/29
543 network prefix. That range is used to number the dual-stack lite
544 interfaces. Reserving a /29 allows for 6 possible interfaces on a
545 multi-homed node. The IPv4 address 192.0.0.1 is reserved as the IPv4
546 address of the default router for such dual-stack lite hosts.

548 11. Security Considerations

550 Security issues associated with NAT have long been documented. See
551 [RFC2663] and [RFC2993].

553 However, moving the NAT functionality from the home gateway to the
554 core of the service provider network and sharing IPv4 addresses among
555 customers create additional requirements when logging data for abuse
556 usage.
With any architecture where an IPv4 address does not uniquely
557 represent an end host, an IPv4 address and a timestamp are no longer
558 sufficient to identify a particular broadband customer. Additional
559 information such as transport protocol information will be required
560 for that purpose. For example, we suggest logging the transport port
561 number for TCP and UDP connections.

563 The AFTR performs translation functions for interior IPv4 hosts at
564 RFC 1918 addresses or at the IANA reserved address range (TBA by
565 IANA). If the interior host is properly using the authorized IPv4
566 address with the authorized transport protocol port range such as the
567 A+P semantic for the tunnel, the AFTR can simply forward without
568 translation to permit the authorized address and port range to
569 function properly. All packets with unauthorized interior IPv4
570 addresses or with authorized interior IPv4 address but unauthorized
571 port range MUST NOT be forwarded by the AFTR. This prevents rogue
572 devices from launching denial of service attacks using unauthorized
573 public IPv4 addresses in the IPv4 source header field or unauthorized
574 transport port range in the IPv4 transport header field. For
575 example, rogue devices could bombard a public web server by launching
576 a TCP SYN ACK attack. The victim will receive TCP SYN from random IPv4
577 source addresses at a rapid rate and deny TCP services to legitimate
578 users.

580 With IPv4 addresses shared by multiple users, ports become a critical
581 resource. As such, some mechanisms need to be put in place by an
582 AFTR to limit port usage, either by rate-limiting new connections or
583 putting a hard limit on the maximum number of ports usable by a single
584 user. If this number is high enough, it should not interfere with
585 normal usage and still provide reasonable protection of the shared
586 pool. More considerations on port allocation and port exhaustion
587 can be found in Section 8.4.
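
The hard per-user port limit mentioned above can be combined with the extended binding table of Section 6.6 in a short Python sketch. This is an illustration under stated assumptions, not the behavior of any particular AFTR: the class name `AftrBindingTable`, its methods and the first-free port selection are hypothetical, binding expiry is omitted, and a single public IPv4 address is assumed. The sketch keys each binding on the B4's IPv6 address so that overlapping private IPv4 addresses stay distinct, and refuses new bindings once a subscriber reaches its limit.

```python
from typing import Dict, List, Optional, Tuple

# (B4 IPv6 address, interior IPv4 address, interior port, protocol)
InteriorFlow = Tuple[str, str, int, str]


class AftrBindingTable:
    """Hypothetical NAT binding table extended with the tunnel source address."""

    def __init__(self, first_port: int, last_port: int,
                 max_ports_per_subscriber: int) -> None:
        # One pool of external ports per transport protocol.
        self._free: Dict[str, List[int]] = {
            proto: list(range(first_port, last_port + 1))
            for proto in ("tcp", "udp")
        }
        self._max = max_ports_per_subscriber
        self._by_flow: Dict[InteriorFlow, int] = {}
        self._by_external: Dict[Tuple[str, int], InteriorFlow] = {}
        self._in_use: Dict[str, int] = {}  # bindings held per B4 IPv6 address

    def bind(self, b4_addr: str, src_ip: str, src_port: int,
             proto: str) -> Optional[int]:
        """Return the external port for a flow, or None if it is refused."""
        flow = (b4_addr, src_ip, src_port, proto)
        if flow in self._by_flow:
            return self._by_flow[flow]      # existing binding is reused
        if self._in_use.get(b4_addr, 0) >= self._max:
            return None                     # subscriber reached its hard limit
        if not self._free[proto]:
            return None                     # shared pool is exhausted
        port = self._free[proto].pop(0)
        self._by_flow[flow] = port
        self._by_external[(proto, port)] = flow
        self._in_use[b4_addr] = self._in_use.get(b4_addr, 0) + 1
        return port

    def lookup_inbound(self, proto: str,
                       external_port: int) -> Optional[InteriorFlow]:
        """Reverse look-up used to rebuild the IPv6 encapsulation."""
        return self._by_external.get((proto, external_port))
```

The same table also yields what needs to be logged for abuse handling: the external port, the protocol and the B4 IPv6 address identify the customer where a shared IPv4 address and a timestamp alone would not.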
589 More considerations on sharing IPv4 addresses can be found in
590 [I-D.ford-shared-addressing-issues].

592 AFTRs should support ways to limit service to registered customers.

594 If strict IPv6 ingress filtering is deployed in the broadband network
595 to prevent IPv6 address spoofing and dual-stack lite service is
596 restricted to those customers, then tunnels terminating at the AFTR
597 and coming from registered customer IPv6 addresses cannot be spoofed.
598 Thus a simple access control list on the tunnel transport source
599 address is all that is required to accept traffic on the southbound
600 interface of an AFTR.

602 If IPv6 address spoofing prevention is not in place, the AFTR should
603 perform further sanity checks on the IPv6 address of incoming IPv6
604 packets. For example, it should check if the address has really been
605 allocated to an authorized customer.

607 12. Author's Addresses

609 This document is the result of the work of the following authors:

611 Alain Durand
612 Comcast
613 1, Comcast center
614 Philadelphia, PA 19103
615 USA
616 Email: alain_durand@cable.comcast.com

618 Ralph Droms
619 Cisco
620 1414 Massachusetts Avenue
621 Boxborough, MA 01714
622 USA
623 Phone: +1 978.936.1674
624 Email: rdroms@cisco.com

626 Brian Haberman
627 Johns Hopkins University Applied Physics Lab
628 11100 Johns Hopkins Road
629 Laurel, MD 20723-6099
630 USA
631 Phone: +1 443 778 1319
632 Email: brian@innovationslab.net
633 James Woodyatt
634 Apple Inc.
635 1 Infinite Loop
636 Cupertino, CA 95014
637 USA
638 Email: jhw@apple.com

640 Yiu Lee
641 Comcast
642 1, Comcast center
643 Philadelphia, PA 19103
644 USA
645 Email: yiu_lee@cable.comcast.com

647 Randy Bush
648 Internet Initiative Japan
649 5147 Crystal Springs
650 Bainbridge Island, Washington 98110
651 USA
652 Phone: +1 206 780 0431 x1
653 Email: randy@psg.com

655 13.
Appendix A: future DS-Lite extensions 657 Techniques discussed below are not part of the core dual-stack lite 658 specification and will be developed in separate documents. They are 659 only listed here as examples. 661 Applications expecting incoming connections, such a peer-to-peer 662 applications, have become popular. Those applications use a very 663 limited number of ports, usually a single one. Making sure those 664 applications keep working in a dual-stack lite environment is 665 important. Similarly, there is a growing list of applications that 666 require some kind of ALG to work through a NAT. Service provider 667 AFTRs should not prevent the deployment of such applications. As 668 such, there is a legitimate need to leave certain ports under the 669 control of the end user. This argues for a hybrid environment, where 670 most ports are dynamically managed by the AFTR in a shared pool and a 671 limited number are dedicated per users and controlled by them. 673 13.1. Static port reservation 675 A service provider can reserve a number of static ports per user. 676 Note: those could be TCP and/or UDP ports. The simplest model to 677 allow users to control the associated NAT bindings is to offer a web 678 interface (for example as part of the service provider portal) where, 679 once authenticated, a user can configure each dedicated external IPv4 680 address/port binding on the AFTR either using the port forwarding 681 semantic or the A+P semantic. 683 Note: The exact number of ports reserved per user is left at the 684 discretion of the service provider. 686 13.1.1. Port forwarding model 688 In this model, the subscriber directs the AFTR to rewrite the 689 destination address in those incoming packets to a private IPv4 690 address within the home network. For obvious security reasons, 691 redirection to global IPv4 address should not be authorized. Note: 692 this behavior is very similar to the port forwarding function found 693 in most home gateways. 
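   As an illustration only, the restriction on port forwarding targets
   could be checked as sketched below.  The function name
   is_allowed_forward_target is hypothetical, and the sketch assumes
   that "private IPv4 address" means the three RFC 1918 blocks.

```python
import ipaddress

# The three private address blocks of RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_allowed_forward_target(address: str) -> bool:
    """Accept only RFC 1918 IPv4 addresses as port forwarding targets."""
    ip = ipaddress.ip_address(address)
    if ip.version != 4:
        return False
    return any(ip in network for network in RFC1918_NETWORKS)
```

   With this check, a request to forward a reserved port to 10.0.0.1
   would be accepted, while a request to forward it to the global
   address 128.0.0.1 would be refused.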
13.1.2.  A+P model

   The subscriber directs the AFTR to forward incoming traffic on a
   given address/port to the dual-stack lite home gateway, and lets this
   device deal with it.  This requires support for the A+P
   [I-D.ymbk-aplusp] semantic on both the AFTR and the home gateway.

   In particular, an A+P aware home router can locally NAT A+P packets
   to and from internal hosts.  Alternatively, it can forward the
   traffic directly to those hosts if they are configured, for example,
   with an A+P secondary address and ports.

   An AFTR forwards packets in the A+P range directly to and from the
   tunnels without NAT.

13.2.  Dynamic port reservation

13.2.1.  UPnP

   A B4 element can act as a UPnP relay, forwarding UPnP messages over
   the tunnel to the AFTR.  This may work in some cases, but not all the
   time.  Some applications insist on running on a well-known port
   number (or port range), using UPnP to request that the NAT reserve
   that port.  Those ports may or may not be available; they could be
   used by another customer.  Using UPnP, a NAT box does not have any
   way to redirect such applications to another port; the only option
   is to deny the request.  Those applications typically then cycle
   through a small range of ports (typically 10 or so) until they abort.
   The likelihood of those ports all being already in use by other users
   grows with the address space reduction, i.e., with how many users are
   sharing the same address.

   Note: the UPnP forum has been reported to address this issue in an
   upcoming version of the IGD profile.

13.2.2.  NAT-PMP

   NAT-PMP [I-D.cheshire-nat-pmp] offers a better semantic, by enabling
   the NAT to redirect the application to use another unallocated port.
   A B4 element could proxy the NAT-PMP messages to the AFTR through the
   tunnel.

13.2.3.  DHCPv6

   If more ports need to be reserved outside of that static dedicated
   range, a DHCPv6 option such as
   [I-D.bajko-v6ops-port-restricted-ipaddr-assign] may also be an
   interesting approach.  This may be limited to the A+P semantic
   mentioned above, as there might not be a way to explicitly control
   the port forwarding semantic.  Also, there are concerns that this
   would lead to a cookie-cutter distribution of ports per customer,
   dramatically reducing the ratio of customers per IPv4 address.

14.  Appendix B: Examples

14.1.  Gateway based architecture

   This architecture is targeted at residential broadband deployments
   but can be adapted easily to other types of deployment where the
   installed base of IPv4-only devices is large.

   Consider a scenario where a dual-stack lite home gateway is
   provisioned only with IPv6 on the WAN port, no IPv4.  The home
   gateway acts as an IPv4 DHCP server for the LAN network (wireline and
   wireless), handing out RFC1918 addresses.  In addition, the home
   gateway may support IPv6 Auto-Configuration and/or a DHCPv6 server
   for the LAN network.  When an IPv4-only device connects to the home
   gateway, the gateway will hand it an RFC1918 address.  When a
   dual-stack capable device connects to the home gateway, the gateway
   will hand out an RFC1918 address and a global IPv6 address to the
   device.  In addition, the home gateway will create an IPv4-in-IPv6
   softwire tunnel [RFC5571] to an AFTR that resides in the service
   provider network.

   When the device accesses IPv6 service, it will send the IPv6 datagram
   to the home gateway natively.  The home gateway will route the
   traffic upstream to the default gateway.

   When the device accesses IPv4 service, it will source the IPv4
   datagram with the RFC1918 address and send the IPv4 datagram to the
   home gateway.  The home gateway will encapsulate the IPv4 datagram
   inside the IPv4-in-IPv6 softwire tunnel and forward the IPv6 datagram
   to the AFTR.  This contrasts with what home gateways normally do
   today, which is to NAT the RFC1918 address to the public IPv4 address
   and route the datagram upstream.  When the AFTR receives the IPv6
   datagram, it will remove the IPv6 header and perform an IPv4-to-IPv4
   NAT on the source address.

   As illustrated in Figure 1, this dual-stack lite deployment model
   consists of three components: the dual-stack lite home router with a
   B4 element, the AFTR, and a softwire between the B4 element acting as
   softwire initiator (SI) [RFC5571] in the dual-stack lite home router
   and the softwire concentrator (SC) [RFC5571] in the AFTR.  The AFTR
   performs IPv4-IPv4 NAT translations to multiplex multiple subscribers
   through a pool of global IPv4 addresses.  Overlapping address spaces
   used by subscribers are disambiguated through the identification of
   tunnel endpoints.
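   As an illustration only, the encapsulation performed by the B4
   element and its removal at the AFTR could be sketched as follows.
   The sketch assumes a bare 40-octet IPv6 header with Next Header 4 in
   the style of [RFC2473], no extension headers and no fragmentation;
   the function names and the default hop limit are hypothetical.

```python
import ipaddress
import struct

IPPROTO_IPV4 = 4        # Next Header value for an encapsulated IPv4 packet
IPV6_HEADER_LEN = 40


def encapsulate(ipv4_packet: bytes, b4_addr: str, aftr_addr: str,
                hop_limit: int = 64) -> bytes:
    """Prepend an IPv6 header, as a B4 element would before tunneling."""
    first_word = 6 << 28                 # version 6, traffic class 0, flow 0
    header = struct.pack("!IHBB", first_word, len(ipv4_packet),
                         IPPROTO_IPV4, hop_limit)
    header += ipaddress.IPv6Address(b4_addr).packed      # tunnel source
    header += ipaddress.IPv6Address(aftr_addr).packed    # tunnel destination
    return header + ipv4_packet


def decapsulate(ipv6_packet: bytes):
    """Return (tunnel source address, inner IPv4 packet), as an AFTR would."""
    first_word, payload_len, next_header, _hop_limit = struct.unpack(
        "!IHBB", ipv6_packet[:8])
    if first_word >> 28 != 6 or next_header != IPPROTO_IPV4:
        raise ValueError("not an IPv4-in-IPv6 datagram")
    source = ipaddress.IPv6Address(ipv6_packet[8:24])
    inner = ipv6_packet[IPV6_HEADER_LEN:IPV6_HEADER_LEN + payload_len]
    return str(source), inner
```

   The tunnel source address returned by decapsulate() is what lets the
   AFTR tell subscribers apart even when they use overlapping RFC1918
   addresses.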
                   +-----------+
                   |    Host   |
                   +-----+-----+
                         |10.0.0.1
                         |
                         |
                         |10.0.0.2
               +---------|---------+
               |         |         |
               |    Home router    |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                        |||2001:0:0:1::1
                        |||
                        |||<-IPv4-in-IPv6 softwire
                        |||
                 -------|||-------
               /        |||        \
              |   ISP core network  |
               \        |||        /
                 -------|||-------
                        |||
                        |||2001:0:0:2::1
               +--------|||--------+
               |        AFTR       |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                         |129.0.0.1
                         |
                 --------|--------
               /         |         \
              |       Internet      |
               \         |         /
                 --------|--------
                         |
                         |128.0.0.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                  Figure 1: gateway-based architecture

   Notes:

   o  The dual-stack lite home router is not required to be on the same
      link as the host.

   o  The dual-stack lite home router could be replaced by a dual-stack
      lite router in the service provider network.

   The resulting solution accepts an IPv4 datagram that is translated
   into an IPv4-in-IPv6 softwire datagram for transmission across the
   softwire.  At the corresponding endpoint, the IPv4 datagram is
   decapsulated, and the translated IPv4 address is inserted based on a
   translation from the softwire.

14.1.1.  Example message flow

   In the example shown in Figure 2, the translation table in the AFTR
   is configured to forward between IP/TCP (10.0.0.1/10000) and IP/TCP
   (129.0.0.1/5000).  That is, a datagram received by the dual-stack
   lite home router from the host at address 10.0.0.1, using TCP SRC
   port 10000, will be translated into a datagram with IP SRC address
   129.0.0.1 and TCP SRC port 5000 in the Internet.
                   +-----------+
                   |    Host   |
                   +-----+-----+
                      |  |10.0.0.1
      IPv4 datagram 1 |  |
                      |  |
                      v  |10.0.0.2
               +---------|---------+
               |         |         |
               |    home router    |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                      | |||2001:0:0:1::1
      IPv6 datagram 2 | |||
                      | |||<-IPv4-in-IPv6 softwire
                      | |||
                 -----|-|||-------
               /      | |||        \
              |   ISP core network  |
               \      | |||        /
                 -----|-|||-------
                      | |||
                      | |||2001:0:0:2::1
               +------|-|||--------+
               |      | AFTR       |
               |      v |||        |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                      |  |129.0.0.1
      IPv4 datagram 3 |  |
                      |  |
                 -----|--|--------
               /      |  |         \
              |       Internet      |
               \      |  |         /
                 -----|--|--------
                      |  |
                      v  |128.0.0.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                      Figure 2: Outbound Datagram

      +-----------------+--------------+---------------+
      |     Datagram    | Header field |    Contents   |
      +-----------------+--------------+---------------+
      | IPv4 datagram 1 | IPv4 Dst     | 128.0.0.1     |
      |                 | IPv4 Src     | 10.0.0.1      |
      |                 | TCP Dst      | 80            |
      |                 | TCP Src      | 10000         |
      | --------------- | ------------ | ------------- |
      | IPv6 Datagram 2 | IPv6 Dst     | 2001:0:0:2::1 |
      |                 | IPv6 Src     | 2001:0:0:1::1 |
      |                 | IPv4 Dst     | 128.0.0.1     |
      |                 | IPv4 Src     | 10.0.0.1      |
      |                 | TCP Dst      | 80            |
      |                 | TCP Src      | 10000         |
      | --------------- | ------------ | ------------- |
      | IPv4 datagram 3 | IPv4 Dst     | 128.0.0.1     |
      |                 | IPv4 Src     | 129.0.0.1     |
      |                 | TCP Dst      | 80            |
      |                 | TCP Src      | 5000          |
      +-----------------+--------------+---------------+

                        Datagram header contents

   When datagram 1 is received by the dual-stack lite home router, the
   B4 function encapsulates the datagram in datagram 2 and forwards it
   to the dual-stack lite carrier-grade NAT over the softwire.
   When it receives datagram 2, the tunnel concentrator in the AFTR
   hands the IPv4 datagram to the NAT, which determines from its
   translation table that the datagram received on Softwire_1 with TCP
   SRC port 10000 should be translated to datagram 3 with IP SRC address
   129.0.0.1 and TCP SRC port 5000.

   Figure 3 shows an inbound message received at the AFTR.  When the NAT
   function in the AFTR receives datagram 1, it looks up the IP/TCP DST
   in its translation table.  In the example in Figure 3, the NAT
   translates the TCP DST port to 10000, sets the IP DST address to
   10.0.0.1 and hands the datagram to the SC for transmission over
   Softwire_1.  The B4 in the home router decapsulates the IPv4 datagram
   from the inbound softwire datagram, and forwards it to the host.

                   +-----------+
                   |    Host   |
                   +-----+-----+
                      ^  |10.0.0.1
      IPv4 datagram 3 |  |
                      |  |
                      |  |10.0.0.2
               +---------|---------+
               |       +-+-+       |
               |    home router    |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                      ^ |||2001:0:0:1::1
      IPv6 datagram 2 | |||
                      | |||<-IPv4-in-IPv6 softwire
                      | |||
                 -----|-|||-------
               /      | |||        \
              |   ISP core network  |
               \      | |||        /
                 -----|-|||-------
                      | |||
                      | |||2001:0:0:2::1
               +------|-|||--------+
               |        AFTR       |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                      ^  |129.0.0.1
      IPv4 datagram 1 |  |
                      |  |
                 -----|--|--------
               /      |  |         \
              |       Internet      |
               \      |  |         /
                 -----|--|--------
                      |  |
                      |  |128.0.0.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                       Figure 3: Inbound Datagram

      +-----------------+--------------+---------------+
      |     Datagram    | Header field |    Contents   |
      +-----------------+--------------+---------------+
      | IPv4 datagram 1 | IPv4 Dst     | 129.0.0.1     |
      |                 | IPv4 Src     | 128.0.0.1     |
      |                 | TCP Dst      | 5000          |
      |                 | TCP Src      | 80            |
      | --------------- | ------------ | ------------- |
      | IPv6 Datagram 2 | IPv6 Dst     | 2001:0:0:1::1 |
      |                 | IPv6 Src     | 2001:0:0:2::1 |
      |                 | IPv4 Dst     | 10.0.0.1      |
      |                 | IPv4 Src     | 128.0.0.1     |
      |                 | TCP Dst      | 10000         |
      |                 | TCP Src      | 80            |
      | --------------- | ------------ | ------------- |
      | IPv4 datagram 3 | IPv4 Dst     | 10.0.0.1      |
      |                 | IPv4 Src     | 128.0.0.1     |
      |                 | TCP Dst      | 10000         |
      |                 | TCP Src      | 80            |
      +-----------------+--------------+---------------+

                        Datagram header contents

14.1.2.  Translation details

   The AFTR has a NAT that translates between softwire/port pairs and
   IPv4-address/port pairs.  The same translation is applied to IPv4
   datagrams received on the device's external interface and from the
   softwire endpoint in the device.

   In Figure 2, the translator network interface in the AFTR is on the
   Internet, and the softwire interface connects to the dual-stack lite
   home router.  The AFTR translator is configured as follows:

   Network interface:  Translate IPv4 destination address and TCP
      destination port to the softwire identifier and TCP destination
      port

   Softwire interface:  Translate softwire identifier and TCP source
      port to IPv4 source address and TCP source port

   Here is how the translation in Figure 3 works:

   o  Datagram 1 is received on the AFTR translator network interface.
      The translator looks up the IPv4-address/port pair in its
      translator table, rewrites the IPv4 destination address to
      10.0.0.1 and the TCP destination port to 10000, and hands the
      datagram to the SC to be forwarded over the softwire.

   o  The IPv4 datagram is received on the dual-stack lite home router
      B4.  The B4 function extracts the IPv4 datagram and the dual-stack
      lite home router forwards datagram 3 to the host.
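   As an illustration only, the two translator lookups described above
   could be sketched as a pair of dictionaries.  The class name AftrNat
   and its method names are hypothetical, and the sketch assumes that
   bindings are installed explicitly; a real AFTR would create them
   dynamically and expire them.

```python
class AftrNat:
    """Hypothetical AFTR translation table keyed on the softwire."""

    def __init__(self):
        # (softwire_id, inner IPv4, protocol, inner port) -> (ext IPv4, ext port)
        self._outbound = {}
        # (ext IPv4, protocol, ext port) -> (softwire_id, inner IPv4, inner port)
        self._inbound = {}

    def add_binding(self, softwire_id, inner_ip, protocol, inner_port,
                    external_ip, external_port):
        """Install one binding in both lookup directions."""
        self._outbound[(softwire_id, inner_ip, protocol, inner_port)] = (
            external_ip, external_port)
        self._inbound[(external_ip, protocol, external_port)] = (
            softwire_id, inner_ip, inner_port)

    def translate_outbound(self, softwire_id, src_ip, protocol, src_port):
        """Softwire interface: rewrite the source address and port."""
        return self._outbound.get((softwire_id, src_ip, protocol, src_port))

    def translate_inbound(self, dst_ip, protocol, dst_port):
        """Network interface: find the softwire and the inner destination."""
        return self._inbound.get((dst_ip, protocol, dst_port))
```

   With the binding from the example installed, an outbound lookup for
   Softwire-Id 2001:0:0:1::1, 10.0.0.1, TCP port 10000 yields
   129.0.0.1 port 5000, and the inbound lookup for 129.0.0.1, TCP port
   5000 yields the same softwire with 10.0.0.1 port 10000.  A lookup
   that finds no binding returns None, in which case the datagram would
   not be forwarded.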
      +----------------------------------+--------------------+
      |    Softwire-Id/IPv4/Prot/Port    |   IPv4/Prot/Port   |
      +----------------------------------+--------------------+
      | 2001:0:0:1::1/10.0.0.1/TCP/10000 | 129.0.0.1/TCP/5000 |
      +----------------------------------+--------------------+

          Dual-Stack lite carrier-grade NAT translation table

   The Softwire-Id is the IPv6 address assigned to the dual-stack lite
   home gateway.  Hosts behind the same dual-stack lite home router have
   the same Softwire-Id.  The source IPv4 address is the RFC1918 address
   assigned by the dual-stack lite home router, which is unique to each
   host behind the home gateway.  The AFTR would receive packets sourced
   from different IPv4 addresses in the same softwire tunnel.  The AFTR
   combines the Softwire-Id and IPv4 address/port [Softwire-Id, IPv4+
   Port] to uniquely identify the host behind the same dual-stack lite
   home router.

14.2.  Host based architecture

   This architecture is targeted at new, large scale deployments of
   dual-stack capable devices implementing a dual-stack lite interface.

   Consider a scenario where a dual-stack lite host device is directly
   connected to the service provider network.  The host device is dual-
   stack capable but is only provisioned with a global IPv6 address.  In
   addition, the host device will pre-configure a well-known IPv4 non-
   routable address (see IANA section).  This well-known IPv4 non-
   routable address is similar to the 127.0.0.1 loopback address.  Every
   host device implementing dual-stack lite will pre-configure the same
   address.  This address will be used to source the IPv4 datagram when
   the device accesses IPv4 services.  The host device will also create
   an IPv4-in-IPv6 softwire tunnel to an AFTR.  The Carrier Grade NAT
   will reside in the service provider network.
   When the device accesses IPv6 service, the device will send the IPv6
   datagram natively to the default gateway.

   When the device accesses IPv4 service, it will source the IPv4
   datagram with the well-known non-routable IPv4 address.  Then, the
   host device will encapsulate the IPv4 datagram inside the IPv4-in-
   IPv6 softwire tunnel and send the IPv6 datagram to the AFTR.  When
   the AFTR receives the IPv6 datagram, it will remove the IPv6 header
   and perform IPv4-to-IPv4 NAT on the source address.

   This scenario works on both wireline and wireless networks.  A
   typical wireless device will connect directly to the service provider
   without a home gateway in between.

   As illustrated in Figure 4, this dual-stack lite deployment model
   consists of three components: the dual-stack lite host, the AFTR and
   a softwire between the softwire initiator B4 in the host and the
   softwire concentrator in the AFTR.  The dual-stack lite host is
   assumed to have IPv6 service and can exchange IPv6 traffic with the
   AFTR.

   The AFTR performs IPv4-IPv4 NAT translations to multiplex multiple
   subscribers through a pool of global IPv4 addresses.  Overlapping
   IPv4 address spaces used by the dual-stack lite hosts are
   disambiguated through the identification of tunnel endpoints.

   In this situation, the dual-stack lite host configures the IPv4
   address 192.0.0.2 out of the well-known range 192.0.0.0/29 (defined
   by IANA) on its B4 interface.  It also configures the first non-
   reserved IPv4 address of the reserved range, 192.0.0.1, as the
   address of its default gateway.
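   As an illustration only, the addressing rule above can be reproduced
   with a few lines of code.  The function name b4_addressing is
   hypothetical, and the sketch assumes that the B4 address is simply
   the second usable address of the well-known range.

```python
import ipaddress

# Well-known range quoted in the text above.
DS_LITE_RANGE = ipaddress.ip_network("192.0.0.0/29")


def b4_addressing(network=DS_LITE_RANGE):
    """Return (B4 interface address, default gateway) for a DS-Lite host."""
    usable = iter(network.hosts())   # skips the network and broadcast addresses
    gateway = next(usable)           # first usable address of the range
    b4_address = next(usable)        # assumed: next usable address, on the B4
    return str(b4_address), str(gateway)


def flow_key(softwire_id, inner_ip, inner_port):
    """Key under which an AFTR can tell hosts apart (hypothetical layout)."""
    return (softwire_id, inner_ip, inner_port)
```

   Because every host uses the same B4 address, two hosts differ only
   in the first element of flow_key(), the IPv6 address of their
   softwire.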
               +-------------------+
               |                   |
               |  Host 192.0.0.2   |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                        |||2001:0:0:1::1
                        |||
                        |||<-IPv4-in-IPv6 softwire
                        |||
                 -------|||-------
               /        |||        \
              |   ISP core network  |
               \        |||        /
                 -------|||-------
                        |||
                        |||2001:0:0:2::1
               +--------|||--------+
               |        AFTR       |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                         |129.0.0.1
                         |
                 --------|--------
               /         |         \
              |       Internet      |
               \         |         /
                 --------|--------
                         |
                         |128.0.0.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                   Figure 4: host-based architecture

   The resulting solution accepts an IPv4 datagram that is translated
   into an IPv4-in-IPv6 softwire datagram for transmission across the
   softwire.  At the corresponding endpoint, the IPv4 datagram is
   decapsulated, and the translated IPv4 address is inserted based on a
   translation from the softwire.

14.2.1.  Example message flow

   In the example shown in Figure 5, the translation table in the AFTR
   is configured to forward between IP/TCP (192.0.0.2/10000) and IP/TCP
   (129.0.0.1/5000).  That is, a datagram received from the host at
   address 192.0.0.2, using TCP SRC port 10000, will be translated into
   a datagram with IP SRC address 129.0.0.1 and TCP SRC port 5000 in
   the Internet.
               +-------------------+
               |                   |
               |  Host 192.0.0.2   |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                      | |||2001:0:0:1::1
      IPv6 datagram 1 | |||
                      | |||<-IPv4-in-IPv6 softwire
                      | |||
                 -----|-|||-------
               /      | |||        \
              |   ISP core network  |
               \      | |||        /
                 -----|-|||-------
                      | |||
                      | |||2001:0:0:2::1
               +------|-|||--------+
               |      | AFTR       |
               |      v |||        |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                      |  |129.0.0.1
      IPv4 datagram 2 |  |
                 -----|--|--------
               /      |  |         \
              |       Internet      |
               \      |  |         /
                 -----|--|--------
                      |  |
                      v  |128.0.0.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                      Figure 5: Outbound Datagram

      +-----------------+--------------+---------------+
      |     Datagram    | Header field |    Contents   |
      +-----------------+--------------+---------------+
      | IPv6 Datagram 1 | IPv6 Dst     | 2001:0:0:2::1 |
      |                 | IPv6 Src     | 2001:0:0:1::1 |
      |                 | IPv4 Dst     | 128.0.0.1     |
      |                 | IPv4 Src     | 192.0.0.2     |
      |                 | TCP Dst      | 80            |
      |                 | TCP Src      | 10000         |
      | --------------- | ------------ | ------------- |
      | IPv4 datagram 2 | IPv4 Dst     | 128.0.0.1     |
      |                 | IPv4 Src     | 129.0.0.1     |
      |                 | TCP Dst      | 80            |
      |                 | TCP Src      | 5000          |
      +-----------------+--------------+---------------+

                        Datagram header contents

   When sending an IPv4 packet, the dual-stack lite host encapsulates it
   in datagram 1 and forwards it to the AFTR over the softwire.

   When it receives datagram 1, the concentrator in the AFTR hands the
   IPv4 datagram to the NAT, which determines from its translation table
   that the datagram received on Softwire_1 with TCP SRC port 10000
   should be translated to datagram 2 with IP SRC address 129.0.0.1 and
   TCP SRC port 5000.

   Figure 6 shows an inbound message received at the AFTR.
   When the NAT function in the AFTR receives datagram 1, it looks up
   the IP/TCP DST in its translation table.  In the example in Figure 6,
   the NAT translates the TCP DST port to 10000, sets the IP DST address
   to 192.0.0.2 and hands the datagram to the concentrator for
   transmission over Softwire_1.  The B4 in the dual-stack lite host
   decapsulates the IPv4 datagram from the inbound softwire datagram,
   and forwards it to the host.

               +-------------------+
               |                   |
               |  Host 192.0.0.2   |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                      ^ |||2001:0:0:1::1
      IPv6 datagram 2 | |||
                      | |||<-IPv4-in-IPv6 softwire
                      | |||
                 -----|-|||-------
               /      | |||        \
              |   ISP core network  |
               \      | |||        /
                 -----|-|||-------
                      | |||
                      | |||2001:0:0:2::1
               +------|-|||--------+
               |        AFTR       |
               |      | |||        |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                      ^  |129.0.0.1
      IPv4 datagram 1 |  |
                 -----|--|--------
               /      |  |         \
              |       Internet      |
               \      |  |         /
                 -----|--|--------
                      |  |
                      |  |128.0.0.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                       Figure 6: Inbound Datagram

      +-----------------+--------------+---------------+
      |     Datagram    | Header field |    Contents   |
      +-----------------+--------------+---------------+
      | IPv4 datagram 1 | IPv4 Dst     | 129.0.0.1     |
      |                 | IPv4 Src     | 128.0.0.1     |
      |                 | TCP Dst      | 5000          |
      |                 | TCP Src      | 80            |
      | --------------- | ------------ | ------------- |
      | IPv6 Datagram 2 | IPv6 Dst     | 2001:0:0:1::1 |
      |                 | IPv6 Src     | 2001:0:0:2::1 |
      |                 | IPv4 Dst     | 192.0.0.2     |
      |                 | IPv4 Src     | 128.0.0.1     |
      |                 | TCP Dst      | 10000         |
      |                 | TCP Src      | 80            |
      +-----------------+--------------+---------------+

                        Datagram header contents

14.2.2.  Translation details

   The translations happening in the AFTR are the same as in the
   previous examples.  The well-known IPv4 address 192.0.0.2 out of the
   192.0.0.0/29 range (defined by IANA), which is used by all the hosts,
   is disambiguated by the IPv6 source address of the softwire.

      +-----------------------------------+--------------------+
      |    Softwire-Id/IPv4/Prot/Port     |   IPv4/Prot/Port   |
      +-----------------------------------+--------------------+
      | 2001:0:0:1::1/192.0.0.2/TCP/10000 | 129.0.0.1/TCP/5000 |
      +-----------------------------------+--------------------+

          Dual-Stack lite carrier-grade NAT translation table

   The Softwire-Id is the IPv6 address assigned to the dual-stack lite
   host.  Each host has a unique Softwire-Id.  The source IPv4 address
   is one of the well-known IPv4 addresses.  The AFTR could receive
   packets from different hosts sourced from the same well-known IPv4
   address over different softwire tunnels.  Similar to the gateway
   architecture, the AFTR combines the Softwire-Id and IPv4 address/port
   [Softwire-Id, IPv4+Port] to uniquely identify the individual host.

15.  Appendix C: Deployment considerations

15.1.  AFTR service distribution and horizontal scaling

   One of the key benefits of the dual-stack lite technology lies in the
   fact that it is tunnel based.  That is, tunnel end-points may be
   anywhere in the service provider network.

   Using the DHCPv6 tunnel end-point option, service providers can
   create groups of users sharing the same AFTR.  Those groups can be
   merged or divided at will.  This leads to a horizontally scaled
   solution, where more capacity is added simply by adding more boxes.
   As those groups of users can evolve over time, it is best to make
   sure that AFTRs do not require per-user configuration in order to
   provide service.

15.2.  Horizontal scaling

   A service provider can start with just a few centrally located AFTRs.
   Later, when more capacity is needed, more boxes can be added and
   pushed to the edges of the access network.  In case of a spike of
   traffic, for example during the Olympic games or an important
   political event, capacity can be quickly added in any location of the
   network (tunnels can terminate anywhere) simply by splitting user
   groups.  Extra capacity can later be removed when the traffic returns
   to normal by resetting the DHCPv6 tunnel end-point settings.

15.3.  High availability

   An important element in the design of the dual-stack lite technology
   is the simplicity of implementation on the customer side.  A simple
   IPv4-in-IPv6 tunnel and a default route over it is all that is needed
   to get IPv4 connectivity.  Dealing with high availability is the
   responsibility of the service provider, not the customer devices
   implementing dual-stack lite.  As such, a single IPv6 address of the
   tunnel end-point is provided in the DHCPv6 option defined in
   [I-D.ietf-softwire-ds-lite-tunnel-option].  The service provider can
   use techniques such as anycast or various types of clusters to ensure
   availability of the IPv4 service.  The exact synchronization (or lack
   thereof) between redundant AFTRs is out of scope for this document.

15.4.  Logging

   DS-Lite AFTR implementations should offer the possibility to log NAT
   binding creations, or other ways to keep track of the ports/IP
   addresses used by customers.  This is both to support
   troubleshooting, which is very important to service providers trying
   to figure out why something may not be working, and to meet
   region-specific requirements for responding to legally-binding
   requests for information from law enforcement authorities.

16.  References

16.1.  Normative references

   [I-D.ietf-softwire-ds-lite-tunnel-option]
              Hankins, D. and T. Mrugalski, "Dynamic Host Configuration
              Protocol for IPv6 (DHCPv6) Options for Dual-Stack Lite",
              draft-ietf-softwire-ds-lite-tunnel-option-02 (work in
              progress), March 2010.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2473]  Conta, A. and S. Deering, "Generic Packet Tunneling in
              IPv6 Specification", RFC 2473, December 1998.

   [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black,
              "Definition of the Differentiated Services Field (DS
              Field) in the IPv4 and IPv6 Headers", RFC 2474,
              December 1998.

   [RFC4213]  Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms
              for IPv6 Hosts and Routers", RFC 4213, October 2005.

   [RFC5625]  Bellis, R., "DNS Proxy Implementation Guidelines",
              BCP 152, RFC 5625, August 2009.

16.2.  Informative references

   [I-D.bajko-v6ops-port-restricted-ipaddr-assign]
              Bajko, G. and T. Savolainen, "Port Restricted IP Address
              Assignment",
              draft-bajko-v6ops-port-restricted-ipaddr-assign-02 (work
              in progress), November 2008.

   [I-D.cheshire-nat-pmp]
              Cheshire, S., "NAT Port Mapping Protocol (NAT-PMP)",
              draft-cheshire-nat-pmp-03 (work in progress), April 2008.

   [I-D.droms-softwires-snat]
              Droms, R. and B. Haberman, "Softwires Network Address
              Translation (SNAT)", draft-droms-softwires-snat-01 (work
              in progress), July 2008.

   [I-D.durand-dual-stack-lite]
              Durand, A., "Dual-stack lite broadband deployments post
              IPv4 exhaustion", draft-durand-dual-stack-lite-00 (work in
              progress), July 2008.

   [I-D.ford-shared-addressing-issues]
              Ford, M., Boucadair, M., Durand, A., Levis, P., and P.
              Roberts, "Issues with IP Address Sharing",
              draft-ford-shared-addressing-issues-02 (work in progress),
              March 2010.

   [I-D.nishitani-cgn]
              Nishitani, T., Yamagata, I., Miyakawa, S., Nakagawa, A.,
              and H. Ashida, "Common Functions of Large Scale NAT
              (LSN)", draft-nishitani-cgn-03 (work in progress),
              November 2009.

   [I-D.templin-seal]
              Templin, F., "The Subnetwork Encapsulation and Adaptation
              Layer (SEAL)", draft-templin-seal-23 (work in progress),
              August 2008.

   [I-D.ymbk-aplusp]
              Bush, R., "The A+P Approach to the IPv4 Address Shortage",
              draft-ymbk-aplusp-05 (work in progress), October 2009.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              November 1990.

   [RFC1918]  Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and
              E. Lear, "Address Allocation for Private Internets",
              BCP 5, RFC 1918, February 1996.

   [RFC2663]  Srisuresh, P. and M. Holdrege, "IP Network Address
              Translator (NAT) Terminology and Considerations",
              RFC 2663, August 1999.

   [RFC2993]  Hain, T., "Architectural Implications of NAT", RFC 2993,
              November 2000.

   [RFC4787]  Audet, F. and C. Jennings, "Network Address Translation
              (NAT) Behavioral Requirements for Unicast UDP", BCP 127,
              RFC 4787, January 2007.

   [RFC5382]  Guha, S., Biswas, K., Ford, B., Sivakumar, S., and P.
              Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
              RFC 5382, October 2008.

   [RFC5508]  Srisuresh, P., Ford, B., Sivakumar, S., and S. Guha, "NAT
              Behavioral Requirements for ICMP", BCP 148, RFC 5508,
              April 2009.

   [RFC5571]  Storer, B., Pignataro, C., Dos Santos, M., Stevant, B.,
              Toutain, L., and J. Tremblay, "Softwire Hub and Spoke
              Deployment Framework with Layer Two Tunneling Protocol
              Version 2 (L2TPv2)", RFC 5571, June 2009.

   [UPnP-IGD]
              UPnP Forum, "Universal Plug and Play Internet Gateway
              Device Standardized Gateway Device Protocol",
              September 2006.

Author's Address

   Alain Durand (editor)
   Comcast
   1, Comcast center
   Philadelphia, PA  19103
   USA

   Email: alain_durand@cable.comcast.com