idnits 2.17.1

draft-ietf-softwire-dual-stack-lite-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 20 instances of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.

  == There are 8 instances of lines with private range IPv4 addresses in the
     document.  If these are generic example addresses, they should be changed
     to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
     198.51.100.x or 203.0.113.x.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     However, as not all service providers will be able to increase their
     link MTU, the B4 element MUST perform fragmentation and reassembly if
     the outgoing link MTU cannot accommodate for the extra IPv6 header.
     The original IPv4 packet is not oversized.  The packet is oversized
     after the IPv6 encapsulation.  The inner IPv4 packet MUST not be
     fragmented.  Fragmentation MUST happen after the encapsulation of the
     IPv6 packet.  Reassembly MUST happen before the decapsulation of the
     IPv4 packet.  Detailed procedure has been specified in [RFC2473]
     Section 7.2.

  -- The document date (May 13, 2011) is 4704 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-10) exists of
     draft-ietf-behave-lsn-requirements-01

  == Outdated reference: A later version (-29) exists of
     draft-ietf-pcp-base-10

     Summary: 0 errors (**), 0 flaws (~~), 6 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Internet Engineering Task Force                                A. Durand
Internet-Draft                                          Juniper Networks
Intended status: Standards Track                                R. Droms
Expires: November 14, 2011                                         Cisco
                                                             J. Woodyatt
                                                                   Apple
                                                                  Y. Lee
                                                                 Comcast
                                                            May 13, 2011

    Dual-Stack Lite Broadband Deployments Following IPv4 Exhaustion
                 draft-ietf-softwire-dual-stack-lite-10

Abstract

   This document revisits the dual-stack model and introduces the
   Dual-Stack Lite technology aimed at better aligning the costs and
   benefits of deploying IPv6 in service provider networks.  Dual-Stack
   Lite enables a broadband service provider to share IPv4 addresses
   among customers by combining two well-known technologies: IP in IP
   (IPv4-in-IPv6) and Network Address Translation (NAT).

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 14, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Requirements language
   3.  Terminology
   4.  Deployment scenarios
     4.1.  Access model
     4.2.  CPE
     4.3.  Directly connected device
   5.  B4 element
     5.1.  Definition
     5.2.  Encapsulation
     5.3.  Fragmentation and Reassembly
     5.4.  AFTR discovery
     5.5.  DNS
     5.6.  Interface Initialization
     5.7.  Well-known IPv4 address
   6.  AFTR element
     6.1.  Definition
     6.2.  Encapsulation
     6.3.  Fragmentation and Reassembly
     6.4.  DNS
     6.5.  Well-known IPv4 address
     6.6.  Extended binding table
   7.  Network Considerations
     7.1.  Tunneling
     7.2.  Multicast considerations
   8.  NAT considerations
     8.1.  NAT pool
     8.2.  NAT conformance
     8.3.  Application Level Gateways (ALG)
     8.4.  Sharing global IPv4 addresses
     8.5.  Port forwarding / Keep alive
   9.  Acknowledgements
   10. IANA Considerations
   11. Security Considerations
   12. References
     12.1.  Normative references
     12.2.  Informative references
   Appendix A.  Deployment considerations
     A.1.  AFTR service distribution and horizontal scaling
     A.2.  Horizontal scaling
     A.3.  High availability
     A.4.  Logging
   Appendix B.  Examples
     B.1.  Gateway based architecture
       B.1.1.  Example message flow
       B.1.2.  Translation details
     B.2.  Host based architecture
       B.2.1.  Example message flow
       B.2.2.  Translation details
   Authors' Addresses

1.  Introduction

   The common thinking for more than 10 years has been that the
   transition to IPv6 will be based solely on the dual stack model and
   that most things would be converted this way before we ran out of
   IPv4.  However, this has not happened.  The IANA free pool of IPv4
   addresses has now depleted, well before sufficient IPv6 deployment
   had taken place.  As a result, many IPv4 services have to continue to
   be provided even under severely limited address space.

   This document specifies the Dual-Stack Lite technology which is aimed
   at better aligning the costs and benefits in service provider
   networks.  Dual-Stack Lite will enable both continued support for
   IPv4 services and incentives for the deployment of IPv6.  It also
   decouples IPv6 deployment in the service provider network from the
   rest of the Internet, making incremental deployment easier.

   Dual-Stack Lite enables a broadband service provider to share IPv4
   addresses among customers by combining two well-known technologies:
   IP in IP (IPv4-in-IPv6) and NAT.

   This document makes a distinction between a dual-stack capable and a
   dual-stack provisioned device.  The former is a device that has code
   that implements both IPv4 and IPv6, from the network layer to the
   applications.  The latter is a similar device that has been
   provisioned with both an IPv4 and an IPv6 address on its
   interface(s).
   This document will also further refine this notion by distinguishing
   interfaces provisioned directly by the service provider from those
   provisioned by the customer.

   Pure IPv6-only devices (i.e. devices that do not include an IPv4
   stack) are outside of the scope of this document.

   This document will first present some deployment scenarios and then
   define the behavior of the two elements of the Dual-Stack Lite
   technology: the B4 and the AFTR.  It will then go into networking and
   NAT-ing considerations.

2.  Requirements language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Terminology

   The technology described in this document is known as Dual-Stack
   Lite.  The abbreviation DS-Lite will be used throughout this text.

   This document also introduces two new terms: the DS-Lite Basic
   Bridging BroadBand element (B4) and the DS-Lite Address Family
   Transition Router element (AFTR).

   Dual-stack is defined in [RFC4213].

   NAT related terminology is defined in [RFC4787].

   CPE stands for Customer Premise Equipment.  This is the layer 3
   device in the customer premise that is connected to the service
   provider network.  That device is often a home gateway.  However,
   sometimes computers are directly attached to the service provider
   network.  In such cases, such computers can be viewed as CPEs as
   well.

4.  Deployment scenarios

4.1.  Access model

   Instead of relying on a cascade of NATs, the Dual-Stack Lite model is
   built on IPv4-in-IPv6 tunnels to cross the network to reach a
   carrier-grade IPv4-IPv4 NAT (the AFTR) where customers will share
   IPv4 addresses.
   There are a number of benefits to this approach:

   o  This technology decouples the deployment of IPv6 in the service
      provider network (up to the customer premise equipment or CPE)
      from the deployment of IPv6 in the global Internet and in customer
      applications and devices.

   o  The management of the service provider access networks is
      simplified by leveraging the large IPv6 address space.
      Overlapping private IPv4 address spaces are not required to
      support very large customer bases.

   o  As tunnels can terminate anywhere in the service provider network,
      this architecture lends itself to horizontal scaling and provides
      great flexibility to adapt to changing traffic load.  More
      discussion of horizontal scaling can be found in Appendix A.

   o  Tunnels provide a direct connection between B4 and the AFTR.  This
      can be leveraged to enable customers and their applications to
      control how the NAT function of the AFTR is performed.

   A key characteristic of this approach is that communications between
   end-nodes stay within their address family.  IPv6 sources only
   communicate with IPv6 destinations, and IPv4 sources only communicate
   with IPv4 destinations.  There is no protocol family translation
   involved in this approach.  This greatly simplifies the task of
   applications that may carry literal IP addresses in their payload.

4.2.  CPE

   This section describes home local area networks characterized by the
   presence of a home gateway, or CPE, provisioned only with IPv6 by the
   service provider.

   A DS-Lite CPE is an IPv6 aware CPE with a B4 Interface implemented in
   the WAN interface.

   A DS-Lite CPE SHOULD NOT operate a NAT function between an internal
   interface and a B4 interface, as the NAT function will be performed
   by the AFTR in the service provider's network.  That will avoid
   accidentally operating in a double NAT environment.

   However, it SHOULD operate its own DHCP(v4) server handing out
   [RFC1918] address space (e.g. 192.168.0.0/16) to hosts in the home.
   It SHOULD advertise itself as the default IPv4 router to those home
   hosts.  It SHOULD also advertise itself as a DNS server in the DHCP
   Option 6 (DNS Server).  Additionally, it SHOULD operate a DNS proxy
   to accept DNS IPv4 requests from home hosts and send them using IPv6
   to the service provider DNS servers, as described in Section 5.5.

   Note: if an IPv4 home host decides to use another IPv4 DNS server,
   the DS-Lite CPE will forward those DNS requests via the B4 interface,
   the same way it forwards any regular IPv4 packets.  However, each DNS
   request will create a binding in the AFTR.  A large number of DNS
   requests may have a direct impact on the AFTR's NAT table
   utilization.

   IPv6 capable devices directly reach the IPv6 Internet.  Packets
   simply follow IPv6 routing; they do not go through the tunnel and
   are not subject to any translation.  It is expected that most IPv6
   capable devices will also be IPv4 capable and will simply be
   configured with an IPv4 RFC1918 style address within the home network
   and access the IPv4 Internet the same way as the legacy IPv4-only
   devices within the home.

   Pure IPv6-only devices (i.e. devices that do not include an IPv4
   stack) are outside of the scope of this document.

4.3.  Directly connected device

   In broadband home networks, some devices are directly connected to
   the broadband service provider.  They are connected straight to a
   modem, without a home gateway.  Those devices are, in fact, acting as
   CPEs.

   Under this scenario, the customer device is a dual-stack capable host
   that is provisioned by the service provider with IPv6 only.  The
   device itself acts as a B4 element and the IPv4 service is provided
   by an IPv4-in-IPv6 tunnel, just as in the home gateway/CPE case.
   That device can run any combination of IPv4 and/or IPv6
   applications.

   A directly connected DS-Lite device SHOULD send its DNS requests over
   IPv6 to the IPv6 DNS server it has been configured to use.

   As in the previous section, IPv6 packets follow IPv6 routing; they
   do not go through the tunnel and are not subject to any translation.

   The support of IPv4-only devices and IPv6-only devices in this
   scenario is out of scope for this document.

5.  B4 element

5.1.  Definition

   The B4 element is a function implemented on a dual-stack capable
   node, either a directly connected device or a CPE, that creates a
   tunnel to an AFTR.

5.2.  Encapsulation

   The tunnel is a multipoint-to-point IPv4-in-IPv6 tunnel ending on a
   service provider AFTR.

   See Section 7.1 for additional tunneling considerations.

   Note: at this point, DS-Lite only defines IPv4-in-IPv6 tunnels;
   however, other types of encapsulation could be defined in the future.

5.3.  Fragmentation and Reassembly

   Using an encapsulation (IPv4-in-IPv6 or anything else) to carry IPv4
   traffic over IPv6 will reduce the effective MTU of the datagram.
   Unfortunately, path MTU discovery [RFC1191] is not a reliable method
   to deal with this problem.

   A solution to deal with this problem is for the service provider to
   increase the MTU size of all the links between the B4 element and the
   AFTR elements by at least 40 bytes to accommodate both the IPv6
   encapsulation header and the IPv4 datagram without fragmenting the
   IPv6 packet.

   However, as not all service providers will be able to increase their
   link MTU, the B4 element MUST perform fragmentation and reassembly if
   the outgoing link MTU cannot accommodate the extra IPv6 header.
   The original IPv4 packet is not oversized.  The packet is oversized
   after the IPv6 encapsulation.  The inner IPv4 packet MUST NOT be
   fragmented.
   Fragmentation MUST happen after the encapsulation, on the IPv6
   packet.  Reassembly MUST happen before the decapsulation of the
   IPv4 packet.  The detailed procedure is specified in [RFC2473]
   Section 7.2.

5.4.  AFTR discovery

   In order to configure the IPv4-in-IPv6 tunnel, the B4 element needs
   the IPv6 address of the AFTR element.  This IPv6 address can be
   configured using a variety of methods, such as an out-of-band
   mechanism, manual configuration, or a DHCPv6 option.

   In order to guarantee interoperability, a B4 element SHOULD implement
   the DHCPv6 option defined in
   [I-D.ietf-softwire-ds-lite-tunnel-option].

5.5.  DNS

   A B4 element is only configured from the service provider with IPv6.
   As such, it can only learn the address of a DNS recursive server
   through DHCPv6 (or other similar method over IPv6).  As DHCPv6 only
   defines an option to get the IPv6 address of such a DNS recursive
   server, the B4 element cannot easily discover the IPv4 address of
   such a recursive DNS server, and as such will have to perform all DNS
   resolution over IPv6.

   The B4 element can pass this IPv6 address to downstream IPv6 nodes,
   but not to downstream IPv4 nodes.  As such, the B4 element SHOULD
   implement a DNS proxy, following the recommendations of [RFC5625].

   To support a security-aware resolver behind the B4 element, the DNS
   proxy in the B4 element must also be security-aware.  Details can be
   found in [RFC4033] Section 6.

5.6.  Interface Initialization

   A B4 element can be implemented in a host or CPE in conjunction with
   other technologies such as native dual-stack.  The host or CPE
   SHOULD start only one technology during initialization.  For
   example, if the CPE chooses to start in native dual-stack mode, it
   SHOULD NOT initialize the B4 element.  This selection process is
   out-of-scope for this document.

5.7.  Well-known IPv4 address

   Any locally unique IPv4 address could be configured on the
   IPv4-in-IPv6 tunnel to represent the B4 element.  Configuring such an
   address is often necessary when the B4 element is sourcing IPv4
   datagrams directly over the tunnel.  In order to avoid conflicts with
   any other address, IANA has defined a well-known range, 192.0.0.0/29.

   192.0.0.0 is the reserved subnet address.  192.0.0.1 is reserved for
   the AFTR element and 192.0.0.2 is reserved for the B4 element.  If a
   service provider has a special configuration which prevents the B4
   element from using 192.0.0.2, the B4 element MAY use any other
   address within the 192.0.0.0/29 range.

   Note: a range of addresses has been reserved for this purpose.  The
   intent is to accommodate nodes implementing multiple B4 elements.

6.  AFTR element

6.1.  Definition

   An AFTR element is the combination of an IPv4-in-IPv6 tunnel
   end-point and an IPv4-IPv4 NAT implemented on the same node.

6.2.  Encapsulation

   The tunnel is a point-to-multipoint IPv4-in-IPv6 tunnel ending at the
   B4 elements.

   See Section 7.1 for additional tunneling considerations.

   Note: at this point, DS-Lite only defines IPv4-in-IPv6 tunnels;
   however, other types of encapsulation could be defined in the future.

6.3.  Fragmentation and Reassembly

   As noted previously, fragmentation and reassembly need to be taken
   care of by the tunnel end-points.  As such, the AFTR MUST perform
   fragmentation and reassembly if the underlying link MTU cannot
   accommodate the encapsulation overhead.  Fragmentation MUST happen
   after the encapsulation, on the IPv6 packet.  Reassembly MUST happen
   before the decapsulation of the IPv6 header.  The detailed procedure
   is specified in [RFC2473] Section 7.2.

   Fragmentation at the Tunnel Entry-Point is a light-weight operation.
   In contrast, reassembly at the Tunnel Exit-Point can be expensive.
   When the Tunnel Exit-Point receives the first fragment of a packet,
   it must wait for the second fragment to arrive in order to
   reassemble the original IPv6 packet for decapsulation.  This
   requires the Tunnel Exit-Point to buffer and keep track of fragmented
   packets.  Consider that the AFTR is the Tunnel Exit-Point for many
   tunnels.  If many devices simultaneously source a large number of
   fragmented packets through the AFTR to its managed B4 elements, the
   AFTR will have to buffer them and consume enormous resources to
   keep track of the flows.  This reassembly process will significantly
   impact the AFTR performance.  However, this impact only happens when
   many clients simultaneously source large IPv4 packets.  We believe
   that the majority of clients will receive large IPv4 packets (for
   example, when watching video streams) rather than source them (for
   example, when sourcing video streams), so reassembly should be only
   a fraction of the AFTR's overall workload.

   When the AFTR's available resources fall below a pre-defined
   threshold, it SHOULD generate a notification to the administrator
   before the resources are completely exhausted.  The threshold and
   notification procedures are implementation dependent and are
   out-of-scope for this document.

   Methods to avoid fragmentation, such as rewriting the TCP MSS option
   or using technologies such as the Subnetwork Encapsulation and
   Adaptation Layer defined in [RFC5320], are out of scope for this
   document.

6.4.  DNS

   As noted previously, a DS-Lite node implementing a B4 element will
   perform DNS resolution over IPv6.  As a result, DNS packets are not
   expected to go through the AFTR element.

6.5.  Well-known IPv4 address

   The AFTR SHOULD use the well-known IPv4 address 192.0.0.1 reserved by
   IANA to configure the IPv4-in-IPv6 tunnel.  That address can then be
   used to report ICMP problems and will appear in traceroute outputs.
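
   The fragmentation rule of Sections 5.3 and 6.3 can be illustrated
   with a small calculation.  The sketch below is not part of the
   specification; the function name tunnel_fragments is hypothetical.
   It assumes the 40-byte IPv6 header mentioned in Section 5.3, plus
   the 8-byte IPv6 Fragment header and the IPv6 rule that every
   fragment payload except the last is a multiple of 8 bytes.  The
   inner IPv4 packet is never split as an IPv4 packet; only the
   encapsulating IPv6 packet is fragmented.

```python
IPV6_HEADER = 40      # fixed IPv6 header added by the tunnel, in bytes
FRAGMENT_HEADER = 8   # IPv6 Fragment extension header, in bytes


def tunnel_fragments(ipv4_len, link_mtu):
    """Return the on-wire sizes of the IPv6 packets that carry one
    IPv4 packet of ipv4_len bytes over a link with the given MTU."""
    # Case 1: the encapsulated packet fits, so no fragmentation.
    if ipv4_len + IPV6_HEADER <= link_mtu:
        return [ipv4_len + IPV6_HEADER]

    # Case 2: fragment AFTER encapsulation.  Each fragment carries the
    # IPv6 header, a Fragment header, and a slice of the inner packet
    # whose length is a multiple of 8 (except for the last slice).
    chunk = (link_mtu - IPV6_HEADER - FRAGMENT_HEADER) // 8 * 8
    if chunk <= 0:
        raise ValueError("link MTU too small to carry any payload")

    sizes = []
    remaining = ipv4_len
    while remaining > 0:
        part = min(chunk, remaining)
        sizes.append(IPV6_HEADER + FRAGMENT_HEADER + part)
        remaining -= part
    return sizes
```

   Under these assumptions, a 1500-byte IPv4 packet sent over a link
   with a 1500-byte MTU is carried in two IPv6 packets of 1496 and 100
   bytes, whereas raising the link MTU to 1540 bytes lets it travel in
   a single 1540-byte packet, which is the option described in
   Section 5.3.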

6.6.  Extended binding table

   The NAT binding table of the AFTR element is extended to include the
   source IPv6 address of the incoming packets.  This IPv6 address is
   used to disambiguate between the overlapping IPv4 address spaces of
   the service provider's customers.

   By doing a reverse look-up in the extended IPv4 NAT binding table,
   the AFTR knows how to reconstruct the IPv6 encapsulation when
   packets come back from the Internet.  That way, there is no need to
   keep a static configuration for each tunnel.

7.  Network Considerations

7.1.  Tunneling

   Tunneling MUST be done in accordance with [RFC2473] and [RFC4213].
   Traffic classes ([RFC2474]) from the IPv4 headers MUST be carried
   over to the IPv6 headers and vice versa.

7.2.  Multicast considerations

   Multicast is out-of-scope in this document.

8.  NAT considerations

8.1.  NAT pool

   The AFTR MAY be provisioned with different NAT pools.  The address
   ranges in the pools may be disjoint but MUST NOT overlap.
   Operators may implement policies in the AFTR to assign clients to
   different pools.  For example, an AFTR can have two interfaces, each
   with a disjoint NAT pool assigned to it.  In another case, a policy
   can be applied to the AFTR so that one set of B4s uses NAT pool 1
   and a different set of B4s uses NAT pool 2.

8.2.  NAT conformance

   A Dual-Stack Lite AFTR MUST implement behavior conforming to the best
   current practice, currently documented in [RFC4787], [RFC5508], and
   [RFC5382].  More discussions about carrier-grade NATs can be found in
   [I-D.ietf-behave-lsn-requirements].

8.3.  Application Level Gateways (ALG)

   The AFTR performs NAT-44 and inherits the limitations of NAT.  Some
   protocols require ALGs in the NAT device in order to traverse the
   NAT.  For example, Active FTP requires an ALG to work properly.
   ALGs consume resources and there are many different types of ALGs.
   The AFTR is a shared network device that supports a large number of
   B4 elements.  It is impossible for the AFTR to implement every
   current and future ALG.

8.4.  Sharing global IPv4 addresses

   The AFTR shares a single IPv4 address among multiple users.  This
   helps to increase IPv4 address utilization.  However, it also raises
   some issues, such as logging and lawful intercept.  More
   considerations on sharing the port space of IPv4 addresses can be
   found in [I-D.ietf-intarea-shared-addressing-issues].

8.5.  Port forwarding / Keep alive

   The PCP working group in the IETF is standardizing a control plane
   for the carrier-grade NAT [I-D.ietf-behave-lsn-requirements].  The
   Port Control Protocol (PCP) enables applications to directly
   negotiate with the NAT to open ports and negotiate lifetime values to
   avoid keep-alive traffic.  More on PCP can be found in
   [I-D.ietf-pcp-base].

9.  Acknowledgements

   The authors would like to acknowledge the role of Mark Townsley for
   his input on the overall architecture of this technology by pointing
   this work in the direction of [I-D.droms-softwires-snat].  Note that
   this document results from a merging of [I-D.durand-dual-stack-lite]
   and [I-D.droms-softwires-snat].  Also to be acknowledged are the many
   discussions with a number of people including Shin Miyakawa,
   Katsuyasu Toyama, Akihide Hiura, Takashi Uematsu, Tetsutaro Hara,
   Yasunori Matsubayashi, and Ichiro Mizukoshi.  The authors would also
   like to thank David Ward, Jari Arkko, Thomas Narten and Geoff Huston
   for their constructive feedback.  Special thanks go to Dave Thaler
   and Dan Wing for their reviews and comments.

10.  IANA Considerations

   This draft requests IANA to allocate a well-known IPv4 network
   prefix, 192.0.0.0/29.  That range is used to number the Dual-Stack
   Lite interfaces.  Reserving a /29 allows for 6 possible interfaces on
   a multi-homed node.
   The IPv4 address 192.0.0.1 is reserved as the IPv4 address of the
   default router for such Dual-Stack Lite hosts.

11.  Security Considerations

   Security issues associated with NAT have long been documented.  See
   [RFC2663] and [RFC2993].

   However, moving the NAT functionality from the CPE to the core of the
   service provider network and sharing IPv4 addresses among customers
   create additional requirements when logging data for abuse usage.
   With any architecture where an IPv4 address does not uniquely
   represent an end host, an IPv4 address and a timestamp are no longer
   sufficient to identify a particular broadband customer.  The AFTR
   should have the capability to log the tunnel-id, protocol, ports/IP
   addresses, and the creation time of the NAT binding to uniquely
   identify the user sessions.  Exact details of what is logged are
   implementation specific and out of scope for this document.

   The AFTR performs translation functions for interior IPv4 hosts using
   RFC 1918 addresses or the IANA reserved address range (TBA by IANA).
   In some circumstances, an ISP may provision policies in the AFTR and
   instruct the AFTR to bypass translation functions based on . When the
   AFTR receives a packet from an interior host that matches the policy,
   the AFTR can simply forward it without translation.  The address,
   port and protocol information must be provisioned on the AFTR before
   the packet is received.  The provisioning mechanism is out-of-scope
   of this specification.

   When decapsulating packets, the AFTR MUST only forward packets
   sourced by RFC 1918 addresses, the IANA reserved address range, or
   any other out-of-band pre-authorized addresses.  The AFTR MUST drop
   all other packets.
   This prevents rogue devices from launching denial of service attacks
   using unauthorized public IPv4 addresses in the IPv4 source header
   field or an unauthorized transport port range in the IPv4 transport
   header field.  For example, rogue devices could bombard a public web
   server by launching a TCP SYN flooding attack [RFC4987].  The victim
   will receive TCP SYNs from random IPv4 source addresses at a rapid
   rate and deny TCP services to legitimate users.

   With IPv4 addresses shared by multiple users, ports become a critical
   resource.  As such, some mechanisms need to be put in place by an
   AFTR to limit port usage, either by rate-limiting new connections or
   by putting a hard limit on the maximum number of ports usable by a
   single user.  If this number is high enough, it should not interfere
   with normal usage and still provide reasonable protection of the
   shared pool.  More considerations on sharing IPv4 addresses can be
   found in [I-D.ietf-intarea-shared-addressing-issues].  Other
   considerations and recommendations on logging can be found in
   [I-D.ietf-intarea-server-logging-recommendations].

   AFTRs should support ways to limit service only to registered
   customers.  One simple option is to implement an IPv6 ingress filter
   on the AFTR's tunnel interface to accept only the IPv6 address range
   defined in the filter.

12.  References

12.1.  Normative references

   [I-D.ietf-softwire-ds-lite-tunnel-option]
              Hankins, D. and T. Mrugalski, "Dynamic Host Configuration
              Protocol for IPv6 (DHCPv6) Option for Dual-Stack Lite",
              draft-ietf-softwire-ds-lite-tunnel-option-10 (work in
              progress), March 2011.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2473]  Conta, A. and S. Deering, "Generic Packet Tunneling in
              IPv6 Specification", RFC 2473, December 1998.

   [RFC2474]  Nichols, K., Blake, S., Baker, F., and D.
              Black, "Definition of the Differentiated Services Field
              (DS Field) in the IPv4 and IPv6 Headers", RFC 2474,
              December 1998.

   [RFC4213]  Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms
              for IPv6 Hosts and Routers", RFC 4213, October 2005.

   [RFC5625]  Bellis, R., "DNS Proxy Implementation Guidelines",
              BCP 152, RFC 5625, August 2009.

12.2.  Informative references

   [I-D.droms-softwires-snat]
              Droms, R. and B. Haberman, "Softwires Network Address
              Translation (SNAT)", draft-droms-softwires-snat-01 (work
              in progress), July 2008.

   [I-D.durand-dual-stack-lite]
              Durand, A., "Dual-stack lite broadband deployments post
              IPv4 exhaustion", draft-durand-dual-stack-lite-00 (work in
              progress), July 2008.

   [I-D.ietf-behave-lsn-requirements]
              Perreault, S., Yamagata, I., Miyakawa, S., Nakagawa, A.,
              and H. Ashida, "Common requirements for IP address sharing
              schemes", draft-ietf-behave-lsn-requirements-01 (work in
              progress), March 2011.

   [I-D.ietf-intarea-server-logging-recommendations]
              Durand, A., Gashinsky, I., Lee, D., and S. Sheppard,
              "Logging recommendations for Internet facing servers",
              draft-ietf-intarea-server-logging-recommendations-04 (work
              in progress), April 2011.

   [I-D.ietf-intarea-shared-addressing-issues]
              Ford, M., Boucadair, M., Durand, A., Levis, P., and P.
              Roberts, "Issues with IP Address Sharing",
              draft-ietf-intarea-shared-addressing-issues-05 (work in
              progress), March 2011.

   [I-D.ietf-pcp-base]
              Wing, D., Cheshire, S., Boucadair, M., and R. Penno, "Port
              Control Protocol (PCP)", draft-ietf-pcp-base-10 (work in
              progress), April 2011.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              November 1990.

   [RFC1918]  Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and
              E. Lear, "Address Allocation for Private Internets",
              BCP 5, RFC 1918, February 1996.

   [RFC2663]  Srisuresh, P. and M.
              Holdrege, "IP Network Address Translator (NAT) Terminology
              and Considerations", RFC 2663, August 1999.

   [RFC2993]  Hain, T., "Architectural Implications of NAT", RFC 2993,
              November 2000.

   [RFC4033]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
              Rose, "DNS Security Introduction and Requirements",
              RFC 4033, March 2005.

   [RFC4787]  Audet, F. and C. Jennings, "Network Address Translation
              (NAT) Behavioral Requirements for Unicast UDP", BCP 127,
              RFC 4787, January 2007.

   [RFC4987]  Eddy, W., "TCP SYN Flooding Attacks and Common
              Mitigations", RFC 4987, August 2007.

   [RFC5320]  Templin, F., "The Subnetwork Encapsulation and Adaptation
              Layer (SEAL)", RFC 5320, February 2010.

   [RFC5382]  Guha, S., Biswas, K., Ford, B., Sivakumar, S., and P.
              Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
              RFC 5382, October 2008.

   [RFC5508]  Srisuresh, P., Ford, B., Sivakumar, S., and S. Guha, "NAT
              Behavioral Requirements for ICMP", BCP 148, RFC 5508,
              April 2009.

   [RFC5571]  Storer, B., Pignataro, C., Dos Santos, M., Stevant, B.,
              Toutain, L., and J. Tremblay, "Softwire Hub and Spoke
              Deployment Framework with Layer Two Tunneling Protocol
              Version 2 (L2TPv2)", RFC 5571, June 2009.

Appendix A.  Deployment considerations

A.1.  AFTR service distribution and horizontal scaling

   One of the key benefits of the Dual-Stack Lite technology lies in the
   fact that it is a tunnel-based solution.  As such, tunnel end-points
   can be anywhere in the service provider network.

   Using the DHCPv6 tunnel end-point option
   [I-D.ietf-softwire-ds-lite-tunnel-option], service providers can
   create groups of users sharing the same AFTR.  Those groups can be
   merged or divided at will.  This leads to a horizontally scaled
   solution, where more capacity is added simply by adding more AFTRs.
   As those groups of users can evolve over time, it is best to make
   sure that AFTRs do not require per-user configuration in order to
   provide service.

A.2.  Horizontal scaling

   A service provider can start using just a few centralized AFTRs.
   Later, when more capacity is needed, more AFTRs can be added and
   pushed closer to the edges of the access network.  In case of a spike
   of traffic, for example during the Olympic games or an important
   political event, capacity can be quickly added in any location of the
   network (tunnels can terminate anywhere) simply by splitting user
   groups.  Extra capacity can be later removed when the traffic returns
   to normal, and users can be switched back to their original AFTRs by
   resetting the DHCPv6 tunnel end-point settings.

A.3.  High availability

   An important element in the design of the Dual-Stack Lite technology
   is the simplicity of implementation on the customer side.  A simple
   IPv4-in-IPv6 tunnel and a default route over it in the B4 element are
   all that is needed to get IPv4 connectivity.  Dealing with high
   availability is the responsibility of the service provider, not the
   customer devices implementing Dual-Stack Lite.  As such, a single
   IPv6 address of the tunnel end-point is provided in the DHCPv6 option
   defined in [I-D.ietf-softwire-ds-lite-tunnel-option].  The service
   provider can use techniques such as anycast or various types of
   clusters to ensure availability of the IPv4 service.  The exact
   synchronization (or lack thereof) between redundant AFTRs is out of
   scope for this document.

A.4.  Logging

   A DS-Lite AFTR implementation should offer the functionality to log
   NAT binding creations or other ways to keep track of the ports/IP
   addresses used by customers.
   This is both to support troubleshooting, which is very important to
   service providers trying to figure out why something may not be
   working, and to meet region-specific requirements for responding to
   legally binding requests for information from law enforcement
   authorities.

Appendix B.  Examples

B.1.  Gateway based architecture

   This architecture is targeted at residential broadband deployments
   but can be adapted easily to other types of deployment where the
   installed base of IPv4-only devices is important.

   Consider a scenario where a Dual-Stack Lite CPE is provisioned only
   with IPv6 on the WAN port, no IPv4.  The CPE acts as an IPv4 DHCP
   server for the LAN network (wireline and wireless), handing out
   [RFC1918] addresses.  In addition, the CPE may support IPv6
   Auto-Configuration and/or a DHCPv6 server for the LAN network.  When
   an IPv4-only device connects to the CPE, that CPE will hand out a
   [RFC1918] address to the device.  When a dual-stack capable device
   connects to the CPE, that CPE will hand out a [RFC1918] address and a
   global IPv6 address to the device.  In addition, the CPE will create
   an IPv4-in-IPv6 softwire tunnel [RFC5571] to an AFTR that resides in
   the service provider network.

   When the device accesses IPv6 service, it will send the IPv6 datagram
   to the CPE natively.  The CPE will route the traffic upstream to the
   IPv6 default gateway.

   When the device accesses IPv4 service, it will source the IPv4
   datagram with the [RFC1918] address and send the IPv4 datagram to the
   CPE.  The CPE will encapsulate the IPv4 datagram inside the
   IPv4-in-IPv6 softwire tunnel and forward the IPv6 datagram to the
   AFTR.  This contrasts with what the CPE normally does today, which is
   to NAT the [RFC1918] address to the public IPv4 address and route the
   datagram upstream.
   When the AFTR receives the IPv6 datagram, it will remove the IPv6
   header to decapsulate the IPv4 datagram and perform an IPv4-to-IPv4
   NAT on the source address.

   As illustrated in Figure 1, this Dual-Stack Lite deployment model
   consists of three components: the Dual-Stack Lite home router with a
   B4 element, the AFTR, and a softwire between the B4 element acting as
   softwire initiator (SI) [RFC5571] in the Dual-Stack Lite home router
   and the softwire concentrator (SC) [RFC5571] in the AFTR.  The AFTR
   performs IPv4-IPv4 NAT translations to multiplex multiple subscribers
   through a pool of global IPv4 addresses.  Overlapping address spaces
   used by subscribers are disambiguated through the identification of
   tunnel endpoints.

                   +-----------+
                   |    Host   |
                   +-----+-----+
                         |10.0.0.1
                         |
                         |
                         |10.0.0.2
               +---------|---------+
               |         |         |
               |    Home router    |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                        |||2001:db8:0:1::1
                        |||
                        |||<-IPv4-in-IPv6 softwire
                        |||
                 -------|||-------
               /        |||        \
              |   ISP core network  |
               \        |||        /
                 -------|||-------
                        |||
                        |||2001:db8:0:2::1
               +--------|||--------+
               |        AFTR       |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                         |192.0.2.1
                         |
                 --------|--------
               /         |         \
              |       Internet      |
               \         |         /
                 --------|--------
                         |
                         |198.51.100.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                Figure 1: gateway-based architecture

   Notes:

   o  The Dual-Stack Lite home router is not required to be on the same
      link as the host

   o  The Dual-Stack Lite home router could be replaced by a Dual-Stack
      Lite router in the service provider network

   The resulting solution accepts an IPv4 datagram that is translated
   into an IPv4-in-IPv6 softwire datagram for transmission across the
   softwire.
   At the corresponding endpoint, the IPv4 datagram is decapsulated, and
   the translated IPv4 address is inserted based on a translation from
   the softwire.

B.1.1.  Example message flow

   In the example shown in Figure 2, the translation table in the AFTR
   is configured to forward between IP/TCP (10.0.0.1/10000) and IP/TCP
   (192.0.2.1/5000).  That is, a datagram received by the Dual-Stack
   Lite home router from the host at address 10.0.0.1, using TCP SRC
   port 10000, will be translated to a datagram with IP SRC address
   192.0.2.1 and TCP SRC port 5000 in the Internet.

                   +-----------+
                   |    Host   |
                   +-----+-----+
                      |  |10.0.0.1
      IPv4 datagram 1 |  |
                      |  |
                      v  |10.0.0.2
               +---------|---------+
               |         |         |
               |    home router    |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                      | |||2001:db8:0:1::1
      IPv6 datagram 2 | |||
                      | |||<-IPv4-in-IPv6 softwire
                      | |||
                 -----|-|||-------
               /      | |||        \
              |   ISP core network  |
               \      | |||        /
                 -----|-|||-------
                      | |||
                      | |||2001:db8:0:2::1
               +------|-|||--------+
               |      | AFTR       |
               |      v |||        |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                      |  |192.0.2.1
      IPv4 datagram 3 |  |
                      |  |
                 -----|--|--------
               /      |  |         \
              |       Internet      |
               \      |  |         /
                 -----|--|--------
                      |  |
                      v  |198.51.100.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                    Figure 2: Outbound Datagram

   +-----------------+--------------+-----------------+
   | Datagram        | Header field | Contents        |
   +-----------------+--------------+-----------------+
   | IPv4 datagram 1 | IPv4 Dst     | 198.51.100.1    |
   |                 | IPv4 Src     | 10.0.0.1        |
   |                 | TCP Dst      | 80              |
   |                 | TCP Src      | 10000           |
   | --------------- | ------------ | -------------   |
   | IPv6 datagram 2 | IPv6 Dst     | 2001:db8:0:2::1 |
   |                 | IPv6 Src     | 2001:db8:0:1::1 |
   |                 | IPv4 Dst     | 198.51.100.1    |
   |                 | IPv4 Src     | 10.0.0.1        |
   |                 | TCP Dst      | 80              |
   |                 | TCP Src      | 10000           |
   | --------------- | ------------ | -------------   |
   | IPv4 datagram 3 | IPv4 Dst     | 198.51.100.1    |
   |                 | IPv4 Src     | 192.0.2.1       |
   |                 | TCP Dst      | 80              |
   |                 | TCP Src      | 5000            |
   +-----------------+--------------+-----------------+

                      Datagram header contents

   When datagram 1 is received by the Dual-Stack Lite home router, the
   B4 element encapsulates the datagram in datagram 2 and forwards it to
   the Dual-Stack Lite carrier-grade NAT over the softwire.

   When it receives datagram 2, the tunnel concentrator in the AFTR
   forwards the IPv4 datagram to the NAT, which determines from its NAT
   table that the datagram received on the softwire with TCP SRC port
   10000 should be translated to datagram 3 with IP SRC address
   192.0.2.1 and TCP SRC port 5000.

   Figure 3 shows an inbound message received at the AFTR.  When the NAT
   function in the AFTR receives datagram 1, it looks up the IP/TCP DST
   in its translation table.  In the example in Figure 3, the NAT
   changes the TCP DST port to 10000, sets the IP DST address to
   10.0.0.1, and forwards the datagram to the softwire.  The B4 in the
   home router decapsulates the IPv4 datagram from the inbound softwire
   datagram and forwards it to the host.
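
   The outbound and inbound translations described above can be
   sketched in a few lines of Python.  This is only an illustration
   under stated assumptions, not part of the specification: the names
   Datagram, BINDINGS, REVERSE, translate_outbound, and
   translate_inbound are hypothetical, the single binding is copied
   from the example, and a real AFTR would create bindings dynamically
   and handle timeouts, other protocols, and ICMP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datagram:
    """Only the four header fields used in the example tables."""
    src: str
    sport: int
    dst: str
    dport: int


# Hypothetical static binding copied from the example:
# (softwire-id, inner IPv4 src, protocol, src port) -> (public IPv4, port)
BINDINGS = {
    ("2001:db8:0:1::1", "10.0.0.1", "TCP", 10000): ("192.0.2.1", 5000),
}

# Reverse index used for inbound lookups:
# (public IPv4, public port, protocol) -> internal key
REVERSE = {pub + (key[2],): key for key, pub in BINDINGS.items()}


def translate_outbound(softwire_id, d, proto="TCP"):
    """Rewrite the source address/port of a datagram taken off a softwire."""
    pub_ip, pub_port = BINDINGS[(softwire_id, d.src, proto, d.sport)]
    return Datagram(pub_ip, pub_port, d.dst, d.dport)


def translate_inbound(d, proto="TCP"):
    """Rewrite the destination address/port and pick the softwire to use."""
    softwire_id, inner_ip, _, inner_port = REVERSE[(d.dst, d.dport, proto)]
    return softwire_id, Datagram(d.src, d.sport, inner_ip, inner_port)


# Outbound: inner datagram of "IPv6 datagram 2" -> "IPv4 datagram 3"
outbound = translate_outbound(
    "2001:db8:0:1::1", Datagram("10.0.0.1", 10000, "198.51.100.1", 80))

# Inbound: "IPv4 datagram 1" -> softwire and inner datagram of datagram 2
softwire, inbound = translate_inbound(
    Datagram("198.51.100.1", 80, "192.0.2.1", 5000))
```

   Keying the outbound lookup on the softwire identifier as well as the
   inner IPv4 address is what lets different subscribers reuse the same
   [RFC1918] addresses.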

                   +-----------+
                   |    Host   |
                   +-----+-----+
                      ^  |10.0.0.1
      IPv4 datagram 3 |  |
                      |  |
                      |  |10.0.0.2
               +---------|---------+
               |       +-+-+       |
               |    home router    |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                      ^ |||2001:db8:0:1::1
      IPv6 datagram 2 | |||
                      | |||<-IPv4-in-IPv6 softwire
                      | |||
                 -----|-|||-------
               /      | |||        \
              |   ISP core network  |
               \      | |||        /
                 -----|-|||-------
                      | |||
                      | |||2001:db8:0:2::1
               +------|-|||--------+
               |        AFTR       |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                      ^  |192.0.2.1
      IPv4 datagram 1 |  |
                      |  |
                 -----|--|--------
               /      |  |         \
              |       Internet      |
               \      |  |         /
                 -----|--|--------
                      |  |
                      |  |198.51.100.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                    Figure 3: Inbound Datagram

   +-----------------+--------------+-----------------+
   | Datagram        | Header field | Contents        |
   +-----------------+--------------+-----------------+
   | IPv4 datagram 1 | IPv4 Dst     | 192.0.2.1       |
   |                 | IPv4 Src     | 198.51.100.1    |
   |                 | TCP Dst      | 5000            |
   |                 | TCP Src      | 80              |
   | --------------- | ------------ | -------------   |
   | IPv6 datagram 2 | IPv6 Dst     | 2001:db8:0:1::1 |
   |                 | IPv6 Src     | 2001:db8:0:2::1 |
   |                 | IPv4 Dst     | 10.0.0.1        |
   |                 | IPv4 Src     | 198.51.100.1    |
   |                 | TCP Dst      | 10000           |
   |                 | TCP Src      | 80              |
   | --------------- | ------------ | -------------   |
   | IPv4 datagram 3 | IPv4 Dst     | 10.0.0.1        |
   |                 | IPv4 Src     | 198.51.100.1    |
   |                 | TCP Dst      | 10000           |
   |                 | TCP Src      | 80              |
   +-----------------+--------------+-----------------+

                      Datagram header contents

B.1.2.  Translation details

   The AFTR has a NAT that translates between softwire/port pairs and
   IPv4-address/port pairs.
   The same translation is applied to IPv4 datagrams received on the
   device's external interface and from the softwire endpoint in the
   device.

   In Figure 2, the translator network interface in the AFTR is on the
   Internet, and the softwire interface connects to the Dual-Stack Lite
   home router.  The AFTR translator is configured as follows:

   Network interface:  Translate IPv4 destination address and TCP
      destination port to the softwire identifier and TCP destination
      port

   Softwire interface:  Translate softwire identifier and TCP source
      port to IPv4 source address and TCP source port

   Here is how the translation in Figure 3 works:

   o  Datagram 1 is received on the AFTR translator network interface.
      The translator looks up the IPv4-address/port pair in its
      translator table, rewrites the IPv4 destination address to
      10.0.0.1 and the TCP destination port to 10000, and forwards the
      datagram to the softwire.

   o  Datagram 2 is received on the B4 of the Dual-Stack Lite home
      router.  The B4 function extracts the IPv4 datagram, and the
      Dual-Stack Lite home router forwards datagram 3 to the host.

   +------------------------------------+--------------------+
   | Softwire-Id/IPv4/Prot/Port         | IPv4/Prot/Port     |
   +------------------------------------+--------------------+
   | 2001:db8:0:1::1/10.0.0.1/TCP/10000 | 192.0.2.1/TCP/5000 |
   +------------------------------------+--------------------+

         Dual-Stack Lite carrier-grade NAT translation table

   The Softwire-Id is the IPv6 address assigned to the Dual-Stack Lite
   CPE.  Hosts behind the same Dual-Stack Lite home router have the same
   Softwire-Id.  The source IPv4 address is the [RFC1918] address
   assigned by the Dual-Stack Lite home router, which is unique to each
   host behind the CPE.  The AFTR would receive packets sourced from
   different IPv4 addresses in the same softwire tunnel.
   The AFTR combines the Softwire-Id and IPv4 address/Port
   [Softwire-Id, IPv4+Port] to uniquely identify each host behind the
   same Dual-Stack Lite home router.

B.2.  Host based architecture

   This architecture is targeted at new, large-scale deployments of
   dual-stack capable devices implementing a Dual-Stack Lite interface.

   Consider a scenario where a Dual-Stack Lite host device is directly
   connected to the service provider network.  The host device is
   dual-stack capable but is provisioned only with a global IPv6
   address.  In addition, the host device will pre-configure a
   well-known IPv4 non-routable address (see IANA section).  This
   well-known IPv4 non-routable address is similar to the 127.0.0.1
   loopback address.  Every host device implementing Dual-Stack Lite
   will pre-configure the same address.  This address will be used to
   source the IPv4 datagram when the device accesses IPv4 services.  The
   host device will also create an IPv4-in-IPv6 softwire tunnel to an
   AFTR.  The carrier-grade NAT will reside in the service provider
   network.

   When the device accesses IPv6 service, the device will send the IPv6
   datagram natively to the default gateway.

   When the device accesses IPv4 service, it will source the IPv4
   datagram with the well-known non-routable IPv4 address.  Then, the
   host device will encapsulate the IPv4 datagram inside the
   IPv4-in-IPv6 softwire tunnel and send the IPv6 datagram to the AFTR.
   When the AFTR receives the IPv6 datagram, it will remove the IPv6
   header to decapsulate the IPv4 datagram and perform IPv4-to-IPv4 NAT
   on the source address.

   This scenario works on both wireline and wireless networks.  A
   typical wireless device will connect directly to the service provider
   without a CPE in between.
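
   The encapsulation step described above, in which the B4 places an
   IPv4 datagram inside an IPv6 datagram addressed to the AFTR, can be
   illustrated with a short Python sketch.  It is a simplification
   under stated assumptions: the function names and the hop limit of 64
   are hypothetical, traffic class and flow label are left at zero, and
   fragmentation, extension headers, and MTU handling are ignored.  The
   Next Header value 4 is the IANA protocol number for IPv4
   encapsulation.

```python
import ipaddress
import struct

IPV4_IN_IPV6 = 4        # IANA protocol number for IPv4 encapsulation
IPV6_HEADER_LEN = 40    # fixed IPv6 header size in bytes


def encapsulate(ipv4_packet: bytes, b4_addr: str, aftr_addr: str,
                hop_limit: int = 64) -> bytes:
    """Prepend a fixed IPv6 header (B4 -> AFTR) to an IPv4 datagram."""
    first_word = 6 << 28  # version 6, traffic class 0, flow label 0
    header = struct.pack("!IHBB", first_word, len(ipv4_packet),
                         IPV4_IN_IPV6, hop_limit)
    header += ipaddress.IPv6Address(b4_addr).packed
    header += ipaddress.IPv6Address(aftr_addr).packed
    return header + ipv4_packet


def decapsulate(ipv6_packet: bytes):
    """Return (IPv6 source address, inner IPv4 datagram)."""
    first_word, payload_len, next_header, _hop = struct.unpack(
        "!IHBB", ipv6_packet[:8])
    if first_word >> 28 != 6 or next_header != IPV4_IN_IPV6:
        raise ValueError("not an IPv4-in-IPv6 datagram")
    source = str(ipaddress.IPv6Address(ipv6_packet[8:24]))
    inner = ipv6_packet[IPV6_HEADER_LEN:IPV6_HEADER_LEN + payload_len]
    return source, inner


# A dummy 20-byte IPv4 header stands in for a real datagram here.
inner_datagram = b"\x45" + bytes(19)
tunnel_packet = encapsulate(inner_datagram,
                            "2001:db8:0:1::1", "2001:db8:0:2::1")
softwire_id, recovered = decapsulate(tunnel_packet)
```

   The IPv6 source address recovered at the AFTR is the value that the
   examples in this appendix call the Softwire-Id.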

   As illustrated in Figure 4, this Dual-Stack Lite deployment model
   consists of three components: the Dual-Stack Lite host, the AFTR, and
   a softwire between the softwire initiator B4 in the host and the
   softwire concentrator in the AFTR.  The Dual-Stack Lite host is
   assumed to have IPv6 service and can exchange IPv6 traffic with the
   AFTR.

   The AFTR performs IPv4-IPv4 NAT translations to multiplex multiple
   subscribers through a pool of global IPv4 addresses.  Overlapping
   IPv4 address spaces used by the Dual-Stack Lite hosts are
   disambiguated through the identification of tunnel endpoints.

   In this situation, the Dual-Stack Lite host configures the IPv4
   address 192.0.0.2 out of the well-known range 192.0.0.0/29 (defined
   by IANA) on its B4 interface.  It also configures the first
   non-reserved IPv4 address of the reserved range, 192.0.0.1, as the
   address of its default gateway.

               +-------------------+
               |                   |
               |  Host 192.0.0.2   |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                        |||2001:db8:0:1::1
                        |||
                        |||<-IPv4-in-IPv6 softwire
                        |||
                 -------|||-------
               /        |||        \
              |   ISP core network  |
               \        |||        /
                 -------|||-------
                        |||
                        |||2001:db8:0:2::1
               +--------|||--------+
               |        AFTR       |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                         |192.0.2.1
                         |
                 --------|--------
               /         |         \
              |       Internet      |
               \         |         /
                 --------|--------
                         |
                         |198.51.100.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                 Figure 4: host-based architecture

   The resulting solution accepts an IPv4 datagram that is translated
   into an IPv4-in-IPv6 softwire datagram for transmission across the
   softwire.
   At the corresponding endpoint, the IPv4 datagram is decapsulated, and
   the translated IPv4 address is inserted based on a translation from
   the softwire.

B.2.1.  Example message flow

   In the example shown in Figure 5, the translation table in the AFTR
   is configured to forward between IP/TCP (192.0.0.2/10000) and IP/TCP
   (192.0.2.1/5000).  That is, a datagram received from the host at
   address 192.0.0.2, using TCP SRC port 10000, will be translated to a
   datagram with IP SRC address 192.0.2.1 and TCP SRC port 5000 in the
   Internet.

               +-------------------+
               |                   |
               |  Host 192.0.0.2   |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                      | |||2001:db8:0:1::1
      IPv6 datagram 1 | |||
                      | |||<-IPv4-in-IPv6 softwire
                      | |||
                 -----|-|||-------
               /      | |||        \
              |   ISP core network  |
               \      | |||        /
                 -----|-|||-------
                      | |||
                      | |||2001:db8:0:2::1
               +------|-|||--------+
               |      | AFTR       |
               |      v |||        |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                      |  |192.0.2.1
      IPv4 datagram 2 |  |
                 -----|--|--------
               /      |  |         \
              |       Internet      |
               \      |  |         /
                 -----|--|--------
                      |  |
                      v  |198.51.100.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                    Figure 5: Outbound Datagram

   +-----------------+--------------+-----------------+
   | Datagram        | Header field | Contents        |
   +-----------------+--------------+-----------------+
   | IPv6 datagram 1 | IPv6 Dst     | 2001:db8:0:2::1 |
   |                 | IPv6 Src     | 2001:db8:0:1::1 |
   |                 | IPv4 Dst     | 198.51.100.1    |
   |                 | IPv4 Src     | 192.0.0.2       |
   |                 | TCP Dst      | 80              |
   |                 | TCP Src      | 10000           |
   | --------------- | ------------ | -------------   |
   | IPv4 datagram 2 | IPv4 Dst     | 198.51.100.1    |
   |                 | IPv4 Src     | 192.0.2.1       |
   |                 | TCP Dst      | 80              |
   |                 | TCP Src      | 5000            |
   +-----------------+--------------+-----------------+

                      Datagram header contents

   When sending an IPv4 packet, the Dual-Stack Lite host encapsulates it
   in datagram 1 and forwards it to the AFTR over the softwire.

   When it receives datagram 1, the concentrator in the AFTR hands the
   IPv4 datagram to the NAT, which determines from its translation table
   that the datagram received on the softwire with TCP SRC port 10000
   should be translated to datagram 2 with IP SRC address 192.0.2.1 and
   TCP SRC port 5000.

   Figure 6 shows an inbound message received at the AFTR.  When the NAT
   function in the AFTR receives datagram 1, it looks up the IP/TCP DST
   in its translation table.  In the example in Figure 6, the NAT
   translates the TCP DST port to 10000, sets the IP DST address to
   192.0.0.2, and forwards the datagram to the softwire.  The B4 in the
   Dual-Stack Lite host decapsulates the IPv4 datagram from the inbound
   softwire datagram and forwards it to the host.

               +-------------------+
               |                   |
               |  Host 192.0.0.2   |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                      ^ |||2001:db8:0:1::1
      IPv6 datagram 2 | |||
                      | |||<-IPv4-in-IPv6 softwire
                      | |||
                 -----|-|||-------
               /      | |||        \
              |   ISP core network  |
               \      | |||        /
                 -----|-|||-------
                      | |||
                      | |||2001:db8:0:2::1
               +------|-|||--------+
               |        AFTR       |
               |      | |||        |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                      ^  |192.0.2.1
      IPv4 datagram 1 |  |
                 -----|--|--------
               /      |  |         \
              |       Internet      |
               \      |  |         /
                 -----|--|--------
                      |  |
                      |  |198.51.100.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                    Figure 6: Inbound Datagram

   +-----------------+--------------+-----------------+
   | Datagram        | Header field | Contents        |
   +-----------------+--------------+-----------------+
   | IPv4 datagram 1 | IPv4 Dst     | 192.0.2.1       |
   |                 | IPv4 Src     | 198.51.100.1    |
   |                 | TCP Dst      | 5000            |
   |                 | TCP Src      | 80              |
   | --------------- | ------------ | -------------   |
   | IPv6 datagram 2 | IPv6 Dst     | 2001:db8:0:1::1 |
   |                 | IPv6 Src     | 2001:db8:0:2::1 |
   |                 | IPv4 Dst     | 192.0.0.2       |
   |                 | IPv4 Src     | 198.51.100.1    |
   |                 | TCP Dst      | 10000           |
   |                 | TCP Src      | 80              |
   +-----------------+--------------+-----------------+

                      Datagram header contents

B.2.2.  Translation details

   The translations happening in the AFTR are the same as in the
   previous examples.  The well-known IPv4 address 192.0.0.2, taken from
   the 192.0.0.0/29 range (defined by IANA) and used by all the hosts,
   is disambiguated by the IPv6 source address of the softwire.
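
   The disambiguation described above can be sketched as follows.  This
   is a toy model under stated assumptions rather than a description of
   any real AFTR: the AftrNat class, its bind method, the second
   softwire address 2001:db8:0:3::1, and the sequential port allocation
   starting at 5000 are all hypothetical, and port exhaustion, binding
   timeouts, and per-user port limits are not handled.

```python
class AftrNat:
    """Toy NAT whose bindings are keyed on the softwire identifier."""

    def __init__(self, public_ip, first_port=5000):
        self.public_ip = public_ip
        self.next_port = first_port   # assumption: sequential allocation
        self.table = {}

    def bind(self, softwire_id, ipv4_src, proto, port):
        """Return the (public IPv4, port) pair for an internal flow,
        creating the binding on first use."""
        key = (softwire_id, ipv4_src, proto, port)
        if key not in self.table:
            self.table[key] = (self.public_ip, self.next_port)
            self.next_port += 1
        return self.table[key]


nat = AftrNat("192.0.2.1")

# Two hosts use the same well-known source address and the same source
# port; only the IPv6 source address of their softwires differs.
host_a = nat.bind("2001:db8:0:1::1", "192.0.0.2", "TCP", 10000)
host_b = nat.bind("2001:db8:0:3::1", "192.0.0.2", "TCP", 10000)

# A repeated lookup for the first host reuses its existing binding.
host_a_again = nat.bind("2001:db8:0:1::1", "192.0.0.2", "TCP", 10000)
```

   Because the softwire identifier is part of the lookup key, the two
   hosts receive distinct external ports even though their inner IPv4
   address and port are identical.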

   +-------------------------------------+--------------------+
   | Softwire-Id/IPv4/Prot/Port          | IPv4/Prot/Port     |
   +-------------------------------------+--------------------+
   | 2001:db8:0:1::1/192.0.0.2/TCP/10000 | 192.0.2.1/TCP/5000 |
   +-------------------------------------+--------------------+

         Dual-Stack Lite carrier-grade NAT translation table

   The Softwire-Id is the IPv6 address assigned to the Dual-Stack Lite
   host.  Each host has a unique Softwire-Id.  The source IPv4 address
   is one of the well-known IPv4 addresses.  The AFTR could receive
   packets from different hosts sourced from the same well-known IPv4
   address but arriving over different softwire tunnels.  Similar to the
   gateway architecture, the AFTR combines the Softwire-Id and IPv4
   address/Port [Softwire-Id, IPv4+Port] to uniquely identify the
   individual host.

Authors' Addresses

   Alain Durand
   Juniper Networks
   1194 North Mathilda Avenue
   Sunnyvale, CA  94089-1206
   USA

   Email: adurand@juniper.net


   Ralph Droms
   Cisco
   1414 Massachusetts Avenue
   Boxborough, MA  01714
   USA

   Email: rdroms@cisco.com


   James Woodyatt
   Apple
   1 Infinite Loop
   Cupertino, CA  95014
   USA

   Email: jhw@apple.com


   Yiu L. Lee
   Comcast
   One Comcast Center
   Philadelphia, PA  19103
   USA

   Email: yiu_lee@cable.comcast.com