Internet Engineering Task Force                           A. Durand, Ed.
Internet-Draft                                          Juniper Networks
Intended status: Standards Track                           July 10, 2010
Expires: January 11, 2011

     Dual-Stack Lite Broadband Deployments Following IPv4 Exhaustion
                 draft-ietf-softwire-dual-stack-lite-05

Abstract

   This document revisits the dual-stack model and introduces the
   dual-stack lite technology aimed at better aligning the costs and
   benefits of deploying IPv6. Dual-stack lite enables a broadband
   service provider to share IPv4 addresses among customers by
   combining two well-known technologies: IP in IP (IPv4-in-IPv6) and
   Network Address Translation (NAT).

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 11, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document. Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Requirements language
   3.  Terminology
   4.  Deployment scenarios
     4.1.  Access model
     4.2.  Home gateway
     4.3.  Directly connected device
   5.  B4 element
     5.1.  Definition
     5.2.  Encapsulation
     5.3.  Fragmentation and Reassembly
     5.4.  AFTR discovery
     5.5.  DNS
     5.6.  Interface initialization
     5.7.  Well-known IPv4 address
   6.  AFTR element
     6.1.  Definition
     6.2.  Encapsulation
     6.3.  Fragmentation and Reassembly
     6.4.  DNS
     6.5.  Well-known IPv4 address
     6.6.  Extended binding table
   7.  Network Considerations
     7.1.  Tunneling
     7.2.  VPN
     7.3.  Multicast considerations
   8.  NAT considerations
     8.1.  NAT pool
     8.2.  NAT conformance
     8.3.  Application Level Gateways (ALG)
     8.4.  Port allocation
       8.4.1.  How many ports per customers?
       8.4.2.  Dynamic port assignment considerations
       8.4.3.  Subscriber controlled port assignment
     8.5.  Other considerations about sharing global IPv4 addresses
   9.  Acknowledgements
   10. IANA Considerations
   11. Security Considerations
   12. Author's Addresses
   13. Appendix A: Deployment considerations
     13.1.  AFTR service distribution and horizontal scaling
     13.2.  Horizontal scaling
     13.3.  High availability
     13.4.  Logging
   14. Appendix B: Examples
     14.1.  Gateway based architecture
       14.1.1.  Example message flow
       14.1.2.  Translation details
     14.2.  Host based architecture
       14.2.1.  Example message flow
       14.2.2.  Translation details
   15. Appendix C: Related DS-Lite work on port management
   16. References
     16.1.  Normative references
     16.2.  Informative references
   Author's Address

1.  Introduction

   The common thinking for more than 10 years has been that the
   transition to IPv6 will be based on the dual-stack model and that
   most things would be converted this way before we ran out of IPv4.

   However, this has not happened. The IANA free pool of IPv4 addresses
   will be depleted soon, well before any significant IPv6 deployment
   will have occurred.

   This document revisits the dual-stack model and introduces the
   dual-stack lite technology aimed at better aligning the costs and
   benefits of deploying IPv6. Dual-stack lite will provide the
   necessary bridge between the two protocols, offering an evolution
   path for the Internet after IANA IPv4 depletion.

   Dual-stack lite enables a broadband service provider to share IPv4
   addresses among customers by combining two well-known technologies:
   IP in IP (IPv4-in-IPv6) and NAT.

   This document makes a distinction between a dual-stack capable and a
   dual-stack provisioned device. The former is a device that has code
   that implements both IPv4 and IPv6, from the network layer to the
   applications. The latter is a similar device that has been
   provisioned with both an IPv4 and an IPv6 address on its
   interface(s). This document further refines this notion by
   distinguishing interfaces provisioned directly by the service
   provider from those provisioned by the customer.

   Pure IPv6-only devices (i.e. devices that do not include an IPv4
   stack) are outside of the scope of this document.

2.  Requirements language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Terminology

   The technology described in this document is known as dual-stack
   lite. The abbreviation DS-Lite is used throughout this text.

   This document also introduces two new terms: the DS-Lite Basic
   Bridging BroadBand element (B4) and the DS-Lite Address Family
   Transition Router element (AFTR).

4.  Deployment scenarios

4.1.  Access model

   Instead of relying on a cascade of NATs, the dual-stack lite model is
   built on IPv4-in-IPv6 tunnels that cross the network to reach a
   carrier-grade IPv4-IPv4 NAT (the AFTR) where customers will share
   IPv4 addresses. There are a number of benefits to this approach:

   o  This technology decouples the deployment of IPv6 in the service
      provider network (up to the customer premises equipment or CPE)
      from the deployment of IPv6 in the global Internet and in customer
      applications & devices.

   o  The management of the service provider access networks is
      simplified by leveraging the large IPv6 address space.
      Overlapping private IPv4 address spaces are not required to
      support very large customer bases.

   o  As tunnels can terminate anywhere in the service provider network,
      this architecture lends itself to horizontal scaling and provides
      great flexibility to adapt to changing traffic load.

   o  Tunnels provide a direct connection between B4 and the AFTR. This
      can be leveraged to enable customers and their applications to
      control how the NAT function of the AFTR is performed.

   A key characteristic of this approach is that communications between
   end-nodes stay within their address family. IPv6 sources only
   communicate with IPv6 destinations, IPv4 sources only communicate
   with IPv4 destinations.
   There is no protocol family translation involved in this approach.
   This greatly simplifies the task of applications that may carry
   literal IP addresses in their payload.

4.2.  Home gateway

   This section describes home local area networks characterized by the
   presence of a home gateway provisioned only with IPv6 by the service
   provider.

   A DS-Lite home gateway is an IPv6 aware home gateway with a B4
   interface implemented in the WAN interface.

   A DS-Lite home gateway SHOULD NOT operate a NAT function on a B4
   interface, as the NAT function will be performed by the AFTR in the
   service provider's network. That will avoid accidentally operating
   in a double NAT environment.

   However, it SHOULD operate its own DHCP(v4) server handing out
   [RFC1918] address space (e.g. 192.168.0.0/16) to hosts in the home.
   It SHOULD advertise itself as the default IPv4 router to those home
   hosts. It SHOULD also advertise itself as a DNS server in the DHCP
   Option 6 (DNS Server). Additionally, it SHOULD operate a DNS proxy
   to accept DNS IPv4 requests from home hosts and send them using IPv6
   to the service provider DNS servers, as described in Section 5.5.

   Note: if an IPv4 home host decides to use another IPv4 DNS server,
   the DS-Lite home gateway will forward those DNS requests via the B4
   interface, the same way it forwards any regular IPv4 packets.
   However, each DNS request will create a binding in the AFTR. A large
   number of DNS requests may have a direct impact on the AFTR's NAT
   table utilization.

   IPv6 capable devices directly reach the IPv6 Internet. Packets
   simply follow IPv6 routing; they do not go through the tunnel and
   are not subject to any translation.
   It is expected that most IPv6 capable devices will also be IPv4
   capable and will simply be configured with an IPv4 RFC1918 style
   address within the home network and access the IPv4 Internet the
   same way as the legacy IPv4-only devices within the home.

   Pure IPv6-only devices (i.e. devices that do not include an IPv4
   stack) are outside of the scope of this document.

4.3.  Directly connected device

   In broadband home networks, devices are sometimes directly connected
   to the broadband service provider. They are connected straight to a
   modem, without a home gateway.

   Under this scenario, the customer device is a dual-stack capable host
   that is provisioned by the service provider only with IPv6. The
   device itself acts as a B4 element and the IPv4 service is provided
   by an IPv4-in-IPv6 tunnel, just as in the home gateway case. That
   device can run any combination of IPv4 and/or IPv6 applications.

   A directly connected DS-Lite device SHOULD send its DNS requests over
   IPv6 to the IPv6 DNS server it has been configured to use.

   As in the previous section, IPv6 packets follow IPv6 routing; they
   do not go through the tunnel and are not subject to any translation.

   The support of IPv4-only devices and IPv6-only devices in this
   scenario is out of scope for this document.

5.  B4 element

5.1.  Definition

   The B4 element is a function implemented on a dual-stack capable
   node, either a directly connected device or a home gateway, that
   creates a tunnel to an AFTR.

5.2.  Encapsulation

   The tunnel is a multipoint-to-point IPv4-in-IPv6 tunnel ending on a
   service provider AFTR.

   See section 7.1 for additional tunneling considerations.

   Note: at this point, DS-Lite only defines IPv4-in-IPv6 tunnels;
   however, other types of encapsulation could be defined in the
   future.

5.3.  Fragmentation and Reassembly

   Using an encapsulation (IPv4-in-IPv6 or anything else) to carry IPv4
   traffic over IPv6 will reduce the effective MTU of the datagram.
   Unfortunately, path MTU discovery [RFC1191] is not a reliable method
   to deal with this problem.

   One solution is for the service provider to increase the MTU size of
   all the links between the B4 element and the AFTR elements by at
   least 40 bytes, so that they accommodate both the IPv6 encapsulation
   header and the IPv4 datagram without fragmenting the IPv6 packet.

   However, as not all service providers will be able to increase their
   link MTU, the B4 element MUST perform fragmentation and reassembly if
   the outgoing link MTU cannot accommodate the extra IPv6 header.
   Fragmentation MUST happen after the encapsulation on the IPv6 packet.
   Reassembly MUST happen before the decapsulation of the IPv6 header.
   The detailed procedure is specified in [RFC2473] Section 7.2.

5.4.  AFTR discovery

   In order to configure the IPv4-in-IPv6 tunnel, the B4 element needs
   the IPv6 address of the AFTR element. This IPv6 address can be
   configured using a variety of methods, including an out-of-band
   mechanism, manual configuration, or a variety of DHCPv6 options.

   In order to guarantee interoperability, a B4 element SHOULD implement
   the DHCPv6 option defined in
   [I-D.ietf-softwire-ds-lite-tunnel-option].

5.5.  DNS

   A B4 element is only configured from the service provider with IPv6.
   As such, it can only learn the address of a DNS recursive server
   through DHCPv6 (or other similar method over IPv6). As DHCPv6 only
   defines an option to get the IPv6 address of such a DNS recursive
   server, the B4 element cannot easily discover the IPv4 address of
   such a recursive DNS server, and as such will have to perform all DNS
   resolution over IPv6.

   The B4 element can pass this IPv6 address to downstream IPv6 nodes,
   but not to downstream IPv4 nodes. As such, the B4 element SHOULD
   implement a DNS proxy, following the recommendations of [RFC5625].

5.6.  Interface initialization

   Initialization of the interface including a B4 element is out of
   scope for this specification.

5.7.  Well-known IPv4 address

   Any locally unique IPv4 address could be configured on the
   IPv4-in-IPv6 tunnel to represent the B4 element. Configuring such an
   address is often necessary when the B4 element is sourcing IPv4
   datagrams directly over the tunnel. In order to avoid conflicts with
   any other address, IANA has defined a well-known range,
   192.0.0.0/29.

   192.0.0.0 is the reserved subnet address. 192.0.0.1 is reserved for
   the AFTR element. The B4 element MAY use any other addresses within
   the 192.0.0.0/29 range.

   Note: a range of addresses has been reserved for this purpose. The
   intent is to accommodate nodes implementing multiple B4 elements.

6.  AFTR element

6.1.  Definition

   An AFTR element is the combination of an IPv4-in-IPv6 tunnel
   end-point and an IPv4-IPv4 NAT implemented on the same node.

6.2.  Encapsulation

   The tunnel is a point-to-multipoint IPv4-in-IPv6 tunnel ending at the
   B4 elements.

   See section 7.1 for additional tunneling considerations.

   Note: at this point, DS-Lite only defines IPv4-in-IPv6 tunnels;
   however, other types of encapsulation could be defined in the
   future.

6.3.  Fragmentation and Reassembly

   As noted previously, fragmentation and reassembly need to be taken
   care of by the tunnel end-points. As such, the AFTR MUST perform
   fragmentation and reassembly if the underlying link MTU cannot
   accommodate the extra IPv6 header of the tunnel. Fragmentation MUST
   happen after the encapsulation on the IPv6 packet. Reassembly MUST
   happen before the decapsulation of the IPv6 header.
   The detailed procedure is specified in [RFC2473] Section 7.2.

   Fragmentation at the Tunnel Entry-Point is a light-weight operation.
   In contrast, reassembly at the Tunnel Exit-Point can be expensive.
   When the Tunnel Exit-Point receives the first fragment, it must wait
   for the second fragment to arrive in order to reassemble the two
   IPv6 fragments before decapsulation. This requires the Tunnel
   Exit-Point to buffer and keep track of fragmented packets. Consider
   that the AFTR is the Tunnel Exit-Point for many tunnels. If many
   clients simultaneously source a large number of fragmented packets
   to the AFTR, the AFTR will have to buffer them and consume enormous
   resources to keep track of the flows. This reassembly process will
   significantly impact the AFTR performance. However, this impact only
   happens when many clients simultaneously source large IPv4 packets.
   We believe that the majority of clients will receive large IPv4
   packets (such as when watching video streams) rather than source
   them (such as when sourcing video streams), so reassembly is
   expected to be only a fraction of the AFTR's overall workload.

   Methods to avoid fragmentation, such as rewriting the TCP MSS option
   or using technologies such as Subnetwork Encapsulation and Adaptation
   Layer defined in [I-D.templin-seal], are out of scope for this
   document.

6.4.  DNS

   As noted previously, a DS-Lite node implementing a B4 element will
   perform DNS resolution over IPv6. As such, very few, if any, DNS
   packets will flow through the AFTR element.

6.5.  Well-known IPv4 address

   The AFTR MAY use the well-known IPv4 address 192.0.0.1 reserved by
   IANA to configure the IPv4-in-IPv6 tunnel. That address can then be
   used to report ICMP problems and will appear in traceroute outputs.

6.6.  Extended binding table

   The NAT binding table of the AFTR element is extended to include the
   source IPv6 address of the incoming packets. This IPv6 address is
   used to disambiguate between the overlapping IPv4 address space of
   the service provider customers.

   By doing a reverse look-up in the extended IPv4 NAT binding table,
   the AFTR knows how to reconstruct the IPv6 encapsulation when the
   packets come back from the Internet. That way, there is no need to
   keep a static configuration for each tunnel.

7.  Network Considerations

7.1.  Tunneling

   Tunneling MUST be done in accordance with [RFC2473] and [RFC4213].
   Traffic classes ([RFC2474]) from the IPv4 headers SHOULD be carried
   over to the IPv6 headers and vice versa.

7.2.  VPN

   Dual-stack lite implementations SHOULD NOT interfere with the
   functioning of IPv4 or IPv6 VPNs.

7.3.  Multicast considerations

   Multicast is out of scope for this document.

8.  NAT considerations

8.1.  NAT pool

   AFTRs MAY operate distinct, non-overlapping NAT pools. Those NAT
   pools do not have to be contiguous.

8.2.  NAT conformance

   A dual-stack lite AFTR SHOULD implement behavior conforming to the
   best current practice, currently documented in [RFC4787], [RFC5382]
   and [RFC5508]. Other requirements for AFTRs can be found in
   [I-D.nishitani-cgn].

8.3.  Application Level Gateways (ALG)

   The AFTR should perform only a minimal number of ALGs, for classic
   applications such as FTP, RTSP/RTP, IPsec and PPTP VPN pass-through,
   and should instead enable users to run their own ALGs on statically
   or dynamically reserved ports.

8.4.  Port allocation

8.4.1.  How many ports per customers?

   Because IPv4 addresses will be shared among customers and
   potentially a large address space reduction factor may be applied,
   on average, only a limited number N of TCP or UDP port numbers will
   be available per customer.
   This means that applications opening a very large number of TCP
   ports may have a harder time working. For example, it has been
   reported that a very well-known web site was using AJAX techniques
   and was opening up to 69 TCP ports per web page. If we make the
   hypothesis of an address space reduction of a factor 100 (one IPv4
   address per 100 customers), and 65k ports available per IPv4
   address, that makes an average of N = 650 ports available
   simultaneously to be shared among the various devices behind the
   dual-stack lite tunnel end-point.

   There is an important operational difference if those N ports are
   pre-allocated in a cookie-cutter fashion versus allocated on demand
   by incoming connections. This is a difference between an average of
   N ports and a maximum of N ports. Several service providers have
   reported an average number of connections per customer in the single
   digits. At the opposite end, thousands or tens of thousands of ports
   could be used at peak by any single customer browsing a number of
   AJAX/Web 2.0 sites.

   As such, service providers allocating a fixed number of ports per
   user should dimension the system with a minimum of N = several
   thousand ports for every user. This would bring the address space
   reduction ratio down to a single digit. Service providers using a
   smaller number of ports per user (N in the hundreds) should expect
   customers' applications to break in a more or less random way over
   time.

   In order to achieve higher address space reduction ratios, it is
   recommended that service providers do not use this cookie-cutter
   approach and, on the contrary, allocate ports as dynamically as
   possible, just like on a regular NAT. With an average number of
   connections per customer in the single digits, an address space
   reduction of a factor 100 is realistic.
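
   As a rough check of the arithmetic in this section, the sketch below
   recomputes the figures quoted above. The 65,000-port budget, the
   factor of 100 and the 69-port web page come from the text. The
   function names and the 8,000-port fixed allocation (one possible
   reading of "several thousand") are illustrative assumptions and are
   not part of DS-Lite.

```python
# Round "65k" figure used in the text for the ports of one IPv4 address.
PORTS_PER_ADDRESS = 65_000


def avg_ports(sharing_factor: int) -> int:
    """Average ports per customer when sharing_factor customers share one address."""
    return PORTS_PER_ADDRESS // sharing_factor


def sharing_factor_for(fixed_ports: int) -> int:
    """Customers per address when each one gets a fixed, pre-allocated port block."""
    return PORTS_PER_ADDRESS // fixed_ports


# One IPv4 address per 100 customers, as in the text.
n = avg_ports(100)

# Number of 69-port web pages that fit in that average share.
pages = n // 69

# Assumed fixed block of 8,000 ports per customer (hypothetical value).
ratio = sharing_factor_for(8_000)

print(f"N = {n} ports, about {pages} such pages, fixed-block ratio = {ratio}")
```

   Under these assumptions the average share covers only a handful of
   such pages at once, and a fixed block of several thousand ports
   leaves a single-digit sharing ratio, which is the trade-off the
   text describes.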
   However, service providers should exercise caution and make sure
   their pool of port numbers does not go too low. The actual maximum
   address space reduction factor is unknown at this time.

8.4.2.  Dynamic port assignment considerations

   When dynamic port assignment is used to maximize the number of
   subscribers sharing the AFTR global IPv4 addresses, the AFTR should
   implement checks to avoid denial-of-service (DoS) attacks through
   exhaustion of available ports. It should also avoid mapping any one
   subscriber's "flows" across more than one global IPv4 address.

8.4.3.  Subscriber controlled port assignment

   Dynamic port assignment precludes inbound access to subscriber
   servers, just as in a home gateway NAT. Inbound access to subscriber
   servers can be provided through pre-assigned and/or reserved port
   mappings in the AFTR. Specifying the mechanisms for managing and
   signaling these reserved port mappings is out of scope for this
   document.

8.5.  Other considerations about sharing global IPv4 addresses

   More considerations on sharing the port space of IPv4 addresses can
   be found in [I-D.ford-shared-addressing-issues].

9.  Acknowledgements

   The authors would like to acknowledge the role of Mark Townsley for
   his input on the overall architecture of this technology by pointing
   this work in the direction of [I-D.droms-softwires-snat]. Note that
   this document results from a merging of [I-D.durand-dual-stack-lite]
   and [I-D.droms-softwires-snat]. Also to be acknowledged are the many
   discussions with a number of people including Shin Miyakawa,
   Katsuyasu Toyama, Akihide Hiura, Takashi Uematsu, Tetsutaro Hara,
   Yasunori Matsubayashi, Ichiro Mizukoshi. The authors would also like
   to thank David Ward, Jari Arkko, Thomas Narten and Geoff Huston for
   their constructive feedback. Special thanks go to Dave Thaler and
   Dan Wing for their reviews and comments.

10.  IANA Considerations

   This draft requests IANA to allocate a well-known IPv4 network
   prefix, 192.0.0.0/29. That range is used to number the dual-stack
   lite interfaces. Reserving a /29 allows for 6 possible interfaces on
   a multi-homed node. The IPv4 address 192.0.0.1 is reserved as the
   IPv4 address of the default router for such dual-stack lite hosts.

11.  Security Considerations

   Security issues associated with NAT have long been documented. See
   [RFC2663] and [RFC2993].

   However, moving the NAT functionality from the home gateway to the
   core of the service provider network and sharing IPv4 addresses among
   customers creates additional requirements when logging data for
   abuse usage. With any architecture where an IPv4 address does not
   uniquely represent an end host, an IPv4 address and a timestamp are
   no longer sufficient to identify a particular broadband customer.
   Additional information such as transport protocol information will
   be required for that purpose. For example, we suggest logging the
   transport port number for TCP and UDP connections.

   The AFTR performs translation functions for interior IPv4 hosts at
   RFC 1918 addresses or at the IANA reserved address range (TBA by
   IANA). If the interior host is properly using the authorized IPv4
   address with the authorized transport protocol port range, such as
   A+P semantics for the tunnel, the AFTR can simply forward without
   translation to permit the authorized address and port range to
   function properly. All packets with unauthorized interior IPv4
   addresses, or with an authorized interior IPv4 address but an
   unauthorized port range, MUST NOT be forwarded by the AFTR. This
   prevents rogue devices from launching denial of service attacks
   using unauthorized public IPv4 addresses in the IPv4 source header
   field or unauthorized transport port range in the IPv4 transport
   header field.
   For example, rogue devices could bombard a public web server by
   launching a TCP SYN ACK attack. The victim will receive TCP SYNs
   from random IPv4 source addresses at a rapid rate and deny TCP
   services to legitimate users.

   With IPv4 addresses shared by multiple users, ports become a critical
   resource. As such, some mechanisms need to be put in place by an
   AFTR to limit port usage, either by rate-limiting new connections or
   by putting a hard limit on the maximum number of ports usable by a
   single user. If this number is high enough, it should not interfere
   with normal usage and still provide reasonable protection of the
   shared pool. More considerations on port allocation and port
   exhaustion can be found in section 8.4.

   More considerations on sharing IPv4 addresses can be found in
   [I-D.ford-shared-addressing-issues].

   AFTRs should support ways to limit service to registered customers.
   If strict IPv6 ingress filtering is deployed in the broadband network
   to prevent IPv6 address spoofing and dual-stack lite service is
   restricted to those customers, then tunnels terminating at the AFTR
   and coming from registered customer IPv6 addresses cannot be spoofed.
   Thus a simple access control list on the tunnel transport source
   address is all that is required to accept traffic on the southbound
   interface of an AFTR.

   If IPv6 address spoofing prevention is not in place, the AFTR should
   perform further sanity checks on the IPv6 address of incoming IPv6
   packets. For example, it should check if the address has really been
   allocated to an authorized customer.

12.  Author's Addresses

   This document is the result of the work of the following authors:

   Alain Durand
   Juniper Networks
   1194 North Mathilda Avenue
   Sunnyvale, CA 94089-1206
   USA
   Email: adurand@juniper.net

   Ralph Droms
   Cisco
   1414 Massachusetts Avenue
   Boxborough, MA 01714
   USA
   Phone: +1 978.936.1674
   Email: rdroms@cisco.com

   Brian Haberman
   Johns Hopkins University Applied Physics Lab
   11100 Johns Hopkins Road
   Laurel, MD 20723-6099
   USA
   Phone: +1 443 778 1319
   Email: brian@innovationslab.net

   James Woodyatt
   Apple Inc.
   1 Infinite Loop
   Cupertino, CA 95014
   USA
   Email: jhw@apple.com

   Yiu Lee
   Comcast
   1, Comcast center
   Philadelphia, PA 19103
   USA
   Email: yiu_lee@cable.comcast.com

   Randy Bush
   Internet Initiative Japan
   5147 Crystal Springs
   Bainbridge Island, Washington 98110
   USA
   Phone: +1 206 780 0431 x1
   Email: randy@psg.com

13.  Appendix A: Deployment considerations

13.1.  AFTR service distribution and horizontal scaling

   One of the key benefits of the dual-stack lite technology lies in the
   fact that it is tunnel-based. That is, tunnel end-points may be
   anywhere in the service provider network.

   Using the DHCPv6 tunnel end-point option, service providers can
   create groups of users sharing the same AFTR. Those groups can be
   merged or divided at will. This leads to a horizontally scaled
   solution, where more capacity is added simply by adding more boxes.
   As those groups of users can evolve over time, it is best to make
   sure that AFTRs do not require per-user configuration in order to
   provide service.

13.2.  Horizontal scaling

   A service provider can start with just a few centrally located
   AFTRs. Later, when more capacity is needed, more boxes can be added
   and pushed to the edges of the access network.
   In case of a spike of traffic, for example during the Olympic Games
   or an important political event, capacity can be quickly added in
   any location of the network (tunnels can terminate anywhere) simply
   by splitting user groups. Extra capacity can later be removed when
   the traffic returns to normal by resetting the DHCPv6 tunnel
   end-point settings.

13.3. High availability

   An important element in the design of the dual-stack lite technology
   is the simplicity of implementation on the customer side. A simple
   IPv4-in-IPv6 tunnel and a default route over it are all that is
   needed to get IPv4 connectivity. Dealing with high availability is
   the responsibility of the service provider, not the customer devices
   implementing dual-stack lite. As such, a single IPv6 address of the
   tunnel end-point is provided in the DHCPv6 option defined in
   [I-D.ietf-softwire-ds-lite-tunnel-option]. The service provider can
   use techniques such as anycast or various types of clusters to
   ensure availability of the IPv4 service. The exact synchronization
   (or lack thereof) between redundant AFTRs is out of scope for this
   document.

13.4. Logging

   A DS-Lite AFTR implementation should offer the possibility to log
   NAT binding creations, or other ways to keep track of the ports/IP
   addresses used by customers. This is both to support
   troubleshooting, which is very important to service providers trying
   to figure out why something may not be working, and to meet
   region-specific requirements for responding to legally-binding
   requests for information from law enforcement authorities.

14. Appendix B: Examples

14.1. Gateway based architecture

   This architecture is targeted at residential broadband deployments
   but can be adapted easily to other types of deployment where the
   installed base of IPv4-only devices is important.

   Consider a scenario where a Dual-Stack lite home gateway is
   provisioned only with IPv6 in the WAN port, no IPv4. The home
   gateway acts as an IPv4 DHCP server for the LAN network (wireline
   and wireless) handing out RFC1918 addresses. In addition, the home
   gateway may support IPv6 Auto-Configuration and/or DHCPv6 server for
   the LAN network. When an IPv4-only device connects to the home
   gateway, the gateway will hand it an RFC1918 address. When a
   dual-stack capable device connects to the home gateway, the gateway
   will hand out an RFC1918 address and a global IPv6 address to the
   device. The home gateway will also create an IPv4-in-IPv6 softwire
   tunnel [RFC5571] to an AFTR that resides in the service provider
   network.

   When the device accesses IPv6 service, it will send the IPv6
   datagram to the home gateway natively. The home gateway will route
   the traffic upstream to the default gateway.

   When the device accesses IPv4 service, it will source the IPv4
   datagram with the RFC1918 address and send the IPv4 datagram to the
   home gateway. The home gateway will encapsulate the IPv4 datagram
   inside the IPv4-in-IPv6 softwire tunnel and forward the IPv6
   datagram to the AFTR. This contrasts with what home gateways
   normally do today, which is to NAT the RFC1918 address to the public
   IPv4 address and route the datagram upstream. When the AFTR
   receives the IPv6 datagram, it will remove the IPv6 header and
   perform an IPv4-to-IPv4 NAT on the source address.

   As illustrated in Figure 1, this dual-stack lite deployment model
   consists of three components: the dual-stack lite home router with a
   B4 element, the AFTR, and a softwire between the B4 element acting
   as softwire initiator (SI) [RFC5571] in the dual-stack lite home
   router and the softwire concentrator (SC) [RFC5571] in the AFTR.
   The AFTR performs IPv4-IPv4 NAT translations to multiplex multiple
   subscribers through a pool of global IPv4 addresses. Overlapping
   address spaces used by subscribers are disambiguated through the
   identification of tunnel endpoints.

                   +-----------+
                   |    Host   |
                   +-----+-----+
                         |10.0.0.1
                         |
                         |
                         |10.0.0.2
               +---------|---------+
               |         |         |
               |    Home router    |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                        |||2001:db8:0:1::1
                        |||
                        |||<-IPv4-in-IPv6 softwire
                        |||
                 -------|||-------
               /        |||        \
              |   ISP core network  |
               \        |||        /
                 -------|||-------
                        |||
                        |||2001:db8:0:2::1
               +--------|||--------+
               |        AFTR       |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                         |192.0.2.1
                         |
                 --------|--------
               /         |         \
              |       Internet      |
               \         |         /
                 --------|--------
                         |
                         |198.51.100.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                Figure 1: gateway-based architecture

   Notes:

   o  The dual-stack lite home router is not required to be on the same
      link as the host

   o  The dual-stack lite home router could be replaced by a dual-stack
      lite router in the service provider network

   The resulting solution accepts an IPv4 datagram that is encapsulated
   in an IPv4-in-IPv6 softwire datagram for transmission across the
   softwire. At the corresponding endpoint, the IPv4 datagram is
   decapsulated, and the translated IPv4 address is inserted based on a
   translation from the softwire.

14.1.1. Example message flow

   In the example shown in Figure 2, the translation table in the AFTR
   is configured to forward between IP/TCP (10.0.0.1/10000) and IP/TCP
   (192.0.2.1/5000).
   That is, a datagram received by the dual-stack lite home router from
   the host at address 10.0.0.1, using TCP SRC port 10000, will be
   translated to a datagram with IP SRC address 192.0.2.1 and TCP SRC
   port 5000 in the Internet.

                   +-----------+
                   |    Host   |
                   +-----+-----+
                      |  |10.0.0.1
      IPv4 datagram 1 |  |
                      |  |
                      v  |10.0.0.2
               +---------|---------+
               |         |         |
               |    home router    |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                      | |||2001:db8:0:1::1
      IPv6 datagram 2 | |||
                      | |||<-IPv4-in-IPv6 softwire
                      | |||
                 -----|-|||-------
               /      | |||        \
              |   ISP core network  |
               \      | |||        /
                 -----|-|||-------
                      | |||
                      | |||2001:db8:0:2::1
               +------|-|||--------+
               |      | AFTR       |
               |      v |||        |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                      |  |192.0.2.1
      IPv4 datagram 3 |  |
                      |  |
                 -----|--|--------
               /      |  |         \
              |       Internet      |
               \      |  |         /
                 -----|--|--------
                      |  |
                      v  |198.51.100.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                    Figure 2: Outbound Datagram

       +-----------------+--------------+-----------------+
       | Datagram        | Header field | Contents        |
       +-----------------+--------------+-----------------+
       | IPv4 datagram 1 | IPv4 Dst     | 198.51.100.1    |
       |                 | IPv4 Src     | 10.0.0.1        |
       |                 | TCP Dst      | 80              |
       |                 | TCP Src      | 10000           |
       | --------------- | ------------ | -------------   |
       | IPv6 datagram 2 | IPv6 Dst     | 2001:db8:0:2::1 |
       |                 | IPv6 Src     | 2001:db8:0:1::1 |
       |                 | IPv4 Dst     | 198.51.100.1    |
       |                 | IPv4 Src     | 10.0.0.1        |
       |                 | TCP Dst      | 80              |
       |                 | TCP Src      | 10000           |
       | --------------- | ------------ | -------------   |
       | IPv4 datagram 3 | IPv4 Dst     | 198.51.100.1    |
       |                 | IPv4 Src     | 192.0.2.1       |
       |                 | TCP Dst      | 80              |
       |                 | TCP Src      | 5000            |
       +-----------------+--------------+-----------------+

                      Datagram header contents

   When datagram 1 is received
   by the dual-stack lite home router, the B4 function encapsulates the
   datagram in datagram 2 and forwards it to the AFTR (the dual-stack
   lite carrier-grade NAT) over the softwire.

   When it receives datagram 2, the tunnel concentrator in the AFTR
   hands the IPv4 datagram to the NAT, which determines from its
   translation table that the datagram received on Softwire_1 with TCP
   SRC port 10000 should be translated to datagram 3 with IP SRC
   address 192.0.2.1 and TCP SRC port 5000.

   Figure 3 shows an inbound message received at the AFTR. When the
   NAT function in the AFTR receives datagram 1, it looks up the IP/TCP
   DST in its translation table. In the example in Figure 3, the NAT
   translates the TCP DST port to 10000, sets the IP DST address to
   10.0.0.1, and hands the datagram to the SC for transmission over
   Softwire_1. The B4 in the home router decapsulates the IPv4
   datagram from the inbound softwire datagram and forwards it to the
   host.

                   +-----------+
                   |    Host   |
                   +-----+-----+
                      ^  |10.0.0.1
      IPv4 datagram 3 |  |
                      |  |
                      |  |10.0.0.2
               +---------|---------+
               |       +-+-+       |
               |    home router    |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                      ^ |||2001:db8:0:1::1
      IPv6 datagram 2 | |||
                      | |||<-IPv4-in-IPv6 softwire
                      | |||
                 -----|-|||-------
               /      | |||        \
              |   ISP core network  |
               \      | |||        /
                 -----|-|||-------
                      | |||
                      | |||2001:db8:0:2::1
               +------|-|||--------+
               |        AFTR       |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                      ^  |192.0.2.1
      IPv4 datagram 1 |  |
                      |  |
                 -----|--|--------
               /      |  |         \
              |       Internet      |
               \      |  |         /
                 -----|--|--------
                      |  |
                      |  |198.51.100.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                    Figure 3: Inbound Datagram

       +-----------------+--------------+-----------------+
       | Datagram        | Header field | Contents        |
       +-----------------+--------------+-----------------+
       | IPv4 datagram 1 | IPv4 Dst     | 192.0.2.1       |
       |                 | IPv4 Src     | 198.51.100.1    |
       |                 | TCP Dst      | 5000            |
       |                 | TCP Src      | 80              |
       | --------------- | ------------ | -------------   |
       | IPv6 datagram 2 | IPv6 Dst     | 2001:db8:0:1::1 |
       |                 | IPv6 Src     | 2001:db8:0:2::1 |
       |                 | IPv4 Dst     | 10.0.0.1        |
       |                 | IPv4 Src     | 198.51.100.1    |
       |                 | TCP Dst      | 10000           |
       |                 | TCP Src      | 80              |
       | --------------- | ------------ | -------------   |
       | IPv4 datagram 3 | IPv4 Dst     | 10.0.0.1        |
       |                 | IPv4 Src     | 198.51.100.1    |
       |                 | TCP Dst      | 10000           |
       |                 | TCP Src      | 80              |
       +-----------------+--------------+-----------------+

                      Datagram header contents

14.1.2. Translation details

   The AFTR has a NAT that translates between softwire/port pairs and
   IPv4-address/port pairs. The same translation is applied to IPv4
   datagrams received on the device's external interface and from the
   softwire endpoint in the device.

   In Figure 2, the translator network interface in the AFTR is on the
   Internet, and the softwire interface connects to the dual-stack lite
   home router. The AFTR translator is configured as follows:

   Network interface:  Translate IPv4 destination address and TCP
      destination port to the softwire identifier and TCP destination
      port

   Softwire interface:  Translate softwire identifier and TCP source
      port to IPv4 source address and TCP source port

   Here is how the translation in Figure 3 works:

   o  Datagram 1 is received on the AFTR translator network interface.
      The translator looks up the IPv4-address/port pair in its
      translation table, rewrites the IPv4 destination address to
      10.0.0.1 and the TCP destination port to 10000, and hands the
      datagram to the SC to be forwarded over the softwire.

   o  The IPv4 datagram is received on the dual-stack lite home router
      B4.
      The B4 function extracts the IPv4 datagram and the dual-stack
      lite home router forwards datagram 3 to the host.

   +------------------------------------+--------------------+
   | Softwire-Id/IPv4/Prot/Port         | IPv4/Prot/Port     |
   +------------------------------------+--------------------+
   | 2001:db8:0:1::1/10.0.0.1/TCP/10000 | 192.0.2.1/TCP/5000 |
   +------------------------------------+--------------------+

          Dual-Stack lite carrier-grade NAT translation table

   The Softwire-Id is the IPv6 address assigned to the Dual-Stack lite
   home gateway. Hosts behind the same Dual-Stack lite home router
   have the same Softwire-Id. The source IPv4 address is the RFC1918
   address assigned by the Dual-Stack lite home router, which is unique
   to each host behind the home gateway. The AFTR would receive
   packets sourced from different IPv4 addresses in the same softwire
   tunnel. The AFTR combines the Softwire-Id and IPv4 address/port
   [Softwire-Id, IPv4+Port] to uniquely identify each host behind the
   same Dual-Stack lite home router.

14.2. Host based architecture

   This architecture is targeted at new, large scale deployments of
   dual-stack capable devices implementing a dual-stack lite interface.

   Consider a scenario where a Dual-Stack lite host device is directly
   connected to the service provider network. The host device is
   dual-stack capable but is provisioned only with an IPv6 global
   address. In addition, the host device will pre-configure a
   well-known IPv4 non-routable address (see IANA section). This
   well-known IPv4 non-routable address is similar to the 127.0.0.1
   loopback address. Every host device implementing Dual-Stack lite
   will pre-configure the same address. This address will be used to
   source the IPv4 datagram when the device accesses IPv4 services.
   The host device will also create an IPv4-in-IPv6 softwire tunnel to
   an AFTR.
   The Carrier Grade NAT will reside in the service provider network.

   When the device accesses IPv6 service, the device will send the IPv6
   datagram natively to the default gateway.

   When the device accesses IPv4 service, it will source the IPv4
   datagram with the well-known non-routable IPv4 address. Then, the
   host device will encapsulate the IPv4 datagram inside the
   IPv4-in-IPv6 softwire tunnel and send the IPv6 datagram to the AFTR.
   When the AFTR receives the IPv6 datagram, it will remove the IPv6
   header and perform IPv4-to-IPv4 NAT on the source address.

   This scenario works on both wireline and wireless networks. A
   typical wireless device will connect directly to the service
   provider without a home gateway in between.

   As illustrated in Figure 4, this dual-stack lite deployment model
   consists of three components: the dual-stack lite host, the AFTR,
   and a softwire between the softwire initiator B4 in the host and the
   softwire concentrator in the AFTR. The dual-stack lite host is
   assumed to have IPv6 service and can exchange IPv6 traffic with the
   AFTR.

   The AFTR performs IPv4-IPv4 NAT translations to multiplex multiple
   subscribers through a pool of global IPv4 addresses. Overlapping
   IPv4 address spaces used by the dual-stack lite hosts are
   disambiguated through the identification of tunnel endpoints.

   In this situation, the dual-stack lite host configures the IPv4
   address 192.0.0.2 out of the well-known range 192.0.0.0/29 (defined
   by IANA) on its B4 interface. It also configures the first
   non-reserved IPv4 address of the reserved range, 192.0.0.1, as the
   address of its default gateway.
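
   As an illustration of the addressing just described, the following
   Python sketch enumerates the well-known 192.0.0.0/29 range with the
   standard ipaddress module and picks out the two addresses named in
   the text. The variable names are illustrative assumptions and are
   not part of this specification; taking the first usable address as
   the default gateway simply mirrors the paragraph above.

```python
import ipaddress

# Well-known range quoted in the text above.
WELL_KNOWN_RANGE = ipaddress.ip_network("192.0.0.0/29")

# hosts() skips the network (192.0.0.0) and broadcast (192.0.0.7)
# addresses, leaving 192.0.0.1 through 192.0.0.6.
usable = list(WELL_KNOWN_RANGE.hosts())

# Illustrative names (assumptions): the first usable address is the
# default gateway, the next one is configured on the B4 interface.
default_gateway = usable[0]
b4_address = usable[1]

print(f"default gateway: {default_gateway}")
print(f"B4 interface:    {b4_address}")
```

   Because every dual-stack lite host configures the same pair of
   addresses, they carry no identity of their own; the AFTR has to rely
   on the softwire to tell hosts apart.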

               +-------------------+
               |                   |
               |  Host 192.0.0.2   |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                        |||2001:db8:0:1::1
                        |||
                        |||<-IPv4-in-IPv6 softwire
                        |||
                 -------|||-------
               /        |||        \
              |   ISP core network  |
               \        |||        /
                 -------|||-------
                        |||
                        |||2001:db8:0:2::1
               +--------|||--------+
               |        AFTR       |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                         |192.0.2.1
                         |
                 --------|--------
               /         |         \
              |       Internet      |
               \         |         /
                 --------|--------
                         |
                         |198.51.100.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                  Figure 4: host-based architecture

   The resulting solution accepts an IPv4 datagram that is encapsulated
   in an IPv4-in-IPv6 softwire datagram for transmission across the
   softwire. At the corresponding endpoint, the IPv4 datagram is
   decapsulated, and the translated IPv4 address is inserted based on a
   translation from the softwire.

14.2.1. Example message flow

   In the example shown in Figure 5, the translation table in the AFTR
   is configured to forward between IP/TCP (192.0.0.2/10000) and IP/TCP
   (192.0.2.1/5000). That is, a datagram received from the host at
   address 192.0.0.2, using TCP SRC port 10000, will be translated to a
   datagram with IP SRC address 192.0.2.1 and TCP SRC port 5000 in the
   Internet.
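
   The binding described above can be pictured as a lookup table keyed
   on the softwire as well as the inner address and port. The Python
   sketch below is a toy model under stated assumptions, not an AFTR
   implementation: the class and method names are invented for
   illustration, and the second host (softwire 2001:db8:0:3::1, external
   port 5001) is hypothetical and does not appear in this document's
   example. It shows why two hosts using the same well-known address
   and port remain distinguishable.

```python
from typing import Dict, NamedTuple, Optional


class Inside(NamedTuple):
    softwire_id: str  # IPv6 address of the B4 end of the softwire
    ipv4: str
    proto: str
    port: int


class Outside(NamedTuple):
    ipv4: str
    proto: str
    port: int


class NatTable:
    """Toy translation table; timers and port limits are left out."""

    def __init__(self) -> None:
        self._outbound: Dict[Inside, Outside] = {}
        self._inbound: Dict[Outside, Inside] = {}

    def add_binding(self, inside: Inside, outside: Outside) -> None:
        # Store both directions so lookups work for either flow.
        self._outbound[inside] = outside
        self._inbound[outside] = inside

    def translate_outbound(self, inside: Inside) -> Optional[Outside]:
        return self._outbound.get(inside)

    def translate_inbound(self, outside: Outside) -> Optional[Inside]:
        return self._inbound.get(outside)


table = NatTable()

# Binding taken from the example in the text.
host_a = Inside("2001:db8:0:1::1", "192.0.0.2", "TCP", 10000)
table.add_binding(host_a, Outside("192.0.2.1", "TCP", 5000))

# Hypothetical second host: same inner address and port, different
# softwire, so it gets its own (assumed) external port.
host_b = Inside("2001:db8:0:3::1", "192.0.0.2", "TCP", 10000)
table.add_binding(host_b, Outside("192.0.2.1", "TCP", 5001))

print(table.translate_outbound(host_a))
print(table.translate_inbound(Outside("192.0.2.1", "TCP", 5001)))
```

   Keying on the full tuple rather than on the inner IPv4 address alone
   is what lets overlapping customer address space share one external
   address pool.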

               +-------------------+
               |                   |
               |  Host 192.0.0.2   |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                      | |||2001:db8:0:1::1
      IPv6 datagram 1 | |||
                      | |||<-IPv4-in-IPv6 softwire
                      | |||
                 -----|-|||-------
               /      | |||        \
              |   ISP core network  |
               \      | |||        /
                 -----|-|||-------
                      | |||
                      | |||2001:db8:0:2::1
               +------|-|||--------+
               |      | AFTR       |
               |      v |||        |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                      |  |192.0.2.1
      IPv4 datagram 2 |  |
                 -----|--|--------
               /      |  |         \
              |       Internet      |
               \      |  |         /
                 -----|--|--------
                      |  |
                      v  |198.51.100.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                    Figure 5: Outbound Datagram

       +-----------------+--------------+-----------------+
       | Datagram        | Header field | Contents        |
       +-----------------+--------------+-----------------+
       | IPv6 datagram 1 | IPv6 Dst     | 2001:db8:0:2::1 |
       |                 | IPv6 Src     | 2001:db8:0:1::1 |
       |                 | IPv4 Dst     | 198.51.100.1    |
       |                 | IPv4 Src     | 192.0.0.2       |
       |                 | TCP Dst      | 80              |
       |                 | TCP Src      | 10000           |
       | --------------- | ------------ | -------------   |
       | IPv4 datagram 2 | IPv4 Dst     | 198.51.100.1    |
       |                 | IPv4 Src     | 192.0.2.1       |
       |                 | TCP Dst      | 80              |
       |                 | TCP Src      | 5000            |
       +-----------------+--------------+-----------------+

                      Datagram header contents

   When sending an IPv4 packet, the dual-stack lite host encapsulates
   it in datagram 1 and forwards it to the AFTR over the softwire.

   When it receives datagram 1, the concentrator in the AFTR hands the
   IPv4 datagram to the NAT, which determines from its translation
   table that the datagram received on Softwire_1 with TCP SRC port
   10000 should be translated to datagram 2 with IP SRC address
   192.0.2.1 and TCP SRC port 5000.

   Figure 6 shows an inbound message received at the AFTR.
   When the NAT function in the AFTR receives datagram 1, it looks up
   the IP/TCP DST in its translation table. In the example in
   Figure 6, the NAT translates the TCP DST port to 10000, sets the IP
   DST address to 192.0.0.2, and hands the datagram to the concentrator
   for transmission over Softwire_1. The B4 in the dual-stack lite
   host decapsulates the IPv4 datagram from the inbound softwire
   datagram and forwards it to the host.

               +-------------------+
               |                   |
               |  Host 192.0.0.2   |
               |+--------+--------+|
               ||       B4        ||
               |+--------+--------+|
               +--------|||--------+
                      ^ |||2001:db8:0:1::1
      IPv6 datagram 2 | |||
                      | |||<-IPv4-in-IPv6 softwire
                      | |||
                 -----|-|||-------
               /      | |||        \
              |   ISP core network  |
               \      | |||        /
                 -----|-|||-------
                      | |||
                      | |||2001:db8:0:2::1
               +------|-|||--------+
               |        AFTR       |
               |      | |||        |
               |+--------+--------+|
               ||   Concentrator  ||
               |+--------+--------+|
               |       |NAT|       |
               |       +-+-+       |
               +---------|---------+
                      ^  |192.0.2.1
      IPv4 datagram 1 |  |
                 -----|--|--------
               /      |  |         \
              |       Internet      |
               \      |  |         /
                 -----|--|--------
                      |  |
                      |  |198.51.100.1
                   +-----+-----+
                   | IPv4 Host |
                   +-----------+

                    Figure 6: Inbound Datagram

       +-----------------+--------------+-----------------+
       | Datagram        | Header field | Contents        |
       +-----------------+--------------+-----------------+
       | IPv4 datagram 1 | IPv4 Dst     | 192.0.2.1       |
       |                 | IPv4 Src     | 198.51.100.1    |
       |                 | TCP Dst      | 5000            |
       |                 | TCP Src      | 80              |
       | --------------- | ------------ | -------------   |
       | IPv6 datagram 2 | IPv6 Dst     | 2001:db8:0:1::1 |
       |                 | IPv6 Src     | 2001:db8:0:2::1 |
       |                 | IPv4 Dst     | 192.0.0.2       |
       |                 | IPv4 Src     | 198.51.100.1    |
       |                 | TCP Dst      | 10000           |
       |                 | TCP Src      | 80              |
       +-----------------+--------------+-----------------+

                      Datagram header contents

14.2.2.
Translation details

   The translations happening in the AFTR are the same as in the
   previous examples. The well-known IPv4 address 192.0.0.2, taken
   from the 192.0.0.0/29 range (defined by IANA) and used by all the
   hosts, is disambiguated by the IPv6 source address of the softwire.

   +-------------------------------------+--------------------+
   | Softwire-Id/IPv4/Prot/Port          | IPv4/Prot/Port     |
   +-------------------------------------+--------------------+
   | 2001:db8:0:1::1/192.0.0.2/TCP/10000 | 192.0.2.1/TCP/5000 |
   +-------------------------------------+--------------------+

          Dual-Stack lite carrier-grade NAT translation table

   The Softwire-Id is the IPv6 address assigned to the Dual-Stack lite
   host. Each host has a unique Softwire-Id. The source IPv4 address
   is one of the well-known IPv4 addresses. The AFTR could receive
   packets from different hosts sourced from the same well-known IPv4
   address over different softwire tunnels. Similar to the gateway
   architecture, the AFTR combines the Softwire-Id and IPv4
   address/port [Softwire-Id, IPv4+Port] to uniquely identify the
   individual host.

15. Appendix C: Related DS-Lite work on port management

   Techniques discussed below are not part of the core dual-stack lite
   specification and may or may not be standardized in separate
   documents. They are only listed here for reference.

   Applications expecting incoming connections, such as peer-to-peer
   applications, have become popular. Those applications use a very
   limited number of ports, usually a single one. Making sure those
   applications keep working in a dual-stack lite environment is
   important. Similarly, there is a growing list of applications that
   require some kind of ALG to work through a NAT. Service provider
   AFTRs should not prevent the deployment of such applications.
   As such, there is a legitimate need to leave certain ports under the
   control of the end user or their applications. This argues for a
   hybrid environment, where most ports are dynamically managed by the
   AFTR in a shared pool and a limited number are dedicated to each
   user and controlled by that user.

   The details of how ports can be controlled by users and applications
   are beyond the scope of this document. For reference, the A+P
   [I-D.ymbk-aplusp] model, where an address and a set of ports are
   assigned to users, has been extensively discussed. User-controlled
   techniques for port allocation via a service provider portal or a
   DHCPv6 option [I-D.bajko-v6ops-port-restricted-ipaddr-assign] have
   been proposed. Techniques using some form of port control protocol
   such as UPnP [UPnP-IGD], NAT-PMP [I-D.cheshire-nat-pmp] and PCP
   [I-D.wing-softwire-port-control-protocol] are under discussion to
   enable direct communication between applications and the service
   provider NAT.

16. References

16.1. Normative references

   [I-D.ietf-softwire-ds-lite-tunnel-option]
              Hankins, D. and T. Mrugalski, "Dynamic Host Configuration
              Protocol for IPv6 (DHCPv6) Options for Dual-Stack Lite",
              draft-ietf-softwire-ds-lite-tunnel-option-03 (work in
              progress), June 2010.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2473]  Conta, A. and S. Deering, "Generic Packet Tunneling in
              IPv6 Specification", RFC 2473, December 1998.

   [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black,
              "Definition of the Differentiated Services Field (DS
              Field) in the IPv4 and IPv6 Headers", RFC 2474,
              December 1998.

   [RFC4213]  Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms
              for IPv6 Hosts and Routers", RFC 4213, October 2005.

   [RFC5625]  Bellis, R., "DNS Proxy Implementation Guidelines",
              BCP 152, RFC 5625, August 2009.

16.2. Informative references

   [I-D.bajko-v6ops-port-restricted-ipaddr-assign]
              Bajko, G. and T. Savolainen, "Port Restricted IP Address
              Assignment",
              draft-bajko-v6ops-port-restricted-ipaddr-assign-02 (work
              in progress), November 2008.

   [I-D.cheshire-nat-pmp]
              Cheshire, S., "NAT Port Mapping Protocol (NAT-PMP)",
              draft-cheshire-nat-pmp-03 (work in progress), April 2008.

   [I-D.droms-softwires-snat]
              Droms, R. and B. Haberman, "Softwires Network Address
              Translation (SNAT)", draft-droms-softwires-snat-01 (work
              in progress), July 2008.

   [I-D.durand-dual-stack-lite]
              Durand, A., "Dual-stack lite broadband deployments post
              IPv4 exhaustion", draft-durand-dual-stack-lite-00 (work in
              progress), July 2008.

   [I-D.ford-shared-addressing-issues]
              Ford, M., Boucadair, M., Durand, A., Levis, P., and P.
              Roberts, "Issues with IP Address Sharing",
              draft-ford-shared-addressing-issues-02 (work in progress),
              March 2010.

   [I-D.nishitani-cgn]
              Yamagata, I., Nishitani, T., Miyakawa, S., Nakagawa, A.,
              and H. Ashida, "Common requirements for IP address sharing
              schemes", draft-nishitani-cgn-04 (work in progress),
              March 2010.

   [I-D.templin-seal]
              Templin, F., "The Subnetwork Encapsulation and Adaptation
              Layer (SEAL)", draft-templin-seal-23 (work in progress),
              August 2008.

   [I-D.wing-softwire-port-control-protocol]
              Wing, D., Penno, R., and M. Boucadair, "Pinhole Control
              Protocol (PCP)",
              draft-wing-softwire-port-control-protocol-02 (work in
              progress), July 2010.

   [I-D.ymbk-aplusp]
              Bush, R., "The A+P Approach to the IPv4 Address Shortage",
              draft-ymbk-aplusp-05 (work in progress), October 2009.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              November 1990.

   [RFC1918]  Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and
              E. Lear, "Address Allocation for Private Internets",
              BCP 5, RFC 1918, February 1996.

   [RFC2663]  Srisuresh, P. and M. Holdrege, "IP Network Address
              Translator (NAT) Terminology and Considerations",
              RFC 2663, August 1999.

   [RFC2993]  Hain, T., "Architectural Implications of NAT", RFC 2993,
              November 2000.

   [RFC4787]  Audet, F. and C. Jennings, "Network Address Translation
              (NAT) Behavioral Requirements for Unicast UDP", BCP 127,
              RFC 4787, January 2007.

   [RFC5382]  Guha, S., Biswas, K., Ford, B., Sivakumar, S., and P.
              Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
              RFC 5382, October 2008.

   [RFC5508]  Srisuresh, P., Ford, B., Sivakumar, S., and S. Guha, "NAT
              Behavioral Requirements for ICMP", BCP 148, RFC 5508,
              April 2009.

   [RFC5571]  Storer, B., Pignataro, C., Dos Santos, M., Stevant, B.,
              Toutain, L., and J. Tremblay, "Softwire Hub and Spoke
              Deployment Framework with Layer Two Tunneling Protocol
              Version 2 (L2TPv2)", RFC 5571, June 2009.

   [UPnP-IGD]
              UPnP Forum, "Universal Plug and Play Internet Gateway
              Device Standardized Gateway Device Protocol",
              September 2006.

Author's Address

   Alain Durand (editor)
   Juniper Networks
   1194 North Mathilda Avenue
   Sunnyvale, CA 94089-1206
   USA

   Email: adurand@juniper.net