2	Network Working Group                                       R. Bush, Ed.
3	Internet-Draft                                 Internet Initiative Japan
4	Intended status: Standards Track                        October 27, 2009
5	Expires: April 30, 2010

7	             The A+P Approach to the IPv4 Address Shortage
8	                          draft-ymbk-aplusp-05

10	Status of this Memo

12	   This Internet-Draft is submitted to IETF in full conformance with the
13	   provisions of BCP 78 and BCP 79.  This document may not be modified,
14	   and derivative works of it may not be created, and it may not be
15	   published except as an Internet-Draft.

17	   Internet-Drafts are working documents of the Internet Engineering
18	   Task Force (IETF), its areas, and its working groups.  Note that
19	   other groups may also distribute working documents as Internet-
20	   Drafts.

22	   Internet-Drafts are draft documents valid for a maximum of six months
23	   and may be updated, replaced, or obsoleted by other documents at any
24	   time.  It is inappropriate to use Internet-Drafts as reference
25	   material or to cite them other than as "work in progress."

27	   The list of current Internet-Drafts can be accessed at
28	   http://www.ietf.org/ietf/1id-abstracts.txt.

30	   The list of Internet-Draft Shadow Directories can be accessed at
31	   http://www.ietf.org/shadow.html.

33	   This Internet-Draft will expire on April 30, 2010.

35	Copyright Notice

37	   Copyright (c) 2009 IETF Trust and the persons identified as the
38	   document authors.  All rights reserved.

40	   This document is subject to BCP 78 and the IETF Trust's Legal
41	   Provisions Relating to IETF Documents in effect on the date of
42	   publication of this document (http://trustee.ietf.org/license-info).
43	   Please review these documents carefully, as they describe your rights
44	   and restrictions with respect to this document.

46	Abstract

48	   We are facing the exhaustion of the IANA IPv4 free IP address pool.

50	   Unfortunately, IPv6 is not yet deployed widely enough to fully
51	   replace IPv4, and it is unrealistic to expect that this is going to
52	   change before we run out of IPv4 addresses.  Letting hosts seamlessly
53	   communicate in an IPv4-world without assigning a unique globally
54	   routable IPv4 address to each of them is a challenging problem.

56	   This draft discusses the possibility of address sharing by treating
57	   some of the port number bits as part of an extended IPv4 address
58	   (Address plus Port, or A+P).  Instead of assigning a single IPv4
59	   address to a customer device, we propose to extend the address by
60	   "stealing" bits from the port number in the TCP/UDP header, leaving
61	   the applications a reduced range of ports.  This means assigning the
62	   same IPv4 address to multiple clients (e.g., CPE, mobile phones),
63	   each with its assigned port-range.  In the face of IPv4 address
64	   exhaustion, the need for addresses is stronger than the need to be
65	   able to address thousands of applications on a single host.  If
66	   address translation is needed, the end-user should be in control of
67	   the translation process - not some smart boxes in the core.

69	Requirements Language

71	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
72	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
73	   document are to be interpreted as described in RFC 2119 [RFC2119].

75	Table of Contents

77	   1.  Introduction
78	     1.1.  Why Carrier Grade NATs are Harmful
79	   2.  Design Constraints and Assumptions
80	     2.1.  Design constraints
81	     2.2.  Terminology
82	   3.  Overview of the A+P Solution
83	     3.1.  Signaling
84	     3.2.  Address realm
85	     3.3.  Reasons for allowing multiple A+P gateways
86	   4.  Deployment Scenarios
87	     4.1.  A+P for Broadband Providers
88	     4.2.  A+P for Mobile Providers
89	     4.3.  A+P from the provider network perspective
90	     4.4.  Dynamic allocation of port ranges
91	     4.5.  Overall A+P architecture
92	     4.6.  Example of A+P-forwarded packets
93	     4.7.  Forwarding of standard packets
94	     4.8.  Handling ICMP
95	     4.9.  Limitations of the A+P approach
96	   5.  IANA Considerations
97	   6.  Security Considerations
98	   7.  Authors
99	   8.  Acknowledgments
100	   9.  References
101	     9.1.  Normative References
102	     9.2.  Informative References
103	   Author's Address

105	1.  Introduction

107	   This document describes a technique to deal with the imminent IPv4
108	   address space exhaustion.  Many large Internet Service Providers
109	   (ISPs) face the problem that their networks' customer edges are so
110	   large that it will soon not be possible to provide each customer with
111	   a unique IPv4 address.  Therefore these ISPs have to devise something
112	   more ingenious.  Although undesirable, address sharing, a la NAT, is
113	   inevitable.

115	   To allow end-to-end connectivity between IPv4 speaking applications
116	   we propose to "steal" some bits from the UDP/TCP header and use them
117	   to extend addressing of devices.  Assuming we could limit the
118	   applications' port addressing to 8 (or 4) bits, we can increase the
119	   effective size of an IPv4 address by 8 (or 12) additional bits.  In
120	   this scenario, 256 (or 4096) customers could be multiplexed on the
121	   same IPv4 address, while allowing them a fixed range of 256 (or 16)
122	   ports.  Customers that require larger port-ranges could dynamically
123	   request additional blocks, depending on their contract.  We call this
124	   "extended addressing" or "A+P" (Address plus Port) addressing.  The
125	   main advantage of A+P is that it preserves the Internet "end-to-end"
126	   paradigm by not translating (at least some ports of) an IP address.
127	   With NAT in the core of the network, this end-to-end connectivity is
128	   broken.  A customer who deploys a NAT on his/her own premises makes
129	   that choice freely.  That is no longer the case in the face of the
130	   looming IPv4 address exhaustion, where so-called Carrier Grade NATs
131	   (CGNs) might be deployed within the provider's network, beyond the
132	   control of the customer.  CGNs come with different names and in
133	   different flavors, such as NAT444, Large Scale NATs (LSNs), or
134	   Address Family Transition Routers (AFTRs).
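
   The bit-stealing arithmetic described in this section can be checked
   with a short sketch.  It assumes the 16-bit TCP/UDP port field and a
   fixed split of the port bits; the function name share_plan is ours
   and is not defined by this draft.

```python
from typing import Tuple

PORT_BITS = 16  # width of the TCP/UDP port field


def share_plan(app_port_bits: int) -> Tuple[int, int]:
    """Return (customers_per_address, ports_per_customer).

    app_port_bits is the number of port bits left to applications; the
    remaining port bits are "stolen" to extend the IPv4 address.
    """
    if not 0 <= app_port_bits <= PORT_BITS:
        raise ValueError("app_port_bits must be between 0 and 16")
    stolen_bits = PORT_BITS - app_port_bits
    return 2 ** stolen_bits, 2 ** app_port_bits


for bits in (8, 4):
    customers, ports = share_plan(bits)
    print(f"{bits} application bits: {customers} customers x {ports} ports")
```

   Leaving 8 bits to applications yields 256 customers with 256 ports
   each; leaving 4 bits yields 4096 customers with 16 ports each.  A
   512-port range, the size used in Figure 3 (ports 2560-3071),
   corresponds to 9 application bits and 128 customers per address.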

136	1.1.  Why Carrier Grade NATs are Harmful

138	   Various forms of NATs will be installed at various levels and places
139	   in the IPv4-Internet to achieve address compression.  This document
140	   argues for mechanisms where this happens as close to the edge as
141	   possible, thereby minimizing damage to the End to End Principle.
142	   End-customers will not be locked into a walled-garden without any
143	   control over the translation.  It is essential to create
144	   mechanisms to "bypass" NATs in the core, and keep the control at the
145	   end-user:

147	   "Carrier grade" is a euphemism for centralized.  More semantics move
148	   to the core of the network.  This is bad in and of itself.  Net-heads
149	   call it "telco-think" because it is the telco model of smarts in the
150	   core as opposed to the Internet model of a simple, just-forward-
151	   packets core, with smart edges.  It also puts the provider in a
152	   position where the user is trapped behind unchangeable application
153	   policies, and it risks involving lawyers when users wish to
154	   deploy new applications needing Application Level Gateways (ALGs).
155	   This is the opposite of the "end-to-end" model of the Internet.

157	   With the smarts at the edges, one can easily field new protocols
158	   between consenting end-points by merely tweaking the NATs at the
159	   corresponding Customer Premises Equipment (CPE), even adding
160	   application layer gateways if they are needed.

162	   Today's NATs are typically mitigated by ALGs over which the customer
163	   has control, e.g. port forwarding or UPnP/NAT-PMP.  However, this is
164	   not expected to work with CGNs.  CGN proposals - other than DS-Lite
165	   [I-D.ietf-softwire-dual-stack-lite] with A+P - admit that it is not
166	   expected that applications that require specific port assignment or
167	   port mapping from the NAT box will keep working.  This is the
168	   ultimate horror the NAT-haters fear, and, in this case, they are not
169	   all that wrong.

171	   We believe this CGN approach is not an option and that the end-user
172	   must have the ability to control their own ALGs.  With CGN, if a user
173	   wishes to deploy a new application, they must talk to the providers'
174	   lawyers or run new disruptive technology over HTTP; we can pick our
175	   poison.  And if the NAT is not where the customer can directly
176	   control it, i.e., it is anywhere in the provider's network, then the
177	   provider controls what the user can control, i.e. it is not really
178	   under user control.  We do not wish to deal with the case where the
179	   provider has to decide whether to allow Skype v42 when they
180	   themselves provide a competing VoIP product.

182	   Another issue with CGN is scalability.  ISPs face a tension: CGNs
183	   should be placed within the network so as to aggregate as much as
184	   possible, yet too much aggregation creates a massive state problem.
185	   CGNs also present a single point of failure.  And having a back-up
186	   CGN has the state transfer problem as well as exposure to network
187	   partition and dual-device failure.  When you start talking about
188	   'high reliability/availability', you have already lost the game.
189	   The Internet is about building a reliable network using unreliable
190	   devices.

192	   To reduce the state, NAT placement ends up as CGNs somewhere closer
193	   to the edge.  It is not clear how a CGN should maintain per-session
194	   state in a scalable manner.  State for improperly terminated sessions
195	   could remain stale for some time.  The CGN hence trades scalability
196	   for the amount of state that needs to be kept, which makes optimally
197	   placing a CGN a hard engineering problem.

199	   Furthermore, with CGN, tracing hackers, spammers and other criminals
200	   will be impossible, unless all the connection based mapping
201	   information is recorded and stored.  This would not only cause
202	   concern for law enforcement services, but also for privacy advocates.

204	2.  Design Constraints and Assumptions

206	   The problem of address space shortage is first felt by providers with
207	   a very large end-user customer base, such as broadband providers and
208	   mobile-service providers.  Though the cases and requirements are
209	   slightly different, they share many commonalities.  In the following
210	   we will develop a set of overall design constraints.

212	2.1.  Design constraints

214	   We regard several constraints as important for our design:

216	   1)      End-to-End is under customer control: Customers shall have
217	           the ability to deploy new application protocols at will.
218	           IPv4 address shortage should not be a license to break the
219	           Internet's end-to-end paradigm.

221	   2)      End-to-End transparency through multiple intermediate
222	           devices: Multiple gateways should be able to operate in
223	           sequence along one data path without interfering with each
224	           other.

226	   3)      Backward compatibility: Approaches should be transparent to
227	           unaware users.  Devices or existing applications should be
228	           able to work without modification.  Emergence of new
229	           applications should not be limited.

231	   4)      Incrementally deployable: The provider should not be forced
232	           to replace unaffected core devices or replace customer
233	           premises equipment (CPE).  In particular, the provider should
234	           be able to change only CPE where they wish to deploy A+P. And
235	           customers should be able to acquire A+P aware CPE at will.

237	   5)      Highly-scalable and minimal state core: Minimal state should
238	           be kept inside the ISP's network.  If the operator is rolling
239	           out A+P incrementally, it is understood there may be state in
240	           the core in the non-A+P part of such a roll-out.

242	   6)      Efficiency vs. complexity: Operators should have the
243	           flexibility to trade off port multiplexing efficiency and
244	           scalability and end-to-end transparency.

246	   7)      Automatic configuration/administration: There should be no
247	           need for customers to call the ISP and tell them that they
248	           are operating their own A+P-gateway devices.  Customers/
249	           mobile phone users should not be expected to look-up assigned
250	           ports manually on websites and then configure them on devices
251	           or applications.

253	   8)      "Double-NAT" should be avoided: Based on Constraint 2
254	           multiple gateway devices might be present in a path, and once
255	           one has done some translation, those packets should not be
256	           re-translated.

258	   9)      Legal traceability: ISPs must be able to provide the identity
259	           of a customer from the knowledge of the IPv4 public address
260	           and the port.  This should have as low an impact as is
261	           reasonable on storage by the ISP.  We assume that NATs on
262	           customer premises do not pose much of a problem, while
263	           provider NATs need to keep additional logs.

265	   10)     IPv6 deployment should be encouraged.  NAT444 strongly biases
266	           the users to the deployment of RFC 1918 addressing.  A+P
267	           should not.  While we acknowledge that A+P might be used in
268	           an IPv4-only environment (e.g., [I-D.boucadair-port-range])
269	           we strongly believe that IPv6 is the best long-term approach,
270	           and that A+P should be considered only as an intermediate
271	           hack towards an IPv6-only world.  We therefore prefer to
272	           assume in Constraint 10 that the ISP has migrated to a dual-
273	           stack core and A+P can use IPv6 as a transport inside the
274	           network.  This ensures that A+P will not be a hindrance to
275	           the introduction of IPv6.

277	   Constraints 2 and 8 are important: while many techniques have been
278	   deployed to allow applications to work through a NAT, traversing
279	   cascaded NATs is crucial if NATs are being deployed in the core of a
280	   provider network.

282	2.2.  Terminology

284	   The A+P architecture can be split into three distinct functions:
285	   encaps/decaps, NAT, and signaling.

287	   Encaps/decaps function: is used to forward port-restricted A+P-
288	   packets over intermediate legacy devices.  The encapsulation function
289	   takes an IPv4 packet, looks up the IP and TCP/UDP headers, and puts
290	   the packet into the appropriate tunnel.  The state needed to perform
291	   this action is comparable to a forwarding table.  The decapsulation
292	   device SHOULD check if the source address and port of packets coming
293	   out of the tunnel are legitimate (e.g., see [BCP38]).  Based on the
294	   result of such a check, the packet MAY be forwarded untranslated, it
295	   MAY be discarded or MAY be NATed.  In this draft we refer to a device
296	   that provides this encaps/decaps functionality as Port-Range-Router
297	   (PRR).
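
   As an illustration only, the forwarding-table-like state of a PRR
   could be sketched as below.  The class and method names, the tunnel
   identifiers, and the linear search are assumptions made for this
   sketch; the address is taken from the documentation range.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import List, Optional


@dataclass(frozen=True)
class PortRangeRoute:
    address: IPv4Address  # shared public IPv4 address
    first_port: int       # first port of the range
    last_port: int        # last port of the range (inclusive)
    tunnel: str           # hypothetical tunnel identifier


class PortRangeRouter:
    """Toy PRR: picks a tunnel from an IPv4 address and a port."""

    def __init__(self) -> None:
        self._routes: List[PortRangeRoute] = []

    def add_route(self, route: PortRangeRoute) -> None:
        self._routes.append(route)

    def lookup(self, address: str, port: int) -> Optional[str]:
        """Return the tunnel owning (address, port), or None."""
        addr = IPv4Address(address)
        for r in self._routes:
            if r.address == addr and r.first_port <= port <= r.last_port:
                return r.tunnel
        return None

    def source_is_legitimate(self, tunnel: str, src: str, src_port: int) -> bool:
        """Decapsulation check: does the source A+P belong to this tunnel?"""
        return self.lookup(src, src_port) == tunnel


prr = PortRangeRouter()
prr.add_route(PortRangeRoute(IPv4Address("192.0.2.1"), 2560, 3071, "tunnel-1"))
prr.add_route(PortRangeRoute(IPv4Address("192.0.2.1"), 3072, 3583, "tunnel-2"))
```

   Inbound packets are encapsulated into the tunnel returned by
   lookup() for their destination address and port.  For packets coming
   out of a tunnel, source_is_legitimate() decides whether the packet
   may be forwarded untranslated; what happens otherwise (discard or
   NAT) is left to the gateway's policy.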

299	   Network Address Translation (NAT) function: is used to connect legacy
300	   end-hosts.  Unless upgraded, end-hosts or end-systems are not aware
301	   of A+P restrictions and therefore assume a full IP address.  The NAT
302	   function performs any address or port translation, including
303	   application-level-gateways (ALGs).  The state that has to be kept to
304	   implement this function is the mapping for which external addresses
305	   and ports have been mapped to which internal addresses and ports,
306	   just as in CPE NATs today.  A subtle, but very important, difference
307	   should be noted here: the customer has control over the NATing
308	   process or might choose to "bypass" the NAT.  If this is the case, we
309	   call the NAT a large scale NAT (LSN).  However, if the NAT does
310	   NOT allow the customer to control the translation process, we refer
311	   to it as a CGN.

313	   Signaling function: is used to allow A+P-aware devices to learn
314	   which ports are assigned to be passed through untranslated
315	   and what will happen to packets outside the assigned port-range
316	   (e.g., could be NATed or discarded).  Signaling may also be used to
317	   learn the encapsulation method and any endpoint information needed.
318	   In addition, the signaling function may be used to dynamically
319	   increase/decrease the requested port-range.

321	   A+P address realm: a public routable IPv4 address that is port
322	   restricted (A+P).  Forwarding of packets is done based on the IPv4
323	   address and the TCP/UDP port numbers.  When this draft talks about
324	   "A+P packets" it is assumed that those packets pass untranslated.

326	   Private address realm: IPv4 addresses that are not globally routed.
327	   They may be taken from the [RFC1918] range.  However, this draft does
328	   not make such an assumption.  We regard as private address space any
329	   IPv4 address that needs to be translated in order to gain global
330	   connectivity, irrespective of whether it falls in [RFC1918] space or
331	   not.

333	3.  Overview of the A+P Solution

335	   The core architectural elements of the A+P solution are three
336	   separated and independent functions: the NAT function, the encaps/
337	   decaps function, and the signaling function.  The NAT function is
338	   similar to a NAT as we know it today: it performs a translation
339	   between two different address realms.  When the external realm is
340	   public IPv4 address space, we assume that the translation is many-to-
341	   one, in order to multiplex many customers on a single public IPv4
342	   address.  The only difference with a traditional NAT (Figure 1) is
343	   that the translator might only be able to use a restricted range of
344	   ports when mapping multiple internal addresses onto an external one,
345	   i.e., the external address realm might be port-restricted.

347	                    "internal-side"          "external-side"
348	                                   +-----+
349	                      internal     |  N  |     external
350	                      address  <---|  A  |---> address
351	                       realm       |  T  |      realm
352	                                   +-----+

354	                              Traditional NAT

356	                                 Figure 1

358	   The encaps/decaps function, on the other hand, is the ability to
359	   establish a tunnel with another end-point providing the same
360	   function.  This implies some form of signaling to establish a tunnel.
361	   Such signaling can be viewed as integrated with DHCP or as a separate
362	   service.  Section 3.1 discusses the constraints of this signaling
363	   function.  The tunnel can be an IPv6 or IPv4 encapsulation, a layer-2
364	   tunnel, or some other form of softwire.  Note that the presence of a
365	   tunnel allows unmodified, naive, or even legacy devices between the
366	   two endpoints.

368	   Two or more devices which provide the encaps/decaps function and are
369	   linked by tunnels form an A+P subsystem.  The function of each
370	   gateway is to encapsulate and decapsulate respectively.  Figure 2
371	   depicts the simplest possible A+P subsystem, that is, two devices
372	   providing the encaps/decaps function.

374	                      +------------------------------------+
375	     port-restricted  | +----------+  tunnel  +----------+ |   external
376	      address realm --|-| gateway  |==========| gateway  |-|-- address
377	                      | +----------+          +----------+ |    realm
378	                      +------------------------------------+
379	                                  A+P subsystem

381	                          A simple A+P subsystem

383	                                 Figure 2

385	   Within an A+P subsystem, the external address realm is extended by
386	   "stealing" bits from the port number.  Each device is assigned one
387	   address from the external realm and a range of port numbers.  Hence,
388	   devices which are part of an A+P subsystem can communicate with the
389	   external address realm without the need for address translation
390	   (i.e., preserving end-to-end packet integrity): an A+P packet
391	   originated from within the A+P subsystem can be simply forwarded over
392	   tunnels up to the endpoint, where it gets decapsulated and routed in
393	   the external realm.

395	3.1.  Signaling

397	   The following information needs to be available on all the gateways
398	   in the A+P subsystem.  It is expected that there will be a signaling
399	   protocol such as [I-D.bajko-pripaddrassign],
400	   [I-D.boucadair-dhcpv6-shared-address-option], or
401	   [I-D.boucadair-pppext-portrange-option].  The information that needs
402	   to be shared is the following:

404	   o  a set of public IPv4 addresses,

406	   o  for each IPv4 address a starting point for the allocated port-
407	      range,

409	   o  number of delegated ports,

411	   o  optional key that enables partial or full preservation of entropy
412	      in port randomization - see [I-D.bajko-pripaddrassign],

414	   o  lifetime for each IPv4 address and allocated port-set,

416	   o  the tunneling technology to be used (e.g., "IPv6-encapsulation")

418	   o  addresses of the tunnel endpoints (e.g., IPv6 address of tunnel
419	      endpoints)

421	   o  whether or not NAT function is provided by the gateway

423	   o  a device identification number and some authentication mechanisms

425	   o  a version number and some reserved bits for future use.
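
   To make the list above concrete, the shared information could be
   modeled roughly as follows.  This is a sketch only: the field names
   and types are ours, the authentication mechanism and reserved bits
   are omitted, and the on-the-wire encoding is left to the signaling
   protocols cited above.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv6Address
from typing import List, Optional


@dataclass
class PortRangeGrant:
    address: IPv4Address                      # public IPv4 address
    first_port: int                           # start of the port-range
    port_count: int                           # number of delegated ports
    lifetime_seconds: int                     # lifetime of this grant
    randomization_key: Optional[bytes] = None  # optional entropy key

    def __post_init__(self) -> None:
        if self.port_count < 1 or self.first_port < 0:
            raise ValueError("port range must be non-empty and non-negative")
        if self.first_port + self.port_count > 65536:
            raise ValueError("port range exceeds the 16-bit port space")

    def ports(self) -> range:
        """All ports covered by this grant."""
        return range(self.first_port, self.first_port + self.port_count)


@dataclass
class SignalingInfo:
    grants: List[PortRangeGrant]
    tunnel_technology: str               # e.g., "IPv6-encapsulation"
    tunnel_endpoints: List[IPv6Address]  # addresses of the tunnel endpoints
    gateway_provides_nat: bool           # is the NAT function offered?
    device_id: int                       # device identification number
    version: int = 1


info = SignalingInfo(
    grants=[PortRangeGrant(IPv4Address("192.0.2.1"), 2560, 512, 3600)],
    tunnel_technology="IPv6-encapsulation",
    tunnel_endpoints=[IPv6Address("2001:db8::1")],
    gateway_provides_nat=True,
    device_id=42,
)
```

   Keeping the lifetime on each grant, rather than on the whole record,
   mirrors the requirement that each address and port-set carries its
   own lifetime.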

427	   Note that the functions of encapsulation and decapsulation have been
428	   separated from the NAT function.  However, to accommodate legacy
429	   hosts, NATing is likely to be provided at some point in the path;
430	   therefore the availability or absence of NATing MUST be communicated
431	   in signaling, as A+P is agnostic about NAT placement.

433	   The port-ranges can be allocated in two different ways:

435	   o  If applications or end-hosts behind the CPE are not UPnPv2/NAT-PMP
436	      aware, then the CPE SHOULD request ports via mechanisms, e.g. as
437	      described in [I-D.bajko-pripaddrassign] and
438	      [I-D.boucadair-pppext-portrange-option].  Note that different
439	      port-ranges can have different lifetimes, and the CPE is not
440	      entitled to use them after they expire - unless it refreshes those
441	      ranges.  It is up to the ISP to put mechanisms in place that
442	      determine what percentage of already allocated port-ranges should
443	      be exhausted before a CPE may request additional ranges, how
444	      often the CPE can request additional ranges, and so on (to
445	      prevent Denial of Service attacks).

447	   o  If applications behind the CPE are UPnPv2/NAT-PMP aware additional
448	      ports MAY be requested through that mechanism.  In this case the
449	      CPE should forward those requests to the LSN and the LSN should
450	      reply reporting if the requested ports are available or not (and
451	      if they are not available some alternatives should be offered).
452	      Here again, to prevent potential denial of service attacks,
453	      mechanisms should be in place to prevent UPnPv2/NAT-PMP packet
454	      storms and fast port allocation.

456	   Whatever signaling mechanism is used inside the tunnels, DHCP or IPCP
457	   based, synchronization between signaling server and PRR must be
458	   established in both directions.  For example, if we use DHCP as the
459	   signaling mechanism, the PRR must communicate to the DHCP server at
460	   least its IP range.  The DHCP server then starts to allocate IPs and
461	   port-ranges to CPEs and communicates back to the PRR which IP and
462	   port range have been allocated to which CPE, so the PRR knows to
463	   which tunnel to redirect incoming traffic.  In addition, DHCP MUST
464	   communicate lifetimes of port-ranges assigned to CPE via the PRR.
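
   The synchronization described above might look roughly like the
   following sketch of a signaling server for one shared address.  All
   names are hypothetical; fixed-size ranges, starting the pool at port
   1024, and passing the current time explicitly are assumptions of
   this sketch, not requirements of the draft.

```python
from typing import Dict, List, Optional, Tuple


class PortRangeAllocator:
    """Toy signaling server for the port-ranges of one shared address."""

    def __init__(self, address: str, range_size: int = 512,
                 first_usable: int = 1024) -> None:
        self.address = address
        self.range_size = range_size
        # Free ranges, identified by their first port.
        self._free: List[int] = list(
            range(first_usable, 65536 - range_size + 1, range_size))
        # first port -> (CPE identifier, expiry time)
        self._leases: Dict[int, Tuple[str, float]] = {}

    def expire(self, now: float) -> None:
        """Return ranges whose lifetime has run out to the free pool."""
        for first, (_, expiry) in list(self._leases.items()):
            if expiry <= now:
                del self._leases[first]
                self._free.append(first)

    def allocate(self, cpe_id: str, lifetime: float,
                 now: float) -> Optional[Tuple[int, int]]:
        """Lease one range to a CPE; None if the address is exhausted."""
        self.expire(now)
        if not self._free:
            return None
        first = self._free.pop(0)
        self._leases[first] = (cpe_id, now + lifetime)
        return first, first + self.range_size - 1

    def refresh(self, cpe_id: str, first_port: int, lifetime: float,
                now: float) -> bool:
        """Extend a lease; fails if it expired or belongs to another CPE."""
        self.expire(now)
        lease = self._leases.get(first_port)
        if lease is None or lease[0] != cpe_id:
            return False
        self._leases[first_port] = (cpe_id, now + lifetime)
        return True

    def prr_table(self, now: float) -> Dict[Tuple[int, int], str]:
        """What is reported to the PRR: port-range -> CPE (i.e., tunnel)."""
        self.expire(now)
        return {(first, first + self.range_size - 1): cpe
                for first, (cpe, _) in self._leases.items()}
```

   In this sketch a CPE that does not refresh a range before its
   lifetime ends simply disappears from prr_table(), so the PRR stops
   redirecting traffic for that range to the CPE's tunnel.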

466	   If UPnPv2/NAT-PMP is used as the dynamic port allocation mechanism,
467	   the PRR must also tell the DHCP (or IPCP) server to avoid those
468	   ports.  The PRR must somehow (DHCP or IPCP options) communicate back
469	   to the CPE that allocation of ports was successful, so the CPE adds
470	   those ports to its existing port-ranges.

472	3.2.  Address realm

474	   Each gateway within the A+P subsystem manages a certain portion of
475	   A+P address space, that is, a portion of IPv4 space which is extended
476	   by borrowing bits from the port number.  This address space may be a
477	   single, port-restricted IPv4 address.  The gateway MAY use its
478	   managed A+P address space for several purposes:

480	   o  Allocation of a sub-portion of the A+P address space to other
481	      authenticated A+P gateways in the A+P subsystem (referred to as
482	      delegation).  We call the allocated sub-portion delegated address
483	      space.

485	   o  Exchange of (untranslated) packets with the external address
486	      realm.  For this to work, such packets MUST use source address and
487	      port belonging to the non-delegated address space.

489	   If the gateway is also capable of performing the NAT function, it MAY
490	   translate packets arriving on an internal interface which are outside
491	   of its managed A+P address space into non-delegated address space.

493	   Hence, a provider may have 'islands' of A+P as they slowly deploy
494	   over time.  The provider does not have to replace CPE until they want
495	   to provide the A+P function to an island of users or even to one
496	   particular user in a sea of non-A+P users.

498	   An A+P gateway ("A") accepts incoming connections from other A+P
499	   gateways ("B").  Upon connection establishment (provided appropriate
500	   authentication), B would "ask" A for delegation of an A+P address.
501	   In turn, A will inform B about its public IPv4 address, and will
502	   delegate a portion of its port-range to B. In addition, A will also
503	   negotiate the encaps/decaps function with B (e.g., let B know the
504	   address of the decaps device/other-end-point of the tunnel).

506	   This could be implemented for example via a NAT-PMP or DHCP-like
507	   solution.  In general the following rule applies: A sub-portion of
508	   the managed A+P address space is delegated as long as devices below
509	   ask for it, otherwise private IPv4 is provided to support legacy
510	   hosts.
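
   The delegation step can be sketched as follows.  The class name, the
   first-fit placement of sub-ranges, and the omission of
   authentication and of the private-IPv4 fallback are simplifications
   assumed for this sketch.

```python
from typing import List, Optional, Tuple

PortRange = Tuple[int, int]  # inclusive (first, last)


class APlusPGateway:
    """Toy gateway managing one port-restricted IPv4 address."""

    def __init__(self, address: str, managed: PortRange) -> None:
        self.address = address
        self.managed = managed
        self.delegated: List[PortRange] = []

    def delegate(self, size: int) -> Optional[PortRange]:
        """Delegate the lowest free sub-range of `size` ports, if any."""
        first = self.managed[0]
        for lo, hi in sorted(self.delegated):
            if lo - first >= size:
                break  # the gap before this delegation is large enough
            first = max(first, hi + 1)
        last = first + size - 1
        if last > self.managed[1]:
            return None
        self.delegated.append((first, last))
        return first, last

    def is_delegated(self, port: int) -> bool:
        return any(lo <= port <= hi for lo, hi in self.delegated)

    def is_non_delegated(self, port: int) -> bool:
        """True if the gateway may use this port for its own traffic."""
        lo, hi = self.managed
        return lo <= port <= hi and not self.is_delegated(port)
```

   With A managing ports 0-65535 of one address, the sixth 512-port
   delegation made by this sketch is 2560-3071, the range held by B in
   Figure 3.  B can in turn be modeled as an APlusPGateway managing
   (2560, 3071) and delegating parts of it to gateways further down.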

512	              private    +-----+          +-----+     public
513	              address ---|  B  |==========|  A  |---  Internet
514	               realm     +-----+          +-----+

516	                         Address space realm of A:
517	                         public IPv4 address = 12.0.0.1
518	                         port range = 0-65535

520	                         Address space realm of B:
521	                         public IPv4 address = 12.0.0.1
522	                         port range = 2560-3071

524	                                 Figure 3

526	   Figure 3 illustrates a sample configuration.  Note that A might
527	   actually consist of three different devices: one that handles
528	   signaling requests from B; one device that performs encapsulation and
529	   decapsulation; and, if provided, one device that performs the NATing
530	   function (e.g., LSN).  Packet forwarding is assumed to be as follows:
531	   In the "out-bound" case, a packet arrives from the private address
532	   realm to B. As stated above, B has two options: it can either apply
533	   or not apply the NAT function.  The decision depends upon the
534	   specific configuration and/or the capabilities of A and B. Note that
535	   NAT functionality is required to support legacy hosts; however, this
536	   can be done at either of the two devices A or B. The term NAT refers
537	   to translating the packet into the managed A+P address (B has address
538	   12.0.0.1 and ports 2560-3071 in the example above).  We then have two
539	   options:

541	   1)  B NATs the packet.  The translated packet is then tunneled to A.
542	       A recognizes that the packet has already been translated, because
543	       the source address and port match the delegated space.  A
544	       decapsulates the packet and releases it in the public Internet.

546	   2)  B does not NAT the packet.  The untranslated packet is then
547	       tunneled to A. A recognizes that the packet has not been
548	       translated, so A forwards the packet to a co-located NATing
549	       device, which translates the packet and routes it in the public
550	       Internet.  This device, e.g., an LSN, has to store the mapping
551	       between the source port used to NAT and the tunnel where the
552	       packet came from, in order to correctly route the reply.  Note
553	       that A cannot use a port number from the range that has been
554	       delegated to B. As a consequence A has to assign a part of its
555	       non-delegated address space to the NATing function.
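
   The two outbound options above can be illustrated with a minimal
   Python sketch of the check that A performs after decapsulation.  The
   names Delegation and classify_outbound are hypothetical and are not
   defined by A+P; the delegated values are those of Figure 3, and the
   private source address 10.0.0.2 is an assumed example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delegation:
    """Address space delegated by A to B (hypothetical helper type)."""
    address: str
    ports: range


def classify_outbound(src_addr: str, src_port: int, delegation: Delegation) -> str:
    """Return what A does with a packet decapsulated from B's tunnel."""
    if src_addr == delegation.address and src_port in delegation.ports:
        # Option 1: B already translated the packet into delegated space.
        return "forward"
    # Option 2: untranslated packet; the co-located NAT (e.g., an LSN)
    # must translate it using ports from A's non-delegated space.
    return "nat"


# Values from Figure 3: B holds 12.0.0.1, ports 2560-3071.
B = Delegation("12.0.0.1", range(2560, 3072))
print(classify_outbound("12.0.0.1", 2600, B))   # forward
print(classify_outbound("10.0.0.2", 8000, B))   # nat
```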

557	   "Inbound" packets are handled in the following way: a packet from the
558	   public realm arrives at A. A analyzes the destination port number to
559	   understand whether the packet needs to be NATed or not.

561	   1)  If the destination port number belongs to the range that A
562	       delegated to B, then A tunnels the packet to B. B NATs the packet
563	       using its stored mapping and forwards the translated packet to
564	       the private domain.

566	   2)  If the destination port number is from the address space of the
567	       LSN, then A passes the packet on to the co-located LSN which uses
568	       its stored mapping to NAT the packet into the private address
569	       realm of B. The appropriate tunnel is stored as well in the
570	       mapping of the initial NAT.  The LSN then encapsulates the packet
571	       to B, which decapsulates it and normally routes it within its
572	       private realm.

574	   3)  Finally, if the destination port number neither falls in a
575	       delegated range, nor into the address range of the LSN, A
576	       discards the packet.  If the packet is passed to the LSN, but no
577	       mapping can be found, the LSN discards the packet.
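
   The three inbound cases above can be sketched as a single dispatch
   function.  This is an illustration only: the tunnel name, the LSN
   port range (10000-19999), and the single LSN mapping are assumptions
   made for the example, not values taken from this document.

```python
from typing import Dict, Optional, Tuple

# Hypothetical state held by gateway A.
DELEGATIONS: Dict[str, range] = {"tunnel-to-B": range(2560, 3072)}
LSN_PORTS = range(10000, 20000)   # assumed part of A's non-delegated space
# LSN mapping: external port -> (private address, private port, tunnel)
LSN_MAP: Dict[int, Tuple[str, int, str]] = {
    10001: ("10.0.0.2", 8000, "tunnel-to-B"),
}


def dispatch_inbound(dst_port: int) -> Tuple[str, Optional[object]]:
    """Decide how A handles a packet arriving from the public realm."""
    # Case 1: the port was delegated, so tunnel the packet unmodified.
    for tunnel, ports in DELEGATIONS.items():
        if dst_port in ports:
            return ("tunnel", tunnel)
    # Case 2: the port belongs to the LSN, which needs a stored mapping.
    if dst_port in LSN_PORTS:
        mapping = LSN_MAP.get(dst_port)
        if mapping is None:
            return ("discard", None)   # LSN has no mapping
        return ("lsn", mapping)
    # Case 3: neither delegated nor LSN space.
    return ("discard", None)
```

   Because the mapping stores the tunnel together with the private
   address and port, the LSN knows where to send the translated packet
   without a separate lookup.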

579	3.3.  Reasons for allowing multiple A+P gateways

581	   Since each device in an A+P subsystem provides the encaps/decaps
582	   function, new devices can establish tunnels and become in turn part
583	   of an A+P subsystem.  As noted above, being part of an A+P subsystem
584	   implies the capability of talking to the external address realm
585	   without any translation.  In particular, as described in the previous
586	   section, a device X in an A+P subsystem can be reached from the
587	   external domain by simply using the public IPv4 address and a port
588	   which has been delegated to X. Figure 4 shows an example where three
589	   devices are connected in a chain.  In other words, A+P signaling can
590	   be used to extend end-to-end connectivity to the devices which are in
591	   an A+P subsystem.  This allows A+P-aware applications (or OSes)
592	   running on end hosts to enter an A+P subsystem and exploit
593	   untranslated connectivity.

595	   There are two modes for end-hosts to gain fine-grained control of
596	   end-to-end connectivity.  The first is where actual end-hosts perform
597	   the NAT function and the encaps/decaps function which is required to
598	   join the A+P subsystem.  This option works in a similar way to the
599	   NAT-in-the-host trick employed by virtualization software such as
600	   VMware, where the guest operating system is connected via a NAT to
601	   the host operating system.  In the second mode, applications
602	   autonomously ask for an A+P address and use it to join the A+P
603	   subsystem.  This capability is necessary for some applications that
604	   require end-to-end connectivity (e.g., applications that need to be
605	   contacted from outside).

607	               +---------+      +---------+      +---------+
608	     internal  | gateway |      | gateway |      | gateway |  external
609	     realm   --|    1    |======|    2    |======|    3    |-- realm
610	               +---------+      +---------+      +---------+

612	                  An A+P subsystem with multiple devices

614	                                 Figure 4

616	   Whatever the reasons might be, the Internet was built on a paradigm
617	   that end-to-end connectivity is important.  A+P makes this still
618	   possible in a time where address shortage forces ISPs to use NATs at
619	   various levels.  In this sense, A+P can be regarded as a way to
620	   bypass NATs.

622	              +---+          (customer2)
623	              |A+P|-.         +---+
624	              +---+  \     NAT|A+P|-.
625	                      \       +---+ |
626	                       \            |       forward if in-range
627	              +---+     \+---+    +---+    /
628	              |A+P|------|A+P|----|A+P|----
629	              +---+     /+---+    +---+    \
630	                       /                    NAT if necessary
631	                      / (cust1)   (prov.    (e.g., provider NAT)
632	              +---+  /            router)
633	              |A+P|-'
634	              +---+

636	                          A complex A+P subsystem

638	                                 Figure 5

640	   Figure 5 depicts a complex scenario, where the A+P subsystem is
641	   composed of multiple devices organized in a hierarchy.  Each A+P
642	   gateway decapsulates the packet and then re-encapsulates it again to
643	   the next tunnel.

645	   A packet can either be NATed when it enters the A+P subsystem, or at
646	   intermediate devices, or when it exits the A+P subsystem.  This could
647	   be for example a gateway installed within the provider's network,
648	   together with an LSN.  Then each customer operates its own CPE.
649	   However, behind the CPE applications might also be A+P-aware and run
650	   their own A+P-gateways, which enables them to have end-to-end
651	   connectivity.

653	   One limitation applies if "delayed translation" is used (e.g.,
654	   translation at the LSN instead of the CPE).  If devices using
655	   "delayed translation" want to talk to each other they SHOULD use A+P
656	   addresses or out-of-band addressing.

658	4.  Deployment Scenarios

660	4.1.  A+P for Broadband Providers

662	   Large broadband providers do not have enough IPv4 address space to
663	   provide every customer with a single IP.  The natural solution is
664	   sharing a single IP address among many customers.  Multiplexing
665	   customers is usually accomplished by allocating different port
666	   numbers to different customers somewhere within the network of the
667	   provider.

669	   In this document we use the following terms and assumptions:

671	   1.  Customer Premises Equipment (CPE), i.e., a cable/DSL modem.

673	   2.  Provider Edge Router (PE), AKA customer aggregation router

675	   3.  Port Range Router (PRR), edge behind which A+P addresses are
676	       used.

678	   4.  Provider Border Router (BR), provider's edge to other providers

680	   5.  Network Core Routers (Core), provider routers which are not at
681	       the edge.

683	   It is expected that, when the provider wishes to enable A+P for a
684	   customer or a range of customers, the CPE can be upgraded or replaced
685	   to support A+P encaps/decaps functionality.  Ideally the CPE also
686	   provides NATing functionality.  Further, it is expected that at least
687	   another component in the ISP network provides the corresponding A+P
688	   functionality, and hence is able to establish an A+P subsystem with
689	   the CPE.  This device is referred to as A+P router or port-range
690	   router (PRR), and could be located close to PE routers.  The core of
691	   the network MUST support the tunneling protocol (which SHOULD be
692	   IPv6, as per Constraint 10) but MAY be another tunneling technology
693	   when necessary.  In addition, we do not wish to restrict any
694	   initiative of customers who might want to run an A+P-capable network
695	   on or behind their CPE.  To satisfy both Constraints 1 and 3
696	   unmodified legacy hosts should keep working seamlessly, while
697	   upgraded/new end-systems should be given the opportunity to exploit
698	   enhanced features.

700	4.2.  A+P for Mobile Providers

702	   In the case of a mobile service provider, the situation is slightly
703	   different.  The A+P border is assumed to be the gateway (e.g., GGSN/
704	   PDN GW of 3GPP, or ASN GW of WiMAX).  The need to extend the address
705	   is not within the provider network, but on the edge between the
706	   mobile phone devices and the gateway.  While desirable, IPv6
707	   connectivity may or may not be provided.

709	   For mobile providers we use the following terms and assumptions:

711	   1.  Provider Network (PN)

713	   2.  Gateway (GW)

715	   3.  Mobile Phone device (phone)

717	   4.  Devices behind phone, e.g., laptop computer connecting via phone
718	       to Internet.

720	   We expect that the gateway has a pool of IPv4 addresses and is always
721	   in the data-path of the packets.  Transport between the gateway and
722	   phone devices is assumed to be an end-to-end layer-2 tunnel.  We
723	   assume that phone as well as gateway can be upgraded to support A+P.
724	   However, some applications running on the phone or devices behind the
725	   phone (such as laptop computers connecting via the phone), are not
726	   expected to be upgraded.  Again, while we do not expect that devices
727	   behind the phone will be A+P aware/upgraded we also do not want to
728	   hinder their evolution.  In this sense the mobile phone would be
729	   comparable to the CPE in the broadband provider case; the gateway to
730	   the PRR/LSN box in the network of the broadband provider.

732	4.3.  A+P from the provider network perspective

734	   ISPs suffering from IPv4 address space exhaustion are interested in
735	   achieving a high address space compression ratio.  In this respect,
736	   an A+P subsystem allows much more flexibility than traditional NATs:
737	   the NAT can be placed at the customer, and/or in the provider
738	   network.  In addition hosts or applications can request ports and
739	   thus have untranslated end-to-end connectivity.

741	                   +---------------------------+
742	        private    | +------+  A+P-in  +-----+ |   dual-stacked
743	       (RFC1918) --|-| CPE  |==-IPv6-==| PRR |-|-- network
744	         space     | +------+  tunnel  +-----+ |   (public addresses)
745	                   |    ^              +-----+ |
746	                   |    |  IPv6-only   | LSN | |
747	                   |    |   network    +-----+ |
748	                   +----+----------------- ^ --+
749	                        |                  |
750	                   on customer        within provider
751	              premises and control      network

753	                      A simple A+P subsystem example

755	                                 Figure 6

757	   Consider the deployment scenario in Figure 6, where an A+P subsystem
758	   is formed by the CPE and a port-range router (PRR) within the ISP
759	   core network, preferably close to the customer edge, and represents
760	   the border from where on packets are forwarded based on address and
761	   port.  The provider MAY deploy an LSN co-located with the PRR to
762	   handle packets that have not been translated by the CPE.  In such a
763	   configuration, the ISP allows the customer to freely decide whether
764	   the translation is done at the CPE or at the LSN.  In order to
765	   establish the A+P subsystem, the CPE will be configured automatically
766	   (e.g., via a signaling protocol that conforms to the requirements
767	   stated above).

769	   Note that the CPE in the example above is only provisioned with an
770	   IPv6 address on the external interface.

772	    +------------ IPv6-only transport ------------+
773	    | +---------------+ |              |          |
774	    | |A+P-application| |  +--------+  |  +-----+ |   dual-stacked
775	    | | on end-host   |=|==| CPE w/ |==|==| PRR |-|-- network
776	    | +---------------+ |  +--------+  |  +-----+ |   (public addresses)
777	    +---------------+   |  +--------+  |  +-----+ |
778	      private IPv4 <-*--+->| NAT    |  |  | LSN | |
779	      address space   \ |  +--------+  |  +-----+ |
780	      for legacy       +|--------------|----------+
781	        hosts           |              |
782	                        |              |
783	      end-host with     |  CPE device  |  provider
784	        upgraded        |  on customer |  network
785	       application      |   premises   |

787	         An extended A+P subsystem with end-host running A+P-aware
788	                               applications

790	                                 Figure 7

792	   Figure 7 shows an example of how an upgraded application running on a
793	   legacy end-host can connect.  The legacy host is provisioned with a
794	   private IPv4 address allocated by the CPE.  Any packet sent from the
795	   legacy host will be NATed either at the CPE (if configured to do so),
796	   or at the LSN (if available).

798	   An A+P-aware application running on the end-host MAY use the
799	   signaling described in Section 3.1 to connect to the A+P-subsystem.
800	   In this case, the application will be delegated some space in the A+P
801	   address realm, and will be able to contact the external realm (i.e.,
802	   the public Internet) without the need for translation.

804	   Note that part of A+P signaling is that the NATs are optional.
805	   However, if neither the CPE nor the PRR provides NATing
806	   functionality, then it will not be possible to connect legacy end-
807	   hosts.

809	   To enable packet forwarding with A+P, the ISP MUST install at its A+P
810	   border a PRR which encaps/decaps packets.  However, to achieve a
811	   higher address space compression ratio and/or to support CPEs without
812	   NATing functionality, the ISP MAY decide to provide an LSN as well.
813	   If no LSN is installed in some part of the ISP's topology, all CPE in
814	   that part of the topology MUST support NAT functionality.  For
815	   reasons of scalability, it is assumed that the PRR is located within
816	   the access-portion of the network.  The CPE would be configured
817	   automatically (e.g., via an extended DHCP or NAT-PMP that meets the
818	   signaling requirements stated above) with the address of the PRR and
819	   whether an LSN is provided.  Figure 6 illustrates a possible
820	   deployment scenario.

822	4.4.  Dynamic allocation of port ranges

824	   Allocating a fixed number of ports to all CPE may lead to exhaustion
825	   of ports for high usage customers.  This is a perfect recipe for
826	   upsetting more demanding customers.  On the other hand, allocating to
827	   all customers ports sufficient to match the needs of peak users will
828	   not be very efficient.  A mechanism for dynamic allocation of port
829	   ranges allows the ISP to achieve two goals: a more efficient
830	   compression ratio of number of customers on one IPv4 address and, on
831	   the other hand, not limiting the more demanding customers'
832	   communication.

834	   Additional allocation of ports, or port ranges may be made after an
835	   initial static allocation of ports.

837	   The following mechanism applies to NAT functionality in the CPE only.
838	   If a customer has an arrangement with the ISP for well-known ports
839	   (WKP) and the PRR allocates a WKP range to this CPE, the range may be
840	   used for end-to-end communication with a server behind the CPE that
841	   has a public IP address or, if the customer so configures, for
842	   inbound NAT (1:1 or port forwarding).  This function has a fixed
843	   range of ports and is not considered in the dynamic pool allocation
844	   mechanism.  If, on the other hand, the customer configures the NAT
845	   function to access the Internet from a private address pool behind
846	   the CPE, the mechanism is applied automatically.  The NAT already
847	   keeps track of its translation tables, so the CPE manufacturer only
848	   needs to implement a small "daemon" that tracks the allocated port
849	   ranges and how many ports are in use.  At 90% usage, the dynamic
850	   allocation daemon could signal to the PRR the need for additional
851	   ports.  A downside of this mechanism is that the port allocation to a
852	   CPE might grow quite large unless an additional mechanism returns
853	   unused port ranges to the PRR's pool.  This may be dealt with by
854	   requiring the NAT to allocate ports for translation sequentially and
855	   to reuse released ports for new requests, so that port use stays
856	   compact and unfragmented ranges may be returned to the pool.
857	   Another, less elegant, way is to reset the additional allocations to
858	   0 every 24 hours, leaving only the first allocation.  The mechanism
859	   would request additional allocations again within a very short time,
860	   so the customer is unlikely to notice the event.
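
   A minimal sketch of the bookkeeping such a CPE daemon might perform
   is given below.  The function names are hypothetical, and the way
   usage is counted (ports in use over ports allocated, across all
   ranges held by the CPE) is an assumption; only the 90% figure and
   the rule of keeping the first allocation come from the text above.

```python
from typing import List, Set


def needs_more_ports(in_use: int, allocated: int, threshold_pct: int = 90) -> bool:
    """True once usage reaches the threshold (integer math, no rounding)."""
    return allocated > 0 and in_use * 100 >= threshold_pct * allocated


def returnable_ranges(ranges: List[range], in_use: Set[int]) -> List[range]:
    """Additional ranges (all but the first) in which no port is in use.

    With sequential allocation, idle ranges tend to be the most recently
    added ones, so they can be handed back to the PRR's pool whole.
    """
    return [r for r in ranges[1:] if not any(p in in_use for p in r)]


# Example with assumed port numbers: one initial and one additional range.
held = [range(1024, 1152), range(1152, 1280)]
busy = set(range(1024, 1040))
print(needs_more_ports(116, 128))        # True  (116/128 is above 90%)
print(returnable_ranges(held, busy))     # [range(1152, 1280)]
```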

862	   The mechanism would prefer allocations of port ranges from the same
863	   IP address as the initial allocation.  If it is not possible to
864	   allocate an additional port range from the same IP, the mechanism
865	   can allocate a port range from another IP within the same subnet.
866	   With every additional port range allocation, the PRR updates its
867	   routing table.  The mechanism for allocating additional port ranges
868	   may be part of the normal signaling that is used to authenticate the
869	   CPE to the ISP.

871	   The ISP controls the dynamic allocation of port ranges by the PRR by
872	   setting the initial allocation size and maximum number of allocations
873	   per CPE, or the maximum allocations per subscription, depending on
874	   subscription level.  There is a general observation that the more
875	   demanding customer uses around 1024 ports when heavily communicating.
876	   So, for example, a first suggestion might be 128 ports initially and
877	   then dynamic allocations of ranges of 128 ports up to 511 more
878	   allocations maximum.  A configured maximum number of allocations
879	   could be used to prevent a customer from acting destructively
880	   should they become infected.  The maximum number of allocations might
881	   also be more finely grained, with parameters of how many allocations
882	   a user may request per some time frame.  If this is used, evasive
883	   applications may need to be limited in their bad behavior; for
884	   example, one additional allocation per minute would considerably slow
885	   a port request storm.
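
   The arithmetic behind the suggested values, and one possible form of
   the per-minute limit, can be sketched as follows.  The class name
   AllocationLimiter and its interface are hypothetical.  Note that 128
   initial ports plus 511 further ranges of 128 ports add up to the
   whole 16-bit port space of one address, whereas the 1024 ports
   observed for a demanding customer correspond to the initial range
   plus 7 additional ones.

```python
from typing import Optional

INITIAL_PORTS = 128
RANGE_SIZE = 128
MAX_EXTRA_ALLOCATIONS = 511

max_ports = INITIAL_PORTS + MAX_EXTRA_ALLOCATIONS * RANGE_SIZE
print(max_ports)   # 65536


class AllocationLimiter:
    """Caps the number of extra ranges and the rate at which they are granted."""

    def __init__(self, max_extra: int, min_interval_s: float = 60.0) -> None:
        self.max_extra = max_extra
        self.min_interval_s = min_interval_s
        self.granted = 0
        self.last_grant: Optional[float] = None

    def request(self, now: float) -> bool:
        """Return True if one more range may be granted at time `now` (seconds)."""
        if self.granted >= self.max_extra:
            return False
        if self.last_grant is not None and now - self.last_grant < self.min_interval_s:
            return False
        self.granted += 1
        self.last_grant = now
        return True
```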

887	   There is likely no minimum request size.  This is because A+P-aware
888	   applications running on end-hosts MAY request a single port (or a few
889	   ports) for the CPE to be contacted on (e.g., VoIP clients register a
890	   public IP and a single delegated port from the CPE, and accept
891	   incoming calls on that port).  The implementation on the CPE or PRR
892	   will dictate how to handle such requests for smaller blocks: For
893	   example, half of available blocks might be used for "block-
894	   allocations", 1/6 for single port requests, and the rest for NATing.
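
   As a worked example of the split mentioned above, the following
   sketch divides a number of blocks into the three uses.  The function
   name and the rounding rule (integer division, remainder to NATing)
   are assumptions; with these fractions roughly one third is left for
   NATing.

```python
def partition_blocks(total_blocks: int) -> dict:
    """Split blocks: 1/2 block allocations, 1/6 single ports, rest NATing."""
    block = total_blocks // 2
    single = total_blocks // 6
    nat = total_blocks - block - single
    return {"block": block, "single_port": single, "nat": nat}


print(partition_blocks(12))   # {'block': 6, 'single_port': 2, 'nat': 4}
```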

896	   Another possible mechanism to allocate additional ports is UPnP/
897	   NAT-PMP (as defined in Section 3.1), if applications behind the CPE
898	   support it.  In the case of the LSN implementation (DS-Lite), as
899	   described in the A+P overall architecture section, signaling packets
900	   are simply forwarded by the CPE to the LSN and back to the host
901	   running the requesting application, and the PRR allocates the
902	   requested port to the appropriate CPE.  The same behavior may be
903	   chosen with AFTR if the requested ports are outside the static
904	   initial port allocation.  If a full A+P implementation is selected,
905	   then UPnPv2/NAT-PMP packets are accepted and processed by the CPE,
906	   and the requested port number is communicated through normal
907	   signaling between the CPE and PRR tunnel endpoints (DHCP or IPCP).

909	4.5.  Overall A+P architecture

911	                           A+P architecture

913	         IPv4         Full-A+P          AFTR             CGN
914	          |              |               |                |
915	   <-- Full IPv4 ---- Port range ---- Port range  ---- Provider --->
916	       allocated      & dynamic         & LSN          NAT ONLY
917	                      allocation      (NAT on CPE      (No mechanism
918	       (no NAT)      (NAT on CPE)     and on LSN)      for customer to
919	                                                       bypass CGN)

921	                    Figure 8: A+P overall architecture

923	   The A+P architecture defines various options to be deployed within an
924	   ISP.  Figure 8 shows the spectrum of deployment options.  On the far
925	   left is today's status quo: an unrestricted IPv4 address with full
926	   port range.  Full-A+P refers to a port-range allocation from the ISP.
927	   The customer must operate A+P-aware devices, and no NATing
928	   functionality is provided by the ISP.  AFTR, such as DS-Lite
929	   [I-D.ietf-softwire-dual-stack-lite], is a hybrid.  There is NAT
930	   present in the core (in this draft referred to as LSN), but the user
931	   has the option to "bypass" that NAT in one form or another, for
932	   example via A+P or NAT-PMP.  Finally, a provider-only CGN places a
933	   NAT in the provider's core and does not allow the customer to
934	   "bypass" the translation process or modify ALGs on the NAT.  The
935	   customer is provider-locked.  Note as well that all options (besides
936	   full IPv4) require some form of tunneling mechanism (e.g., 4in6) and
937	   a signaling mechanism (see Section 3.1).

939	4.6.  Example of A+P-forwarded packets

941	   This section provides a detailed example of A+P setup, configuration,
942	   and packet flow from an end-host behind an A+P upgraded provider to
943	   any host in the IPv4 Internet, and how the return packets flow back.
944	   The following example discusses an A+P-unaware end-host, where the
945	   NATing is done at the CPE.  Figure 9 illustrates how the CPE receives
946	   an IPv4 packet from the end-user device.  We first describe the case
947	   where the CPE has been configured to provide the NAT functionality
948	   (e.g., by the customer via interaction via a website, or via
949	   automatic signaling).  In the following, we call a packet which is
950	   translated at the CPE an A+P-forwarded packet, by analogy with the
951	   port-forwarding function employed in today's CPEs.  Upon receiving a
952	   packet from the internal interface, the CPE NATs it and forwards it
953	   to the PRR.  The NAT on the CPE is assumed to store the 5-tuple
954	   (source_IPv4, source_port, destination_IPv4, destination_port,
955	   tunnel-interface).

957	   When the PRR receives the A+P-forwarded packet, it decapsulates the
958	   inner IPv4 packet and checks the source address and port.  If the
959	   source address and port match the CPE's A+P address, then the PRR
960	   simply forwards the decapsulated packet onward.  This is always the
961	   case for A+P-forwarded packets.  Otherwise, the PRR assumes that the
962	   packet is not A+P-forwarded and passes it to the LSN function, which
963	   in turn NATs the packet and then releases it into the Internet.
964	   Figure 9 shows the packet flow for an outgoing A+P-forwarded packet.

966	                   +-----------+
967	                   |    Host   |
968	                   +-----+-----+
969	                      |  |  10.0.0.2
970	      IPv4 datagram 1 |  |
971	                      |  |
972	                      v  |  10.0.0.1
973	               +---------|---------+
974	               |CPE      |         |
975	               +--------|||--------+
976	                      | |||     a::2
977	                      | ||| 12.0.0.3 (100-200)
978	       IPv6 datagram 2| |||
979	                      | |||<-IPv4-in-IPv6
980	                      | |||
981	                 -----|-|||-------
982	               /      | |||        \
983	              |  ISP access network |
984	               \      | |||        /
985	                 -----|-|||-------
986	                      | |||
987	                      v |||     a::1
988	               +--------|||--------+
989	               |PRR     |||        |
990	               +---------|---------+
991	                      |  |  12.0.0.1
992	      IPv4 datagram 3 |  |
993	                 -----|--|--------
994	               /      |  |         \
995	              |   ISP network /     |
996	               \      Internet     /
997	                 -----|--|--------
998	                      |  |
999	                      v  | 128.0.0.1
1000	                   +-----+-----+
1001	                   | IPv4 Host |
1002	                   +-----------+

1004	          Figure 9: Forwarding of Outgoing A+P-forwarded Packets

1006	     +-----------------+--------------+-----------------------------+
1007	     |        Datagram | Header field | Contents                    |
1008	     +-----------------+--------------+-----------------------------+
1009	     | IPv4 datagram 1 |     IPv4 Dst | 128.0.0.1                   |
1010	     |                 |     IPv4 Src | 10.0.0.2                    |
1011	     |                 |      TCP Dst | 80                          |
1012	     |                 |      TCP Src | 8000                        |
1013	     | --------------- | ------------ | --------------------------- |
1014	     | IPv6 datagram 2 |     IPv6 Dst | a::1                        |
1015	     |                 |     IPv6 Src | a::2                        |
1016	     |                 |     IPv4 Dst | 128.0.0.1                   |
1017	     |                 |     IPv4 Src | 12.0.0.3                    |
1018	     |                 |      TCP Dst | 80                          |
1019	     |                 |      TCP Src | 100                         |
1020	     | --------------- | ------------ | --------------------------- |
1021	     | IPv4 datagram 3 |     IPv4 Dst | 128.0.0.1                   |
1022	     |                 |     IPv4 Src | 12.0.0.3                    |
1023	     |                 |      TCP Dst | 80                          |
1024	     |                 |      TCP Src | 100                         |
1025	     +-----------------+--------------+-----------------------------+

1027	                         Datagram header contents
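
   The outgoing flow of Figure 9 can be reproduced with a short Python
   sketch.  The type and function names are hypothetical, and picking
   the lowest free port of the delegated range is an assumption made so
   that the result matches the table above; addresses and ports are
   those of the figure.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class IPv4Packet:
    src: str
    sport: int
    dst: str
    dport: int


@dataclass(frozen=True)
class TunnelFrame:
    src6: str
    dst6: str
    inner: IPv4Packet


CPE_ADDR, CPE_PORTS = "12.0.0.3", range(100, 201)   # 12.0.0.3 (100-200)
CPE_V6, PRR_V6 = "a::2", "a::1"


def cpe_outbound(pkt: IPv4Packet, nat_table: Dict[int, Tuple[str, int]]) -> TunnelFrame:
    """NAT into the delegated range, then encapsulate IPv4-in-IPv6."""
    # Raises StopIteration if the delegated range is exhausted.
    port = next(p for p in CPE_PORTS if p not in nat_table)
    nat_table[port] = (pkt.src, pkt.sport)
    return TunnelFrame(CPE_V6, PRR_V6, replace(pkt, src=CPE_ADDR, sport=port))


def prr_outbound(frame: TunnelFrame) -> Tuple[str, IPv4Packet]:
    """Decapsulate and check whether the packet is A+P-forwarded."""
    inner = frame.inner
    if inner.src == CPE_ADDR and inner.sport in CPE_PORTS:
        return "forward", inner
    return "lsn", inner


nat: Dict[int, Tuple[str, int]] = {}
datagram1 = IPv4Packet("10.0.0.2", 8000, "128.0.0.1", 80)
datagram2 = cpe_outbound(datagram1, nat)
action, datagram3 = prr_outbound(datagram2)
```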

1029	   An incoming packet undergoes the reverse process.  When the PRR
1030	   receives an IPv4 packet on an external interface, it first checks
1031	   whether the destination port number falls in a delegated range or
1032	   not.  If the address space was delegated, the PRR encapsulates the
1033	   incoming packet and forwards it through the appropriate tunnel for
1034	   that IP/port range.  If the address space was not delegated, the
1035	   packet is handed to the LSN to check whether a mapping is available.

1037	   Figure 10 shows how an incoming packet is forwarded, under the
1038	   assumption that the port number matches the port range which was
1039	   delegated to the CPE.

1041	                   +-----------+
1042	                   |    Host   |
1043	                   +-----+-----+
1044	                      ^  |  10.0.0.2
1045	      IPv4 datagram 3 |  |
1046	                      |  |
1047	                      |  |  10.0.0.1
1048	               +---------|---------+
1049	               |CPE      |         |
1050	               +--------|||--------+
1051	                      ^ |||     a::2
1052	                      | ||| 12.0.0.3 (100-200)
1053	       IPv6 datagram 2| |||
1054	                      | |||<-IPv4-in-IPv6
1055	                      | |||
1056	                 -----|-|||-------
1057	               /      | |||        \
1058	              | ISP access network  |
1059	               \      | |||        /
1060	                 -----|-|||-------
1061	                      | |||
1062	                      | |||     a::1
1063	               +--------|||--------+
1064	               |PRR     |||        |
1065	               +---------|---------+
1066	                      ^  |  12.0.0.1
1067	      IPv4 datagram 1 |  |
1068	                 -----|--|--------
1069	               /      |  |         \
1070	              |  ISP network /      |
1071	               \      Internet     /
1072	                 -----|--|--------
1073	                      |  |
1074	                      |  | 128.0.0.1
1075	                   +-----+-----+
1076	                   | IPv4 Host |
1077	                   +-----------+

1079	          Figure 10: Forwarding of Incoming A+P-forwarded Packets

1081	     +-----------------+--------------+-----------------------------+
1082	     |        Datagram | Header field | Contents                    |
1083	     +-----------------+--------------+-----------------------------+
1084	     | IPv4 datagram 1 |     IPv4 Dst | 12.0.0.3                    |
1085	     |                 |     IPv4 Src | 128.0.0.1                   |
1086	     |                 |      TCP Dst | 100                         |
1087	     |                 |      TCP Src | 80                          |
1088	     | --------------- | ------------ | --------------------------- |
1089	     | IPv6 datagram 2 |     IPv6 Dst | a::2                        |
1090	     |                 |     IPv6 Src | a::1                        |
1091	     |                 |     IPv4 Dst | 12.0.0.3                    |
1092	     |                 |     IPv4 Src | 128.0.0.1                   |
1093	     |                 |      TCP Dst | 100                         |
1094	     |                 |      TCP Src | 80                          |
1095	     | --------------- | ------------ | --------------------------- |
1096	     | IPv4 datagram 3 |     IPv4 Dst | 10.0.0.2                    |
1097	     |                 |     IPv4 Src | 128.0.0.1                   |
1098	     |                 |      TCP Dst | 8000                        |
1099	     |                 |      TCP Src | 80                          |
1100	     +-----------------+--------------+-----------------------------+

1102	                         Datagram header contents

1104	   Note that datagram 1 travels untranslated up to the CPE; thus the
1105	   customer has the same control over the translation as today, when
1106	   s/he has a home gateway with customizable port-forwarding.
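
   The rewrite from the inner header of datagram 2 to datagram 3 can be
   sketched as a table lookup in the CPE.  This is only an illustration
   under assumptions: the names Ipv4Tcp, PORT_FORWARDS and
   forward_inbound are hypothetical, and the single rule is the one
   implied by the table above (12.0.0.3 port 100 mapped to 10.0.0.2
   port 8000).

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Ipv4Tcp:
    """Only the header fields listed in the datagram table."""
    src: str
    sport: int
    dst: str
    dport: int


# Hypothetical customer-configured rules:
# (public address, public port) -> (inside address, inside port)
PORT_FORWARDS = {("12.0.0.3", 100): ("10.0.0.2", 8000)}


def forward_inbound(inner: Ipv4Tcp) -> Optional[Ipv4Tcp]:
    """Rewrite the destination of a decapsulated datagram.

    Returns None when no rule matches; what happens then is left to
    the caller, since it is a policy decision of the CPE.
    """
    target = PORT_FORWARDS.get((inner.dst, inner.dport))
    if target is None:
        return None
    return replace(inner, dst=target[0], dport=target[1])


# Inner IPv4 header of datagram 2 (identical to datagram 1).
datagram1 = Ipv4Tcp(src="128.0.0.1", sport=80, dst="12.0.0.3", dport=100)
datagram3 = forward_inbound(datagram1)
```

   Only the destination address and port change; the source fields are
   carried through untouched, as in the table.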

1108	4.7.  Forwarding of standard packets

1110	   Packets for which the CPE does not have a corresponding port
1111	   forwarding rule are tunneled to the PRR which provides the LSN
1112	   function.  We underline that the LSN MUST NOT use the delegated space
1113	   for NATting.  See [I-D.ietf-softwire-dual-stack-lite] for network
1114	   diagrams which illustrate the packet flow in this case.
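
   The MUST NOT above amounts to a disjointness check between the port
   pool the LSN draws from and the ranges delegated to customers.  A
   minimal sketch follows, assuming that both are expressed as inclusive
   port ranges on one shared IPv4 address; the function names and the
   ranges used with them are hypothetical.

```python
from typing import Iterable, Tuple

PortRange = Tuple[int, int]  # inclusive (low, high)


def overlaps(a: PortRange, b: PortRange) -> bool:
    """True when two inclusive ranges share at least one port."""
    return a[0] <= b[1] and b[0] <= a[1]


def lsn_pool_is_valid(lsn_pool: Iterable[PortRange],
                      delegated: Iterable[PortRange]) -> bool:
    """True when no LSN pool range touches any delegated range."""
    delegated = list(delegated)
    return not any(overlaps(p, d) for p in lsn_pool for d in delegated)
```

   A provisioning system could run such a check whenever a range is
   delegated or the LSN pool is reconfigured.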

1116	4.8.  Handling ICMP

1118	   ICMP is problematic for all NATs, because it lacks port numbers.  A+P
1119	   routing exacerbates the problem.

1121	   Most ICMP messages fall into one of two categories: error reports, or
1122	   ECHO/ECHO reply (commonly known as "ping").  For error reports, the
1123	   offending packet header is embedded within the ICMP packet; NAT
1124	   devices can then rewrite that portion and route the packet to the
1125	   actual destination host.  This functionality will remain the same
1126	   with A+P; however, the PRR will need to examine the embedded header
1127	   to extract the port number, while the A+P gateway will do the
1128	   necessary rewriting.
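
   As an illustration of the PRR's part, the sketch below pulls the
   port out of the header embedded in an ICMP error report.  It assumes
   an IPv4 error message whose embedded packet was originally sent by
   the A+P host, so that the embedded source port is the one inside the
   delegated range.  The function name is hypothetical, and only TCP
   and UDP are handled.

```python
import struct

ICMP_HEADER_LEN = 8       # type, code, checksum, 4 unused bytes
MIN_IPV4_HEADER_LEN = 20
TCP, UDP = 6, 17          # IPv4 protocol numbers


def embedded_source_port(icmp_message: bytes) -> int:
    """Return the source port of the packet quoted in an ICMP error."""
    embedded = icmp_message[ICMP_HEADER_LEN:]
    if len(embedded) < MIN_IPV4_HEADER_LEN:
        raise ValueError("embedded IPv4 header is truncated")
    header_len = (embedded[0] & 0x0F) * 4   # IHL counts 32-bit words
    if embedded[9] not in (TCP, UDP):
        raise ValueError("embedded packet carries no port numbers")
    if len(embedded) < header_len + 4:
        raise ValueError("embedded transport header is truncated")
    # TCP and UDP both start with source port, then destination port.
    sport, _dport = struct.unpack_from("!HH", embedded, header_len)
    return sport
```

   With the port in hand, the PRR can route the error report exactly as
   it would a TCP or UDP packet addressed to that port.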

1130	   ECHO and ECHO reply are more problematic.  For ECHO, the A+P gateway
1131	   device must rewrite the "Identifier" and perhaps "Sequence Number"
1132	   fields in the ICMP request, treating them as if they were port
1133	   numbers.  This way, the PRR can build the correct A+P address for the
1134	   returning ECHO replies, so they can be correctly routed back to the
1135	   appropriate host in the same way as TCP/UDP packets.  (Pings
1136	   originating from an external domain/legacy Internet towards an A+P
1137	   device are not supported.)
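
   One possible way for an A+P gateway to treat the Identifier as a
   port number is sketched below.  The class and method names are
   hypothetical, the port range is an assumed example, and entry
   expiry is omitted for brevity.

```python
from typing import Dict, Optional, Tuple

Flow = Tuple[str, int]  # (inside host address, original Identifier)


class EchoIdMapper:
    """Maps ICMP ECHO Identifiers into a delegated port range."""

    def __init__(self, low: int, high: int) -> None:
        # Reversed so that pop() hands out the lowest value first.
        self._free = list(range(high, low - 1, -1))
        self._out: Dict[Flow, int] = {}
        self._back: Dict[int, Flow] = {}

    def outbound(self, host: str, ident: int) -> int:
        """Return the Identifier to place in the outgoing ECHO."""
        key = (host, ident)
        if key not in self._out:
            if not self._free:
                raise RuntimeError("no free Identifier in the range")
            new_id = self._free.pop()
            self._out[key] = new_id
            self._back[new_id] = key
        return self._out[key]

    def inbound(self, ident: int) -> Optional[Flow]:
        """Return the inside host and original Identifier for an
        ECHO reply, or None when the Identifier is unknown."""
        return self._back.get(ident)
```

   Because the rewritten Identifier lies inside the delegated range,
   the PRR can route the ECHO reply using the same lookup it applies to
   TCP and UDP destination ports.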

1139	4.9.  Limitations of the A+P approach

1141	   One limitation that A+P shares with any other IP address-sharing
1142	   mechanism is the availability of well-known ports.  In fact, services
1143	   run by customers that share the same IP address will be distinguished
1144	   by the port number.  As a consequence, it will be impossible for two
1145	   customers who share the same IP address to run services on the same
1146	   port (e.g., port 80).  Unfortunately, working around this limitation
1147	   usually implies application-specific hacks (e.g., HTTP and HTTPS
1148	   redirection), discussion of which is out of the scope of this
1149	   document.  Of course, a provider might charge more for giving a
1150	   customer the well-known port range, 0..1023, thus allowing the
1151	   customer to provide externally available services.  Many applications
1152	   require the availability of well-known ports.  Those applications
1153	   are not expected to work in an A+P environment unless they can adapt
1154	   to work with different ports.  However, such applications do not
1155	   work behind today's NATs either.

1157	   Another problem which is common to all NATs is coexistence with
1158	   IPsec.  In fact, a NAT which also translates port numbers prevents AH
1159	   and ESP from functioning properly, both in tunnel and in transport
1160	   mode.  In this respect, we stress that, since an A+P subsystem
1161	   exhibits the same external behavior as a NAT, well-known workarounds
1162	   (such as [RFC3715]) can be employed.

1164	5.  IANA Considerations

1166	   This document makes no request of IANA.

1168	   Note to RFC Editor: this section may be removed on publication as an
1169	   RFC.

1171	6.  Security Considerations

1173	   The primary security issue any time a NAT is mentioned is the
1174	   implicit firewall provided by a NAT.  Any proposal to eliminate NATs
1175	   raises the spectre of insecure hosts lying naked before a hostile
1176	   Internet.  For a number of reasons, we do not think this is a serious
1177	   issue here.  If nothing else, NATs are not really security devices;
1178	   their protective value is limited.

1180	   A NAT owned by a customer, whether a home consumer or a large
1181	   enterprise, is under the control of that customer.  All machines on
1182	   the customer's side of the NAT have unfettered access to other
1183	   machines on the same side; generally, this is what is desired.  A+P
1184	   NATs do not change this, as the customer still controls what is
1185	   being NATed.  LSN does not change the access property, either.
1186	   However, with a CGN without A+P there are *many* machines on the
1187	   inside of the translation, not all of which are in the customer's
1188	   administrative domain.  Unless other firewall mechanisms are
1189	   employed, LSNs create added risk of unauthorized access.

1191	   By contrast, the protection scope of an A+P NAT is, by definition, at
1192	   the boundary to the customer network.  The access properties are thus
1193	   precisely what traditional NATs have provided.

1195	   There is one notable exception to this point.  Inbound packets
1196	   addressed to the assigned port number range are passed through
1197	   unchanged, even if no outbound packets were sent to the originator.
1198	   While this allows customers to run their own servers on certain
1199	   ports, it also allows attackers to probe these servers without the
1200	   protection provided today by provider-supplied NAT boxes.  The issue
1201	   is not that internal machines are addressable -- that is an
1202	   inevitable corollary to servers being run -- but that it may
1203	   represent a change from today's behavior.  Furthermore, the effect on
1204	   the customer varies greatly, depending on what port number range they
1205	   are assigned; someone who is assigned 0-4K derives more benefit and
1206	   runs more risk than someone who is assigned 48K-52K, since the latter
1207	   is in the IANA-assigned dynamic port range.

1209	   A useful middle ground would be provision of a customer-controllable
1210	   switch in the CPE to control what happens to such packets.  If
1211	   filtering is to be done, state must be kept, which might be costly.
1212	   This suggests that perhaps it should only be done in the CPE if it is
1213	   replacing current CPE that provides NAT functionality.  If
1214	   applications on end-hosts installed A+P gateways, they might open up
1215	   ports untranslated.

1217	   Note that, regardless of the existence of such an option, the A+P
1218	   gateway will need customer-controllable port number-mapping
1219	   capability, as most customers will not be assigned a range which
1220	   corresponds to the servers they wish to run.

1222	   With CGN/LSNs, tracing hackers, spammers and other criminals will be
1223	   extremely difficult, requiring logging, recording, and storing of all
1224	   connection-based mapping information.  The need for storage implies a
1225	   tradeoff.  On one hand, the LSNs can manage addresses and ports as
1226	   dynamically as possible, in order to maximize aggregation.  On the
1227	   other hand, the more quickly the mapping between private and public
1228	   space changes, the more information needs to be recorded.  This would
1229	   not only cause concern for law enforcement services, but also for
1230	   privacy advocates.

1232	   A+P offers a better set of tradeoffs.  All that needs to be logged is
1233	   the allocation of a range of port numbers to a customer.  By design,
1234	   this will be done rarely, improving scalability.  If the NAT
1235	   functionality is moved further up the tree, the logging requirement
1236	   will be as well, increasing the load on one node, but giving it more
1237	   resources to allocate to a busy customer, perhaps decreasing the
1238	   frequency of allocation requests.
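
   To illustrate how little needs to be recorded, the sketch below
   models an allocation log and the lookup used to trace an observed
   address, port and time back to a customer.  The record layout, the
   field names and the function find_customer are assumptions made for
   this example only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Allocation:
    """One log record: a port range delegated to a customer."""
    customer: str
    address: str
    port_low: int    # inclusive
    port_high: int   # inclusive
    start: float     # seconds since the epoch, inclusive
    end: float       # seconds since the epoch, exclusive


def find_customer(log: Iterable[Allocation], address: str,
                  port: int, when: float) -> Optional[str]:
    """Return the customer holding address and port at time 'when'."""
    for entry in log:
        if (entry.address == address
                and entry.port_low <= port <= entry.port_high
                and entry.start <= when < entry.end):
            return entry.customer
    return None
```

   One record per allocation suffices, whereas a per-connection log
   needs one record for every mapping the translator creates.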

1240	   The other extreme is A+P NAT on the customer premises.  Such a node
1241	   would be no different than today's NAT boxes, which do no such
1242	   logging.  We thus conclude that A+P is no worse than today's
1243	   situation, while being considerably better than CGNs.

1245	7.  Authors

1247	   This document has 8 primary authors, more than are allowed in the
1248	   header of an Internet-Draft.  This is the list of actual authors of
1249	   this document.

1251	      Gabor Bajko
1252	      Nokia
1253	      Email: gabor(dot)bajko(at)nokia(dot)com

1255	      Steven M. Bellovin
1256	      Columbia University
1257	      1214 Amsterdam Avenue
1258	      MC 0401
1259	      New York, NY  10027
1260	      US
1261	      Phone: +1 212 939 7149
1262	      Email: bellovin@acm.org

1264	      Randy Bush
1265	      Internet Initiative Japan
1266	      5147 Crystal Springs
1267	      Bainbridge Island, Washington  98110
1268	      US
1269	      Phone: +1 206 780 0431 x1
1270	      Email: randy@psg.com

1271	      Luca Cittadini
1272	      Universita' Roma Tre
1273	      via della Vasca Navale, 79
1274	      Rome,   00146
1275	      Italy
1276	      Phone: +39 06 5733 3215
1277	      Email: luca.cittadini@gmail.com

1279	      Alain Durand
1280	      Comcast
1281	      1 Comcast Center
1282	      Philadelphia, PA
1283	      US
1284	      Email: alain_durand@cable.comcast.com

1286	      Olaf Maennel
1287	      Loughborough University
1288	      Department of Computer Science - N.2.03
1289	      Loughborough
1290	      United Kingdom
1291	      Phone: +44 115 714 0042
1292	      Email: o@maennel.net

1294	      Teemu Savolainen
1295	      Nokia
1296	      Hermiankatu 12 D
1297	      TAMPERE, FI-33720
1298	      Finland
1299	      Email: teemu.savolainen@nokia.com

1301	      Jan Zorz
1302	      go6.si
1303	      Frankovo naselje 165
1304	      Skofja Loka  4220
1305	      Slovenia
1306	      Phone: +38659042000
1307	      Email: jan@go6.si

1309	8.  Acknowledgments

1311	   The authors wish to especially thank Remi Despres and Pierre Levis
1312	   for their help on the development of the A+P approach.  We also thank
1313	   David Ward for review, constructive criticism, and interminable
1314	   questions, and Dave Thaler for useful criticism on "stackable" A+P
1315	   gateways.  We would also like to thank the following persons for
1316	   their feedback on earlier versions of this work: Rob Austein, Gert
1317	   Doering, Dino Farinacci, Russ Housley, and Ruediger Volk.

1319	9.  References

1321	9.1.  Normative References

1323	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1324	              Requirement Levels", BCP 14, RFC 2119, March 1997.

1326	9.2.  Informative References

1328	   [BCP38]    Ferguson, P. and D. Senie, "Network Ingress Filtering:
1329	              Defeating Denial of Service Attacks which employ IP Source
1330	              Address Spoofing", BCP 38, RFC 2827, May 2000.

1332	   [I-D.bajko-pripaddrassign]
1333	              Bajko, G., Savolainen, T., Boucadair, M., and P. Levis,
1334	              "Port Restricted IP Address Assignment",
1335	              draft-bajko-pripaddrassign-01 (work in progress),
1336	              March 2009.

1338	   [I-D.boucadair-dhcpv6-shared-address-option]
1339	              Boucadair, M., Levis, P., Grimault, J., Savolainen, T.,
1340	              and G. Bajko, "Dynamic Host Configuration Protocol
1341	              (DHCPv6) Options for Shared IP Addresses Solutions",
1342	              draft-boucadair-dhcpv6-shared-address-option-00 (work in
1343	              progress), May 2009.

1345	   [I-D.boucadair-port-range]
1346	              Boucadair, M., Levis, P., Bajko, G., and T. Savolainen,
1347	              "IPv4 Connectivity Access in the Context of IPv4 Address
1348	              Exhaustion: Port Range based IP Architecture",
1349	              draft-boucadair-port-range-02 (work in progress),
1350	              July 2009.

1352	   [I-D.boucadair-pppext-portrange-option]
1353	              Boucadair, M., Levis, P., Grimault, J., and A.
1354	              Villefranque, "Port Range Configuration Options for PPP
1355	              IPCP", draft-boucadair-pppext-portrange-option-01 (work in
1356	              progress), July 2009.

1358	   [I-D.ietf-softwire-dual-stack-lite]
1359	              Durand, A., Droms, R., Haberman, B., Woodyatt, J., Lee,
1360	              Y., and R. Bush, "Dual-stack lite broadband deployments
1361	              post IPv4 exhaustion",
1362	              draft-ietf-softwire-dual-stack-lite-01 (work in progress),
1363	              July 2009.

1365	   [RFC1918]  Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and
1366	              E. Lear, "Address Allocation for Private Internets",
1367	              BCP 5, RFC 1918, February 1996.

1369	   [RFC3715]  Aboba, B. and W. Dixon, "IPsec-Network Address Translation
1370	              (NAT) Compatibility Requirements", RFC 3715, March 2004.

1372	Author's Address

1374	   Randy Bush (editor)
1375	   Internet Initiative Japan
1376	   5147 Crystal Springs
1377	   Bainbridge Island, Washington  98110
1378	   US

1380	   Phone: +1 206 780 0431 x1
1381	   Email: randy@psg.com