[rrg] Paul Jakma's blog article

Robin Whittle <rw@firstpr.com.au> Sat, 13 February 2010 03:08 UTC

Message-ID: <4B7617EB.5090800@firstpr.com.au>
Date: Sat, 13 Feb 2010 14:09:31 +1100
From: Robin Whittle <rw@firstpr.com.au>
Organization: First Principles
User-Agent: Thunderbird 2.0.0.23 (Windows/20090812)
MIME-Version: 1.0
To: RRG <rrg@irtf.org>
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: 8bit
Subject: [rrg] Paul Jakma's blog article
Precedence: list

Short version:    Paul's article fails to properly distinguish
                  between CES and CEE architectures.  The article
                  also implies that scalable routing solutions are
                  in general based on "Locator/Identifier Separation"
                  - but this is only true of CEE architectures.

                  His suggestion of using IPv4 header options to
                  extend TCP port numbers to 32 bits, for the
                  purpose of making a NAT box better able to
                  support multiple hosts behind NAT, is
                  unfortunately impractical.  This is because DFZ
                  routers, and routers in general, handle packets
                  with IPv4 header options on the "slow path".

Hi Paul,

On your page:

   http://pjakma.wordpress.com/2010/02/12/making-the-internet-scale-through-nat/

there are a number of statements which I think repeat mistakes made
by others:


PJ>    The IRTF’s RRG has been discussing various solutions.
PJ>    The IETF have a LISP WG developing one particular solution,
PJ>    which is quite advanced.

The LISP WG is to develop LISP with the ALT global query-server based
mapping system as an experimental protocol.  Mobility is out of
scope.   After over two years' work, there is no indication
that ALT can scale well enough for mass adoption:

  ALT structure, robustness and the long-path problem
  http://www.ietf.org/mail-archive/web/lisp/current/msg01801.html

According to Noel Chiappa, the LISP team will soon be turning to a
DNS-based global mapping resolution system:

  http://www.ietf.org/mail-archive/web/rrg/current/msg05772.html


PJ>    The essential, common idea is to split an internet address
PJ>    into 2 distinct part, the "locator" and the "ID" of the
PJ>    actual host.

This idea is not common to all RRG proposals.  The
"Locator/Identifier Separation" approach is used by all the Core-Edge
Elimination (CEE) approaches:

   GSE (not an RRG proposal)
   GLI-Split
   ILNP
   Name Based Sockets

and, although I haven't read them yet, I think these two as well:

   Name Overlay (NOL)
   RANGI

"Locator/Identifier Separation" means that separate objects, in
separate namespaces, are used for uniquely Identifying hosts and for
specifying their Location (either exactly, or down to the level of a
particular ISP or end-user network).

This is a complete change from the current two-level naming structure
of IPv4 and IPv6, in which the Identifier and Locator functions are
both played by the one object: an IP address:

     Role             Level           Conventional IP naming model

     Text name  <---- FQDN
     Identifier <---] IP address
     Locator    <---]

Despite LISP being an acronym for "Locator/Identifier Separation
Protocol", LISP is not a CEE architecture and does not implement
"Locator/Identifier Separation".

LISP is a Core-Edge Separation (CES) architecture.  These are the
CES architectures.  Those in brackets are not RRG proposals and are
no longer being developed:

   (map-and-encaps)
   (LISP-NERD)
    LISP-ALT
   (APT)
    Ivip
   (TRRP)
   (Six/One Router)
    TIDR
    IRON-RANGER

All CES architectures preserve the current 2 level IP naming model
and require no changes to host stacks or applications.  They work
with both IPv4 and IPv6, except for Six/One Router, which is IPv6-only.

CEE architectures are only practical for IPv6 and always require
changes to host stacks.  Most CEE architectures require changes to
applications, but GSE and GLI-Split do not.

Please see these messages for more on the distinction between CES and
CEE architectures, and on the 2008 paper which most popularised these
terms.  I think this is the most important architectural distinction
between most of the scalable routing proposals.  (Not all proposals
are CEE or CES.)

   Today's "IP addr. = ID = Loc" naming model should be retained
   http://www.ietf.org/mail-archive/web/rrg/current/msg05864.html

   CES & CEE are completely different (graphs)
   http://www.ietf.org/mail-archive/web/rrg/current/msg05865.html

   CES & CEE: GLI-Split; GSE, Six/One Router; 2008 sep./elim. paper
   http://www.ietf.org/mail-archive/web/rrg/current/msg06009.html


PJ>    The core routing fabric of the internet then needs only to be
PJ>    concerned with routing to "locator" addresses.

This is true of CEE architectures.  This is because hosts are then
Identified by something different from IP addresses.  So the
host-to-host sessions survive changes in the IP addresses used by the
hosts, and renumbering a network when choosing a new ISP does not
alter the identity of the hosts.  CEE architectures multihome by
giving each host two or more IP addresses, one from each ISP.

Your statement does not apply to CES architectures.  For CES, the DFZ
routers are primarily concerned with the ISP's prefixes, which use
the "core" subset of the global unicast address space, which is what
remains after a new subset of scalable "edge" addresses have been
removed.  These "edge" addresses are supported by the CES mechanisms
and can be used by end-user networks for portability, multihoming,
inbound TE and perhaps mobility - in a highly scalable way, meaning
without much impact on the DFZ control plane or the RIBs and FIBs of
DFZ routers.

IP addresses in both the "core" and "edge" sets function both as
Identifiers and Locators - but ITRs treat packets differently when
their destination address is an "edge" address.  They look up the
mapping and tunnel the packet to an ETR address, rather than taking
the usual approach of forwarding the packet to a neighbouring router.


PJ>    Figuring out how to map onward to the (presumably far more
PJ>    numerous and/or less aggregatable) "ID" of the specific
PJ>    end-host to deliver to is then a question that need only
PJ>    concern a boundary layer between the core of the internet and
PJ>    the end-hosts.  There are a whole bunch of details here
PJ>    (including the thorny question of what exactly the "ID" is in
PJ>    terms of scope and semantics) which we’ll try skip over as
PJ>    much as possible for fear of getting bogged down in them.

It is not an "ID".  Your statement would be true of CES architectures
if "edge address" was used, meaning an "EID" address for LISP or an
SPI (Scalable PI) address for Ivip.

Part of the terminology problem in this field is that LISP is
incorrectly named and its EID term - "Endpoint Identifier" - is a
misnomer too.  The term "EID" is used beyond LISP.  I don't use it in
Ivip.  In LISP, EIDs are "edge" addresses which are just as much
combined Identifiers and Locators as "core" addresses.  It's just that
the ITRs use a different algorithm for getting the packets towards
their destination - one which does not rely on each "EID" prefix
being advertised in the DFZ.

CES architectures generally have a mapping system by which an ITR can
find out, for any given "edge" address it finds in the destination
field of a traffic packet, the "core" address of an ETR to which it
should tunnel this packet.  This has nothing to do with "Identifiers"
and "Locators".
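
The ITR behaviour described above can be sketched roughly as follows.
This is only an illustration, not the mapping protocol of LISP, Ivip
or any other proposal.  The table contents, the function names and the
use of longest-prefix matching are all assumptions, and the addresses
are documentation ranges.

```python
import ipaddress

# Hypothetical mapping database: "edge" prefix -> "core" address of an ETR.
# Every entry here is invented for illustration.
EDGE_MAPPING = {
    ipaddress.ip_network("203.0.113.0/24"): ipaddress.ip_address("192.0.2.1"),
    ipaddress.ip_network("203.0.113.128/25"): ipaddress.ip_address("198.51.100.7"),
}


def lookup_etr(dst):
    """Return the ETR address for the longest matching edge prefix, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in EDGE_MAPPING if addr in net]
    if not matches:
        return None  # a "core" address: there is no mapping for it
    best = max(matches, key=lambda net: net.prefixlen)
    return EDGE_MAPPING[best]


def itr_handle(dst):
    """Decide what an ITR does with a packet whose destination is dst."""
    etr = lookup_etr(dst)
    if etr is None:
        return ("forward", dst)   # ordinary forwarding to a neighbouring router
    return ("tunnel", str(etr))   # tunnel the packet to the ETR's core address
```

Note that nothing in this lookup involves a host Identifier.  Both the
key and the result are ordinary IP addresses.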

CEE architectures usually have mapping systems so a query for a
particular Identifier will return the one or more Locators on which
this host can be reached.
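
By contrast, a CEE-style mapping could be sketched like this.  The
identifier string, the locator addresses and the function names are
hypothetical, and no particular CEE proposal's mapping system is being
modelled.

```python
# Hypothetical CEE-style mapping: one host Identifier -> its current Locators.
# A multihomed host has one locator from each of its ISPs.
LOCATORS = {
    "host-a": ["2001:db8:1::10", "2001:db8:2::10"],
}


def resolve(identifier):
    """Return the locators on which the identified host can be reached."""
    return list(LOCATORS.get(identifier, []))


def renumber(identifier, old, new):
    """Swap one locator, e.g. after changing ISP.  The identifier is untouched."""
    LOCATORS[identifier] = [
        new if loc == old else loc for loc in LOCATORS[identifier]
    ]
```

Calling renumber("host-a", "2001:db8:1::10", "2001:db8:3::10") changes
what resolve("host-a") returns, but the key "host-a" stays the same,
which is why sessions bound to the Identifier can survive renumbering.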

PJ>    We will note that some proposals, in order to be as
PJ>    transparent and invisible to end-host transport protocols
PJ>    as possible, use a "map and encap" approach – effectively
PJ>    tunneling packets over the core of the internet.

These are all CES architectures.  CEE architectures have no need for
tunneling.

PJ>    Some other proposals use a NAT-like approach, like Six-One or
PJ>    GSE, with routers translating from inside to outside.

"Six/One" is an earlier host-based proposal from Christian Vogt -
arguably an improved version of Shim6.  I think you are referring to
"Six/One Router", also from Christian Vogt.  This is a CES
architecture, which uses address rewriting rather than encapsulation
for tunneling traffic packets across the DFZ.  Please see (msg06009)
for why it is a CES architecture.

GSE is an earlier CEE architecture.  ILNP and GLI-Split are inspired
by GSE.  Six/One Router is not related to GSE - it is a CES
architecture.  In (msg06009) I argue that the 2008 paper which did
most to establish the use of the CES and CEE terms mistakenly
classified GSE as a CES proposal.


PJ>    Most proposals come with reasonable amounts of state, some
PJ>    proposals appear to have quite complex architectures and/or
PJ>    control planes.

Yes, but there are big differences in where the complexity is added.

CES architectures do not require any additional complexity, or any
other changes, in host stacks or applications.  They work by adding
some new elements to the routing system - and a mapping system - but
do not require alterations to most routers.

As far as I know, CEE architectures fall into two groups.  One
involves no additional router functions, but requires extensive
changes to host stacks and applications.  The other group has some
extra router functions (address rewriting of incoming and outgoing
packets) and requires stack changes, but no application changes.

These differences, and the fact that CES maintains the current IP
naming system while CEE completely changes it, mean that the
distinctions between these two types of solution are highly
significant and helpful when discussing scalable routing solutions.

Both types of architecture, if successfully implemented and widely
adopted, could in principle solve the routing scaling problem.
However, there are stark differences in the relationships between
benefits and adoption levels (msg05865).

Good CES architectures provide immediate benefits to adopters, by
supporting all their traffic, and provide scaling benefits in
proportion to the adoption rate.

CEE architectures can only provide substantial benefits to adopters,
and real scaling benefits, when all, or almost all, networks adopt
the new architecture.  This means moving everyone to
IPv6 and altering all host stacks to implement the particular CEE
architecture's alteration of IPv6.  Except for GLI-Split, this also
means rewriting essentially *all* applications.  (Still, for
GLI-Split, IPv6 would need to be ubiquitously adopted and many or
most applications would need to be significantly rewritten to work
with IPv6.)


PJ>    E.g. LISP in particular seems quite complex, relative to the
PJ>    "dumb" internet architecture we’re used to, as seems to try
PJ>    to solve every possible IP multi-homing and mobile-IP problem
PJ>    known to man.

LISP and Ivip both require considerable complexity.  This is not
surprising, since we are planning a once in several decades
enhancement for the IPv4 and IPv6 Internets - with the work to be
done in the network, without requiring host changes.

My proposal Ivip includes the TTR Mobility extension, which will work
for both IPv4 and IPv6 - and an alternative to encapsulation.  So it
is pretty ambitious, as I think any architectural upgrade for the
Internet should be.

   http://www.firstpr.com.au/ip/ivip/

Both LISP and Ivip involve significant complexity.  Ivip ITRs and
ETRs will be less complex than LISP ITRs and ETRs, because the
end-user controls the mapping in real-time, relieving the ITRs of the
task of reachability testing through the ETRs to the destination
networks.  So Ivip ITRs do no such testing and make no choices
between ETRs.


PJ>    Some proposals also rely on IPv6 deployment.

This statement applies to all CEE architectures - they are all for
IPv6 and not at all for IPv4.  One fundamental reason for this is
that with CEE, an end-user network with X IP addresses requires X
addresses from each of its ISPs, and at least two ISPs are required
for multihoming.  For IPv4, this would at least double the address
requirements - which is clearly unworkable.
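
The arithmetic can be made explicit with a trivial sketch.  The
function name and the example numbers are only for illustration.

```python
def cee_addresses_needed(hosts, isps):
    """Addresses a CEE-multihomed network draws in total: one per host per ISP."""
    return hosts * isps


# A network with 256 hosts needs 256 addresses from a single ISP ...
single_homed = cee_addresses_needed(256, 1)
# ... and 256 from each of two ISPs when multihomed.
dual_homed = cee_addresses_needed(256, 2)
```

With two ISPs the total is twice the single-homed figure, and it grows
linearly with each further ISP.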

The only CES architecture which only works for IPv6 is Six/One Router
- but this is no longer being developed and apparently required a
change to the IPv6 headers to use a bit for a flag.


You suggest redesigning TCP to have a 32 bit port number - since this
would help with a NAT box with a single IPv4 address being able to
handle traffic for more hosts behind NAT.  However, any such new TCP
protocol would only work between two hosts which had upgraded stacks
and suitably upgraded applications too.  I think upgrading
applications is the hardest task of all.

You mention options for the IP header.  As far as I know, IPv6
extension headers can be used.  I understand most or all routers
handle such packets without any fuss - and I guess host stacks which
did not recognise them would ignore them.

Unfortunately the same is not true of IPv4 header options.  One of
the RRG proposals (hIPv4) relied on these, but DFZ routers handle
such packets on the "slow path", making this entirely impractical:

  End-to-end measurements on performance penalties of IPv4 options
  Fransson, P.; Jonsson, A.
  Global Telecommunications Conference, 2004. GLOBECOM '04. IEEE
  Volume 3, 29 Nov.-3 Dec. 2004, Page(s): 1441 - 1447
  10.1109/GLOCOM.2004.1378221
  http://www.sm.luth.se/csee/csn/publications/end_to_end_measurements.pdf

According to this paper, most routers process such packets on the
"slow path", in software.
The last sentence in the abstract is:

  From the analysis it can be concluded that there is a slight
  increase in delay and jitter and a severe increase in loss rate.
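
To make the mechanics concrete, here is a rough sketch of an IPv4
header carrying an option that holds the upper 16 bits of two extended
port numbers.  The option type value, the option layout and the
function names are all invented for illustration and are not taken
from Paul's article or from any standard.  The only fact relied on is
that an IPv4 header with options has an IHL field greater than 5.

```python
import struct

HYPOTHETICAL_OPTION_TYPE = 0x9E  # arbitrary value, for illustration only


def build_header(high_src=None, high_dst=None):
    """Build a minimal IPv4 header (checksum left as zero), optionally
    with a made-up option carrying two 16-bit port extensions."""
    options = b""
    if high_src is not None and high_dst is not None:
        # type, length (6 bytes in total), then the two 16-bit values
        options = struct.pack("!BBHH", HYPOTHETICAL_OPTION_TYPE, 6,
                              high_src, high_dst)
        options += b"\x00" * (-len(options) % 4)  # pad to a 32-bit boundary
    ihl = 5 + len(options) // 4                   # header length in 32-bit words
    fixed = struct.pack(
        "!BBHHHBBH4s4s",
        (4 << 4) | ihl,             # version 4 and IHL
        0,                          # TOS
        ihl * 4,                    # total length (header only, no payload)
        0, 0,                       # identification, flags/fragment offset
        64, 6,                      # TTL, protocol (TCP)
        0,                          # checksum, not computed in this sketch
        bytes([192, 0, 2, 1]),      # source (documentation address)
        bytes([198, 51, 100, 2]),   # destination (documentation address)
    )
    return fixed + options


def has_options(header):
    """True if the IHL field shows the header is longer than 20 bytes."""
    return (header[0] & 0x0F) > 5
```

A header built with build_header(1, 2) is 28 bytes long and
has_options() returns True for it.  That is the kind of packet which,
according to the measurements cited above, is handled on the
"slow path".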

I think the latter part of your article is focussed on IPv4, since
your aim is to make NAT more workable.  Unfortunately, it seems that
the "slow path" handling of packets with IPv4 header options rules
out any solution along the lines you are suggesting.

  - Robin
