[lisp] Critique of Mobile LISP: draft-meyer-lisp-mn-00

Robin Whittle <rw@firstpr.com.au> Thu, 30 July 2009 06:49 UTC

Message-ID: <4A71424A.4000801@firstpr.com.au>
Date: Thu, 30 Jul 2009 16:48:42 +1000
From: Robin Whittle <rw@firstpr.com.au>
Organization: First Principles
User-Agent: Thunderbird 2.0.0.22 (Windows/20090605)
MIME-Version: 1.0
To: lisp@ietf.org
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit
Subject: [lisp] Critique of Mobile LISP: draft-meyer-lisp-mn-00

Short version:   This LISP Mobile approach can't work when the MN's
                 address is behind NAT, as it would be on virtually
                 all domestic, SOHO and commercial Internet services
                 today.  This applies whether the NAT box's public
                 address is a conventional "RLOC" address or a LISP-
                 mapped EID address.

                 It cannot typically achieve continual connectivity
                 when the MN changes to a new address (what would be
                 called a new Care-of Address in conventional MIP or
                 with the TTR Mobility approach).

                 It requires double encapsulation when the MN is on
                 a LISP-mapped EID address.  This is exceedingly
                 burdensome for VoIP packets in IPv6, due to the
                 need to duplicate the IPv6, UDP and LISP headers.
                 This also worsens the Path MTU Discovery problems.

                 For instance, if the MN moves from an RLOC address
                 to an EID address, all the ITRs will need to add
                 a second set of headers.  However, the ITRs have
                 no way of informing the sending hosts that there
                 is now a reduced value for the maximum packet length
                 they should send.

                 Every MN needs a dedicated address for its ETR
                 function.  This is OK in IPv6 but very inefficient
                 in IPv4.

                 The MN's reliance on a frequently distant
                 Map-Resolver will further delay initial packets
                 and reduce the reliability of the system.

                 The TTR Mobility approach has none of these
                 problems.


I read the I-D:

  http://tools.ietf.org/html/draft-meyer-lisp-mn-00

This mentions (p 5):

     Proxy ETR (PETR):  An infrastructure element used to decapsulate
     packets sent from mobile nodes non-LISP sites.  Proxy ETRs are
     described in [LISP-IW].       ^

There seems to be a word missing before "non-LISP".  Maybe "to
non-LISP sites."?

"Proxy ETRs" are not described in:

   http://tools.ietf.org/html/draft-lewis-lisp-interworking-02

I don't understand how a "Proxy ETR" could be used while adhering to
one of the User Requirements (5.1), which is to avoid triangle routing:

   Shortest Path Data Plane:  The LISP-MN architecture MUST allow for
      shortest path bidirectional traffic between a LISP mobile node
      and a stationary node, and between a LISP mobile node and
      another LISP mobile node (i.e., without triangle routing in the
      data path).  This provides a low-latency data path between the
      LISP mobile node and the nodes that it is communicating with.

More on this below.

The 3rd line of page 8 may be missing "changed" or similar after "are
not".

I see on page 9 that Map Servers can act as query servers, rather
than simply forwarding mapping replies to ETRs.  I recall discussing
this a few months ago and being told that they did not do this.  I
think it is a good idea that they be able to respond to the query
themselves.

I also see that MNs never send packets directly to ordinary IP
addresses ("RLOC" addresses, of hosts or whatever which are not on
LISP-mapped "EID" addresses).  Instead, the packet is tunneled to a
PETR (Proxy ETR).  This seems to be at odds with the requirement to
avoid "triangle routing".


Also on page 9:

   Note that a LISP mobile node will need additional interworking
   infrastructure when talking to non-LISP sites [LISP-IW]; this is
   consistent with the design of any host at a LISP site which talks
   to a host at a non-LISP site.

Hosts operating on LISP-mapped addresses have never needed any
additional infrastructure to send packets to (my understanding of
"talking to") hosts with conventional addresses.

Hosts on conventional ("RLOC") addresses in networks without ITRs
have always required "additional infrastructure" to be able to send
packets to hosts on LISP-mapped EID addresses.   I proposed this in
June 2006 and LISP acquired such a mechanism - as a Proxy Tunnel
Router (ITR in the DFZ) - in November 2006.

The following paragraph (pages 9 and 10) explicitly restates that
packets sent from MNs to hosts (stationary or using conventional
mobile IP techniques) on non-LISP networks (that is on "RLOC" rather
than "EID" addresses) will be sent via an intermediate Proxy ETR.
This seems to be at odds with the above mentioned user requirement
for no triangle routing.

On page 11 the location of the Proxy ETR is discussed:

   In general, the PETR will be co-located with the LISP mobile
   node's Map-Server.

(See also discussion of section 12.1)

Generally, no matter where in the world the MN is located, it only
uses a single Map-Server, which is determined by the MN's EID
address.  This EID address and therefore the identity and location of
the Map-Server will not alter from one month to the next, or one year
to the next.   So if the Map-Server is in California and the MN is in
Bangladesh, as long as the Proxy ETR is the same device as the
Map-Server, then if the MN has a packet to send to a non-LISP
destination host in India, the packet will be encapsulated to
California and then be sent in its raw form to India.
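
As a rough illustration of how much path this adds, here is a small
sketch comparing great-circle distances.  The city coordinates are
illustrative assumptions (Dhaka, a point in California and Mumbai) and
real paths follow cables and peering rather than great circles, so
this only indicates the order of magnitude.

```python
import math

def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

# Illustrative coordinates only (assumptions, not from the I-D).
MN_DHAKA = (23.8, 90.4)
PETR_CALIFORNIA = (37.3, -121.9)
HOST_MUMBAI = (19.1, 72.9)

direct_km = great_circle_km(MN_DHAKA, HOST_MUMBAI)
via_petr_km = (great_circle_km(MN_DHAKA, PETR_CALIFORNIA)
               + great_circle_km(PETR_CALIFORNIA, HOST_MUMBAI))
stretch = via_petr_km / direct_km

print(f"direct: {direct_km:.0f} km, via PETR: {via_petr_km:.0f} km, "
      f"stretch: {stretch:.1f}x")
```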

This problem is acknowledged:

   This may add stretch to packets sent from a
   LISP mobile node to a non-LISP destination.

However, this acknowledgement doesn't go far enough.  The truth of
the matter is that this violates the above user requirement to avoid
triangle routing - and it would surely make this form of mobility
unacceptable to most users.

So far in the I-D, there has been no discussion of why this Proxy-ETR
approach is required.  I guess it is required because the local
network which the MN has an address on is likely to have filtering
which would drop any packet sent from the MN with its EID address as
the source address, since this EID would not be part of the local
network's address range.
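
The kind of source address filtering I am guessing at can be sketched
as follows.  This is only a model of the check an access network might
apply; the prefixes are documentation addresses chosen for the example
and the function name is hypothetical.

```python
import ipaddress

def passes_source_filter(src, local_prefixes):
    """True if src falls inside one of the access network's own prefixes."""
    addr = ipaddress.ip_address(src)
    return any(addr in ipaddress.ip_network(p) for p in local_prefixes)

# Documentation prefixes stand in for real addresses (assumptions).
ACCESS_NETWORK = ["192.0.2.0/24"]   # range the MN's current network uses
MN_COA = "192.0.2.77"               # address the MN was given locally
MN_EID = "203.0.113.5"              # MN's permanent EID, from elsewhere

print(passes_source_filter(MN_COA, ACCESS_NETWORK))   # True
print(passes_source_filter(MN_EID, ACCESS_NETWORK))   # False
```

A raw packet with the EID as its source address fails the check, while
an encapsulated packet whose outer source is the local address passes.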

(This problem is solved in the TTR approach to mobility:
 http://www.firstpr.com.au/ip/ivip/#mobile by the MN forming a
 two-way encrypted tunnel to one or more Translating Tunnel Routers
 which are ideally nearby, and which send such packets, in raw form,
 directly to the destination host.)


On page 13, while I haven't tried to understand every aspect of how
the MN updates the mapping information in the ITRs of devices which
are sending packets to it, I am concerned that in some circumstances
this update is not achieved at all - the desired result is instead
achieved by the ITR's mapping cache timing out.

  In particular, a LISP mobile node SHOULD set the TTL on the
  mappings in its Map-Replies to be in 1-2 minute range.

Does this mean either or both of the following?

  1 - That MNs will burden ITRs with the need for frequent
      lookups of mapping, due to the short caching time.

  2 - That this caching time of 1 to 2 minutes also controls how
      long it will take before an ITR can successfully tunnel
      packets to a MN which has changed its RLOC address?

Point 1 is a scaling and efficiency problem - and can most easily be
rectified by making the time very long, such as 30 minutes or a few
hours.

Point 2 is a direct constraint on connectivity when the MN changes
its point of connection, which some MNs will do frequently.  This
problem can be reduced by reducing the caching time.

So the goal of mobile connectivity seems to be directly at odds with
the scalability of the LISP system.
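
The tension can be put in numbers with a deliberately simple model.
It assumes an ITR which is continuously sending to the MN refreshes
its mapping once per TTL, and that no other update mechanism reaches
the ITR when the MN moves.  Both are assumptions made for the sketch,
not statements from the I-D.

```python
def ttl_tradeoff(ttl_seconds):
    """Return (lookups per hour per active ITR, worst-case stale seconds)."""
    lookups_per_hour = 3600 / ttl_seconds
    worst_case_stale = ttl_seconds
    return lookups_per_hour, worst_case_stale

for ttl in (60, 120, 1800, 7200):
    lookups, stale = ttl_tradeoff(ttl)
    print(f"TTL {ttl:>5}s: {lookups:6.1f} lookups/hour, "
          f"up to {stale}s of packets sent to the old RLOC")
```

Shortening the TTL improves one column only by worsening the other.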

   (With the TTR approach, the mapping does not need to change
    whenever the MN gets a new CoA.  It only changes
    when it chooses to use a new TTR.  A new TTR may not be
    required for months - as long as the CoA is within 1000km
    or so - and connectivity is retained no matter how far
    apart the CoA and TTR are.)

Does this mean that, in order to keep LISP scalable and to minimise
the burden on ITRs and the mapping resolution infrastructure, the
user of a MN which moves to a new RLOC address must put up with
(typically) 1 or 2 minutes of lost connectivity?   This would be
completely unacceptable to most users, even if the application-level
sessions survived - and most sessions would time out.

(I will pass over the section on multicast - I haven't read the LISP
multicast stuff and since multicast doesn't work on the global
Internet today, I have never been able to figure out why LISP is
concerned with it.)

Section 9.1 discusses the MN being on a private address, such as
10.x.x.x or whatever - and therefore implicitly behind NAT.  It cites
a not-yet-existent I-D: "draft-x-lisp-nat-traversal-00.txt (work in
progress), June 2009."  I can find no mention of such a draft in the
mailing list.

I cannot imagine how this could work.  Furthermore, I am sure it
won't work.

ITRs need to be able to tunnel packets to ETRs without any prior
arrangement.  The MN can't directly know the address of its NAT box.

There might be some tricky way the MN's Map-Server could figure out
the public address of the NAT box, but there's no way a properly
written NAT function would allow packets sent by some ITR to be sent
to the MN.
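
A toy model of the NAT behaviour I have in mind is below.  It assumes
the NAT only admits inbound packets from a remote host the inside host
has already sent to (address-dependent filtering); more permissive
NATs exist, so treat this as a sketch of the common case rather than
of every NAT.  All names are hypothetical.

```python
class SimpleNat:
    """Toy NAT: inbound packets pass only if they match an outbound flow."""

    def __init__(self):
        self.flows = set()   # (inside_host, remote_host) pairs seen outbound

    def outbound(self, inside_host, remote_host):
        self.flows.add((inside_host, remote_host))

    def inbound_allowed(self, remote_host, inside_host):
        return (inside_host, remote_host) in self.flows

nat = SimpleNat()
nat.outbound("mn", "map-server")                  # MN registers its mapping
print(nat.inbound_allowed("map-server", "mn"))    # True: reply on MN's flow
print(nat.inbound_allowed("itr-1", "mn"))         # False: MN never sent to it
```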

LISP MNs cannot work if they are behind NAT - either where the NAT
box's public address is in conventional ("RLOC") space or is a
LISP-mapped EID address.


Section 9.2 discusses the common scenario in which the MN's address
is a LISP-mapped EID address.

Packets sent to the MN need to go through the following steps:

  1 - First level encapsulation to create a second header (the
      first header is that of the original packet, and contains
      in its destination field the EID address of the MN).

      This encapsulation is performed by one of:

        a - For a non-LISP-mobile sending host in a conventional LISP
            site: an ITR in that site.

        b - For a non-mobile or conventionally mobile sending host in
            a site without an ITR: in a Proxy Tunnel Router (PTR -
            ITR in the DFZ).

        c - For a LISP-mobile sending host: in the ITR function of
            that MN.

      The destination address of this second header is the EID
      address which the destination MN is using in a manner analogous
      to the "Care-of-Address" CoA in conventional mobile IP or in
      the TTR Mobility approach.   This is the address by which
      the destination MN's ETR function can be reached.

  2 - Second level encapsulation - the same ITR as mentioned above
      recognises that it has just applied an EID address in the
      previous step (normally, prior to this Mobility I-D, all
      such encapsulation was to an RLOC address, which explicitly
      cannot be an EID address), so it looks up the mapping for that
      EID address and finds it mapped to a real RLOC address.
      (I will assume the MN's "CoA" is not within another LISP MN's
      network, which would require yet another level of headers.)

      The destination address of this third header is the ETR
      address of the site at which the MN has its address (the
      address I consider to be its CoA address, although the I-D
      does not use this terminology).

  3 - The resulting packet is forwarded, typically across the DFZ,
      and arrives at the ETR of the MN's site.  That ETR pops off
      the 3rd header and forwards the packet to the internal address
      within that network which is the MN's "CoA" address - within
      the LISP-mapped EID space this network uses.

  4 - The packet arrives at the MN's ETR function, which strips off
      the 2nd header and passes the original packet to its stack
      for normal processing.
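
The four steps above can be traced with a small simulation.  A packet
is modelled as a list of headers, outermost first.  The mapping table
and all the names in it are hypothetical, and the lookup loop has no
protection against circular mappings.

```python
# Hypothetical mapping table: EID -> locator.
MAPPING = {
    "mn-eid": "mn-coa",          # MN's EID -> its "CoA", itself an EID
    "mn-coa": "site-etr-rloc",   # that "CoA" -> RLOC of the site's ETR
}

def itr_encapsulate(packet):
    """Steps 1 and 2: push one outer header per mapping lookup."""
    dest = packet[0]["dst"]
    while dest in MAPPING:
        dest = MAPPING[dest]
        packet = [{"dst": dest}] + packet
    return packet

def decapsulate(packet):
    """Steps 3 and 4: an ETR strips the outermost header."""
    return packet[1:]

original = [{"dst": "mn-eid", "payload": "data"}]
on_the_wire = itr_encapsulate(original)
print([h["dst"] for h in on_the_wire])
# ['site-etr-rloc', 'mn-coa', 'mn-eid']

at_mn_etr = decapsulate(on_the_wire)   # step 3: site ETR pops 3rd header
delivered = decapsulate(at_mn_etr)     # step 4: MN's ETR pops 2nd header
print(delivered == original)           # True
```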

There are at least two problems with this arrangement:

  1 - This doubles the packet overhead.  This is particularly a
      concern for IPv6, where it is doubling the IPv6 header, the UDP
      header and the LISP header.  Now the headers are way longer
      than many data packets, including especially VoIP packets.

  2 - This worsens the problems with Path MTU Discovery - longer
      packets exceeding MTU limits.

      For instance, if the MN is originally on an RLOC address then
      the LISP ITR and the sending host *may* be able to set up
      the sending host's packet length to match the limitations
      of the path with one set of headers (IP, UDP and LISP).

      However, if the MN moves to a LISP-mapped EID address, then
      the ITR will need to apply a second set of headers.  It has
      no way of conveying to the sending host that the maximum
      packet length should now be shorter - so either the packets
      will be dropped at the limiting router, or the ITR would need
      to fragment the packet and send it as two, for the ETR in the
      destination network to reassemble.
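
To put figures on both problems: the sketch below uses the standard
fixed header sizes (IPv4 20, IPv6 40, UDP 8, LISP 8, RTP 12 bytes).
The VoIP packet with 20 bytes of codec payload and the 1500 byte path
MTU are assumptions chosen for the example.

```python
IPV4, IPV6, UDP, LISP, RTP = 20, 40, 8, 8, 12   # header sizes in bytes

def encap_overhead(ip_header, levels):
    """Bytes added by `levels` layers of IP + UDP + LISP encapsulation."""
    return levels * (ip_header + UDP + LISP)

# Assumed example: a VoIP packet carrying 20 bytes of codec payload
# (for instance 20 ms of G.729) over RTP/UDP/IPv6.
voip_packet = IPV6 + UDP + RTP + 20

for levels in (1, 2):
    extra = encap_overhead(IPV6, levels)
    print(f"IPv6, {levels} level(s): +{extra} bytes on an "
          f"{voip_packet}-byte packet; a 1500-byte path MTU leaves "
          f"{1500 - extra} bytes for the sending host's packet")
```

With two levels the added headers (112 bytes) exceed the 80 byte VoIP
packet itself, and the largest packet the sending host can safely emit
drops from 1444 to 1388 bytes without the host being told.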

There also needs to be some logic in the ITR to ensure that it won't
keep adding headers if the mapping of one EID is also to an EID which
is mapped to another EID.
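
One possible form of that logic is sketched here: follow the chain of
mappings, but give up on a cycle or beyond a fixed depth.  The
function name, the exception and the depth limit of 3 are all
hypothetical choices, not anything specified by LISP.

```python
class MappingLoopError(Exception):
    pass

def resolve_locator_chain(eid, mapping, max_depth=3):
    """Follow EID -> locator mappings until a non-EID (RLOC) is reached.

    Returns the outer destinations to apply, innermost first.
    Raises MappingLoopError on a cycle or when max_depth is exceeded.
    """
    chain, seen, current = [], {eid}, eid
    while current in mapping:
        current = mapping[current]
        if current in seen or len(chain) >= max_depth:
            raise MappingLoopError(f"refusing to encapsulate towards {current}")
        seen.add(current)
        chain.append(current)
    return chain

print(resolve_locator_chain("mn-eid", {"mn-eid": "mn-coa", "mn-coa": "rloc-1"}))
# ['mn-coa', 'rloc-1']

try:
    resolve_locator_chain("a", {"a": "b", "b": "a"})
except MappingLoopError as err:
    print("dropped:", err)
```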


Section 12.1 contains a requirement that the Proxy ETR only accept
packets sent from RLOCs which it is authorised to receive packets
from.  This authorisation information must come from the Map-Server
which handles the EID which the MN's EID prefix is a part of.

Even if a particular RLOC address is registered, this does not ensure
that only packets from the MN will be handled by the Proxy ETR.
There's no way the Proxy ETR could verify the real source of packets.
The outer header's source address could be spoofed and so could the
inner header's.
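
A sketch of why an address check alone is not enough: the only thing
the Proxy ETR can test is the address written in the header.  The
registered RLOC and the packet fields below are hypothetical.

```python
# Hypothetical: the RLOC the Map-Server has authorised for this MN.
REGISTERED_RLOCS = {"192.0.2.77"}

def petr_accepts(outer_src):
    """Accept if and only if the outer source is a registered RLOC."""
    return outer_src in REGISTERED_RLOCS

genuine = {"outer_src": "192.0.2.77", "sender": "mn"}
spoofed = {"outer_src": "192.0.2.77", "sender": "attacker"}

# The "sender" field is invisible to the PETR, so both are accepted.
print(petr_accepts(genuine["outer_src"]), petr_accepts(spoofed["outer_src"]))
# True True
```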

In summary, the problems I see with this LISP Mobile approach include:

  1 - Cannot work with the MN behind NAT, whether the NAT box's
      public address is an ordinary RLOC address or on a LISP-mapped
      EID address.

  2 - Requires "triangle routing" via a Proxy ETR for packets sent by
      the MN to non-LISP destination hosts.

          In principle, the Map-Server could have multiple Proxy-ETRs
          all over the world and somehow tell the MN to use one
          closest to its current "CoA" address.  This would start
          to resemble the TTR Mobility approach.

  3 - Requires double encapsulation if the MN's "CoA" is on a LISP-
      mapped EID address.  This is inefficient and leads to worse
      PMTUD problems.

  4 - AFAIK, in some or many scenarios, there would be lost
      connectivity when the MN changes its "CoA" - or at the very
      least a really undesirable trade-off between the goals of
      reducing this and of making the LISP system scalable.

  5 - Does not scale well in IPv4, since each MN needs a dedicated
      address for its ETR function in addition to its own EID
      address.  This ETR address cannot be used by any other device.

      While that ETR address may well be a LISP-mapped EID address,
      this is still chewing two addresses from the IPv4 space for
      a single mobile device.

  6 - Relying on a typically distant Map-Resolver will increase the
      delay times for the MN's internal ITR function being able to
      send initial packets.  It will also reduce the reliability
      of the system compared to the usual LISP arrangement where
      the Map-Resolver is in the sending host's own network.


Dino and colleagues:  Steve Russert and I proposed the TTR Mobility
approach in August last year:

   http://www.firstpr.com.au/ip/ivip/#mobile

The TTR Mobility approach is applicable to LISP.

MNs work fine behind NAT and there is no problem with triangle
routing, assuming that there is a network of TTRs such that the MN
can generally use one which is nearby.  (These would be commercially
profitable systems, so there would be competition between TTR network
providers.)

There is no loss of connectivity when the MN changes its CoA,
assuming it has a few seconds to set up one CoA before losing the
current one.   Connectivity will be continual from all CoAs, no
matter how distant the TTR, but optimal path lengths, lower latency
and reduced packet loss rates will be achieved when the MN uses a
nearby TTR.

From what I can see the TTR approach would be better than what you
are proposing in this new LISP Mobility draft.

What objections would you have to using the TTR approach instead?

  - Robin