[BEHAVE] comments on draft-donley-nat444-impacts

"Dan Wing" <dwing@cisco.com> Thu, 04 November 2010 17:09 UTC

From: Dan Wing <dwing@cisco.com>
To: draft-donley-nat444-impacts@tools.ietf.org, c.donley@cablelabs.com, william.howard@twcable.com, victor.kuarsingh@rci.rogers.com, abishek.chandrasekaran@colorado.edu, vivek.ganti@colorado.edu
Date: Thu, 04 Nov 2010 10:09:57 -0700
Cc: behave@ietf.org
Subject: [BEHAVE] comments on draft-donley-nat444-impacts

draft-donley-nat444-impacts-01 is somewhat misleading.  It claims to analyze
NAT444, but it really analyzes what fails when two problems occur: (a) port
forwarding isn't configured and (b) UPnP is unavailable or is broken.
Several architectures share those two problems:

  * NAT444 (NAPT44 in the home + NAPT44 in the carrier's network)
  * LSN (NAPT44 in the carrier's network, without a NAPT44 in the home)
  * DS-Lite (which is an LSN / NAPT44 in the carrier's network)
  * stateful NAT64

It is unfortunate, to put it mildly, that draft-donley-nat444-impacts-01
singles out NAT444 as the sole architecture with these two problems.  

In all four architectures, port forwarding isn't available via UPnP IGD
because the large-scale NAT doesn't support UPnP IGD.  In the first
architecture (NAT444), the in-home NAT might still have UPnP IGD enabled,
which can confuse applications attempting to use UPnP IGD.  Was UPnP IGD
enabled in the home CPE in your tests?  The error message from the XBox test
*implies* that UPnP IGD was disabled, but I don't know for sure.  UPnP IGD
needs to be disabled in the CPE in all four of the above architectures,
because UPnP IGD doesn't know how to speak to upstream NATs (and no carrier
NAT on the market supports UPnP IGD).  When the PCP working group specifies
a UPnP IGD-to-PCP interworking function, and it is implemented in CPE, it
will be safe (and useful) to enable UPnP IGD on the in-home CPE in all four
architectures.  Until then, enabling UPnP IGD in a CPE is harmful when the
CPE is behind any sort of NAT.
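
The failure mode described above can be illustrated with a toy model.
This is only a sketch, not how any particular CPE or carrier NAT is
implemented; the Napt class, the deliver_inbound helper, the host names
and the port number are all hypothetical.  It shows that a mapping
installed on the in-home NAT (by UPnP IGD or by manual port forwarding)
does nothing for inbound traffic unless the upstream NAT also has a
matching mapping.

```python
class Napt:
    """Toy NAPT that only tracks static inbound port forwards."""

    def __init__(self, name):
        self.name = name
        self.forwards = {}  # external port -> (inside host, inside port)

    def add_forward(self, ext_port, host, port):
        self.forwards[ext_port] = (host, port)


def deliver_inbound(path, port):
    """Walk the NATs from outermost to innermost.

    Returns the inside host that finally receives the packet, or None
    if some NAT on the path has no mapping and drops the packet.
    """
    host = None
    for nat in path:
        hit = nat.forwards.get(port)
        if hit is None:
            return None
        host, port = hit
    return host


home = Napt("home")
lsn = Napt("lsn")
home.add_forward(6881, "pc", 6881)   # what UPnP IGD would install on the CPE

print(deliver_inbound([home], 6881))        # home NAT only: reaches "pc"
print(deliver_inbound([lsn, home], 6881))   # LSN in front: dropped (None)

lsn.add_forward(6881, "cpe", 6881)          # what a PCP-like request could add
print(deliver_inbound([lsn, home], 6881))   # reaches "pc" again
```

The same drop occurs for any path whose outermost NAT is the carrier's,
whether or not a home NAT sits behind it.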



Detailed review:

"2.1. NAT444 Additional Challenges

   There are other challenges that arise when using shared IPv4 address
   space, as with NAT444.  Some of these challenges include:

   o  Loss of geolocation information - Often, translation zones will
      cross traditional geographic boundaries.  Since the source
      addresses of packets traversing an LSN are set to the external
      address of the LSN, it is difficult for external entities to
      associate IP/Port information to specific locations/areas.

   o  Lawful Intercept/Abuse Response - Due to the nature of NAT444
      address sharing, it will be hard to determine the customer/
      endpoint responsible for initiating a specific IPv4 flow based on
      source IP address alone.  Content providers, service providers,
      and law enforcement agencies will need to use new mechanisms
      (e.g., logging source port and timestamp in addition to source IP
      address) to potentially mitigate this new problem.  This may
      impact the timely response to various identification requests.
      See [I-D.ietf-intarea-shared-addressing-issues]

   o  Antispoofing - Multiplexing users behind a single IP address can
      lead to situations where traffic from that address triggers
      antispoofing/DDoS protection mechanisms, resulting in
      unintentional loss of connectivity for some users."

None of these three points is specific to NAT444.  They are aspects
common to any address sharing system, including any large-scale NAT --
NAT444, LSN, DS-Lite, NAT64 (especially stateful NAT64), A+P -- and it
makes no difference whether the subscriber operates a NAT in their home.
draft-ietf-intarea-shared-addressing-issues does a good job of summarizing
all of the issues.  Thus, these are not "Additional Challenges" of NAT444,
as stated in the title of Section 2.1.
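
To make the lawful intercept/abuse point concrete, here is a minimal
sketch of why source IP alone is not enough once addresses are shared,
and why source port plus timestamp is.  The Binding record, the LOG
contents and the subscriber names are invented for illustration; real
LSN logging formats differ.

```python
from typing import NamedTuple, Optional


class Binding(NamedTuple):
    ext_ip: str
    ext_port: int
    start: int       # seconds since some epoch
    end: int
    subscriber: str


# Hypothetical binding log from an address-sharing device.
LOG = [
    Binding("192.0.2.10", 40000, 100, 200, "subscriber-A"),
    Binding("192.0.2.10", 40001, 150, 400, "subscriber-B"),
    Binding("192.0.2.10", 40000, 300, 500, "subscriber-C"),  # port reused later
]


def by_ip(log, ip):
    """Everyone who ever used this external address."""
    return sorted({b.subscriber for b in log if b.ext_ip == ip})


def by_flow(log, ip, port, ts) -> Optional[str]:
    """The one subscriber holding (ip, port) at time ts, if any."""
    for b in log:
        if b.ext_ip == ip and b.ext_port == port and b.start <= ts <= b.end:
            return b.subscriber
    return None


print(by_ip(LOG, "192.0.2.10"))
print(by_flow(LOG, "192.0.2.10", 40000, 350))
```

Nothing in the lookup depends on whether the subscriber also runs a NAT
at home, which is the point: the problem belongs to address sharing in
general.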


Section 3's diagrams would be clearer if, instead of "home router", the
diagrams clearly showed that the device in the home is a NAPT44.

 
Section 3 -- were any tests run without the LSN?  That is, were tests run
with just the "home router", with UPnP disabled?  This would separate
failures caused by "port forwarding not enabled" from failures caused by
NAT444 itself and from failures caused by the LSN itself.


Section 3 -- was an RFC1918 address used between the in-home CPE and the
LSN?  If so, did the in-home CPE disable its NAT function (turn itself into
a bridge)?  If not, were any problems attributable to the in-home CPE
enabling its 6to4 function, or the in-home CPE mistakenly thinking its WAN
address was publicly routable on the Internet?


Section 3.1,

   "Also, large FTP downloads experienced issues when
    translation mappings timed out."

I assume that refers to the FTP control connection being timed out while the
data connection is active.

In a NAT444 environment, there are two elements that might time out FTP's
control TCP connection:  (a) the home NAT or (b) the large-scale NAT:

 a. If the home NAT causes the timeout, that problem exists today with
    all subscribers using that specific firmware of that specific home
    NAT. This is not a problem caused by NAT444.
 b. If the LSN causes the timeout, this affects all LSN deployments
    (stateful NAT64, NAT444, DS-Lite, etc.), and is something that
    should be captured in a requirements document somewhere ("for
    FTP, don't time out the control connection if the data
    connection is still active" or suchlike).  This is not a problem
    caused by NAT444.
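
The requirement suggested in (b) can be sketched as follows.  This is
a hypothetical model of a NAT's idle-timeout sweep, not any vendor's
implementation; the expired function, the "control" field linking a
data mapping to its control mapping, and the timer values are all
assumptions made for the example.

```python
def expired(mappings, now, idle_timeout, protect_control=False):
    """Return the names of mappings the NAT would time out at `now`.

    With protect_control=True, a control mapping is kept alive for as
    long as a data mapping that refers to it is still active.
    """
    alive = {name for name, m in mappings.items()
             if now - m["last_seen"] <= idle_timeout}
    if protect_control:
        for name in list(alive):
            control = mappings[name].get("control")
            if control is not None:
                alive.add(control)
    return sorted(set(mappings) - alive)


# A long download: the control connection has been silent for an hour,
# while the data connection saw a packet ten seconds ago.
mappings = {
    "ftp-control": {"last_seen": 0},
    "ftp-data": {"last_seen": 3590, "control": "ftp-control"},
}

print(expired(mappings, now=3600, idle_timeout=1800))
print(expired(mappings, now=3600, idle_timeout=1800, protect_control=True))
```

Whichever NAT on the path runs the first variant will break the large
download; a second NAT in front of or behind it does not change that.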

I can't tell if the FTP problem mentioned in 3.1 ("mappings timed out") is
the same as "performance degraded on very large downloads", mentioned in
Section 4.1.  Is it the same?


Section 3.1,

   "Bittorrent seeding also failed during some tests."

Not caused by NAT444.

draft-boucadair-behave-bittorrent-portrange explains why in detail; in
short, port forwarding is needed for successful BitTorrent
seeding.  The inability to forward ports is a problem of all LSN
architectures at this time (NAT444, LSN, DS-Lite, NAT64).  An ISP-operated
portal and recipe (like http://portforward.com) could solve the problem
for all of the LSN architectures.  Or wait for PCP.


Section 3.2, 

  "Torrent leeching was performed
   from the two clients to a public server in the Internet.  The
   observed speed was considerably slower than with only one client
   connected to the home network."

Not caused by NAT444.

This is a problem with *any* address sharing when two hosts behind the
same IP address are attempting to retrieve the same data from the same
remote BitTorrent peer.  BitTorrent behaves like that on purpose, in 
order to reduce the threat of leeching bandwidth.  The bt.allow_same_ip 
setting, on the remote BitTorrent hosts, controls this behavior.  See 
draft-boucadair-behave-bittorrent-portrange which is an extensive
analysis of BitTorrent behavior in conjunction with IP address
sharing.
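
A minimal sketch of the behavior described above, assuming the remote
peer simply refuses a second connection from a source IP address it is
already serving.  The accept_peers function and the addresses are made
up for illustration; this is not the code of any real BitTorrent
client, and only the allow_same_ip name echoes the bt.allow_same_ip
setting.

```python
def accept_peers(incoming, allow_same_ip=False):
    """Return the (ip, port) pairs the remote peer agrees to serve."""
    accepted = []
    seen_ips = set()
    for ip, port in incoming:
        if not allow_same_ip and ip in seen_ips:
            continue  # second host behind the same shared address
        seen_ips.add(ip)
        accepted.append((ip, port))
    return accepted


# Two clients behind one shared address, as seen by the remote peer.
incoming = [("203.0.113.7", 51000), ("203.0.113.7", 51001)]

print(accept_peers(incoming))
print(accept_peers(incoming, allow_same_ip=True))
```

The remote peer sees the same thing whether the two clients share an
address because of a home NAT, an LSN, or both.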

draft-boucadair-behave-bittorrent-portrange also shows that, absent
port forwarding, BitTorrent leeching is considerably slower than with
port forwarding.  Anyone can verify that behavior in the comfort of
their own home.  I found the same torrent, after 4 hours of running,
was downloading at only 50kbps without port forwarding but would
download 10x faster with port forwarding.  A large legal torrent to
test with is one of the ~160Gb torrents from three Nine Inch Nails concerts
listed at http://forum.nin.com/bb/read.php?52,378166 -- they have
only a few seeds, so the behavior of (stingy) peers is more obvious.


Section 3.2,

  "It is generally noted that the performance decreases in
   bandwidth intensive applications."

Sure, but is that different from today's behavior (without an LSN)
in some way?


  "Netflix video
   streaming is also observed to be considerably choppy.  When streaming
   starts on one client, it does not start on the other, generating a
   message saying that the Internet connection is too slow."

Unlikely a fault of NAT444.

Was the Internet connection in fact too slow?  That is, was this test 
successful without an LSN?  Was this test successful with two hosts in 
the same home network without a NAPT44 in the home?  I doubt "NAT444" was 
at fault here, either.  But this I-D doesn't indicate whether any other
testing of this scenario was done to determine whether the cause was NAT444,
a bona fide bandwidth issue, the in-home NAPT44, or something else.


Section 4.1,

  Bittorrent leeching:  pass

Without port forwarding, I don't think I would characterize this as "pass"
-- it is painfully slow in my experience to do BitTorrent leeching without
port forwarding.  But that is not a problem solely with NAT444:  it is a
problem of any type of NAT (LSN, NAT444, DS-Lite, home NAT) if port
forwarding isn't configured.


  Xbox network test: fail: Your NAT type is moderate.  For best online 
                           experience you need an open NAT configuration.  
                           You should enable UPnP on the router.

Not a problem specific to NAT444.

The error "your NAT type is moderate" means some ports aren't forwarded.  I
found this nifty article explaining that the error is caused by port
forwarding not working:
http://www.question-defense.com/2010/02/15/xbox-360-your-nat-type-is-moderate-configure-port-forwarding-on-your-router
It says:

  "You will need to add two port forwards which include port 3074 and 
   port 53 and make sure in both cases that you are forwarding both 
   the UDP and TCP ports."

and which is also stated in the "Open network ports" section of the
following link on the XBox website,
http://support.xbox.com/en-us/pages/xbox-live/troubleshoot/marketplace.aspx

If this information is accurate -- and it certainly appears to be accurate
-- XBox will not work well behind any sort of NAPT if there is more than
one XBox behind the NAPT.  Don't buy two XBoxes for the same home; a quick
search on the Internet reveals that this appears to be a common problem with
XBox.  Of course the same problem occurs with large scale NAT44 of any kind
(LSN without in-home NAT, NAT444, and DS-Lite) -- only one subscriber can
use all of their XBox's functionality over the Internet.
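
The reason is simple bookkeeping: a static forward for a given protocol
and external port can point at only one inside host.  A rough sketch,
assuming a forwarding table keyed on (protocol, external port); the
add_forward helper and the host names are hypothetical, and port 3074
is taken from the article quoted above.

```python
def add_forward(table, proto, ext_port, inside_host):
    """Install a static port forward; refuse if the port is already taken."""
    key = (proto, ext_port)
    if key in table and table[key] != inside_host:
        return False
    table[key] = inside_host
    return True


table = {}
first = [add_forward(table, p, 3074, "xbox-1") for p in ("udp", "tcp")]
second = [add_forward(table, p, 3074, "xbox-2") for p in ("udp", "tcp")]

print(first)    # the first console gets both forwards
print(second)   # the second console is refused on both
```

Replace "xbox-1" and "xbox-2" with two subscribers behind one LSN
address and the table behaves identically.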

The second half of that XBox error message encourages the user to enable
UPnP.  But of course that won't fix anything with any of the LSN scenarios
(NAT444, DS-Lite, LSN).  Users won't know that, though.


  Nintendo Wii: pass behind one LSN, fail behind another

Would be useful to know what caused that success and that failure, and if it
is specific to NAT444 or common with all large-scale NAT (DS-Lite, LSN,
etc.), or due to UPnP IGD being broken/disabled.


  Team Fortress 2: fail, pass behind one LSN, but performance degraded


Would be useful to know what caused that success and that failure, and if it
is specific to NAT444 or any large-scale NAT (DS-Lite, LSN, etc.), or due to
UPnP IGD being broken/disabled.


   Slingcatcher: fail

Slingcatcher is no longer manufactured by Sling Media, but I am surprised by
the failure because Sling does a *lot* of tricks to deal with NAPT.  (Wish
it was still manufactured by Sling Media, I could use one myself.)  Does a
regular Sling Player also fail (such as a PC connected to the Internet,
attempting to view content on an in-home Slingbox)?  Does Slingcatcher work
without LSN?

  Netflix Party (Xbox): fail, pass behind one LSN

Would be useful to know what caused that success and that failure, and if it
is specific to NAT444 or common with all large-scale NAT (DS-Lite, LSN,
etc.), or due to UPnP IGD being broken/disabled.

  Webcam: fail

This must have failed because of lack of port forwarding, so would be common
to all large-scale NAT architectures (LSN, NAT444, DS-Lite).

  6to4: fail

Does this mean that 6to4 failed to come up, or that attempted IPv6
connectivity was broken because the CPE's WAN IPv4 address is not globally
routable?  Of all the problems listed, this one problem appears to be the
only one that is directly attributable to NAT444, and caused by using a
non-RFC1918 address on the CPE's WAN interface.  Note the severity of this
problem is decreasing quickly as modern OSs have a very low preference for
6to4 addresses (Windows 7,
http://packetblog.wordpress.com/2010/10/03/windows-7-user-you-are-ipv6-capable;
OSX 10.6.5,
http://lists.cluenet.de/pipermail/ipv6-ops/2010-September/003932.html).
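
For readers unfamiliar with why the WAN address matters: 6to4 derives
its IPv6 prefix directly from the IPv4 address (2002:V4ADDR::/48), so a
CPE that wrongly believes its WAN address is public will advertise a
prefix that cannot be reached from the Internet.  The sketch below
assumes a naive CPE whose only check is "is my WAN address in RFC1918
space"; that heuristic, the function names and the example addresses
are assumptions for illustration, not taken from the draft.

```python
import ipaddress

RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def naive_cpe_enables_6to4(wan_ipv4):
    """Assumed CPE heuristic: enable 6to4 unless the WAN is RFC1918."""
    addr = ipaddress.IPv4Address(wan_ipv4)
    return not any(addr in net for net in RFC1918)


def sixtofour_prefix(wan_ipv4):
    """Build the 2002:V4ADDR::/48 prefix for an IPv4 address."""
    v4 = int(ipaddress.IPv4Address(wan_ipv4))
    return ipaddress.IPv6Network(((0x2002 << 112) | (v4 << 80), 48))


# An RFC1918 WAN address: the naive CPE leaves 6to4 off.
print(naive_cpe_enables_6to4("10.1.2.3"))

# A non-RFC1918 address that is still not globally routable
# (198.18.0.0/15 is reserved for benchmarking): 6to4 comes up
# with a prefix nobody on the Internet can reach.
print(naive_cpe_enables_6to4("198.18.0.1"), sixtofour_prefix("198.18.0.1"))
```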

  Teredo: fail

Teredo was designed to work through all sorts of icky NATs.  I run two NATs
at home, and have successfully set up Teredo through both of them, without
doing any port forwarding.  Let's just say that I "have doubt" this test was
conducted properly.


I didn't repeat my comments for Sections 4.2, 4.3, and 4.4 (they are similar
to my comments for 3.*).

-d