Re: [Tsvwg] WGLC for Port Randomization starts now (April 1st)

Mark Allman <mallman@icir.org> Wed, 15 April 2009 03:32 UTC

To: "James M. Polk" <jmpolk@cisco.com>
From: Mark Allman <mallman@icir.org>
In-Reply-To: <XFE-RTP-201eWYXI4Ht000037b9@xfe-rtp-201.amer.cisco.com>
Organization: International Computer Science Institute (ICSI)
Date: Tue, 14 Apr 2009 23:33:07 -0400
Sender: mallman@icir.org
Message-Id: <20090415033307.F00C0CD585E@lawyers.icir.org>
Cc: tsvwg <tsvwg@ietf.org>
Subject: Re: [Tsvwg] WGLC for Port Randomization starts now (April 1st)

> This is the start of the WGLC for Port Randomization, which will
> last for 14 days - starting April 1st, going through April
> 15th. Please review this document and post comments to the TSVWG
> list and authors.

I read the document today.  Comments:

  - The title is bogus.  The prose carefully says that the document is
    not about port randomization, but about port obfuscation.  The title
    should be corrected.

  - While the early part of the document does talk in terms of
    obfuscation and not randomization, the entire document needs to be scrubbed
    because there are plenty of places where the document discusses
    "randomization" when it really means "obfuscation".  I flagged a
    bunch of these but am not including them in this email because there
    are just too many.  I suggest searching for "random" and fixing all
    the bogus instances.

  - Section 1: "tihs" --> "this"

  - Section 1: "produce a random" --> "produce a mathematically random"

  - Section 2.1: There is a claim that the ephemeral port space is
    "traditionally" from 49152-65535.  I think it is more traditional
    for hosts to use monotonically increasing ports starting at 1024.
    Even if hosts do use this upper range they also use this lower range
    a bunch.  I'd nuke this statement as probably wrong, or at least
    dubious.

  - Section 2.2: suggest tacking "at a given point in time" to the end
    of the last sentence.

  - In 2.2 and several other places the notion of a 'blind' attacker is
    stretched.  I.e., the document discusses attackers that cannot see
    some particular connection directly but can view some other, perhaps
    related, communication with the host.  It might be nice to see this
    developed when the concept of a 'blind attack' is developed because
    it isn't quite blind and it feels a little funny being introduced in
    the middle after we have been told the document deals with blind
    attacks.

  - In 2.3 you might note that an alternative approach is for both hosts
    to keep state about recent connections.  I am not arguing it is a
    good approach or a bad approach...just an alternate approach that
    would work towards minimizing the collision rate.

  - In 3.1 collisions are discussed as "interoperability problems".  I
    don't think that is quite right.  I think the real issue is that
    there is added delay in getting things set up.

  - I found the notion of a bit array in 3.2 to be way too much detail.
    Just note that a host could introduce a system to track which port
    numbers are going to be used specifically by applications and leave
    it at that.  The bit array isn't the Clear Right Answer so just
    leave the data structures out.

  - Regarding the notion:

      Although carefully chosen random sources and optimized five-tuple
      lookup mechanisms (e.g., optimized through hashing) will mitigate
      the cost of this verification, some systems may still not want to
      incur this search time.

    I remain of the opinion that this is just used-car-salesman talk.
    Hosts have to be able to associate incoming packets with connection
    state on every packet arrival.  That is the same kind of lookup they
    have to do to check whether a chosen connection-id is in use.  The
    above is simply FUD.  Trying a couple of these connection-ids is in
    no way an onerous task.  I strongly suggest this sentence be
    removed.

  - Further in 3.3.1 you note that web proxies and NATs are examples of
    systems that "create many connections from a single local IP address
    to a single service".  I think that's pretty dubious.  You might say
    that they make more connections to popular services than end hosts
    do (because of the aggregation) and thus increase the population of
    used ephemeral ports and hence the chance of collision using Alg. 1
    or 2.  But, I think it's sort of dubious to just leave it hanging as
    if these things hit the problematic case as a matter of course,
    which is not generally true, I bet.

  - Again, the paragraph under "figure 3" is dubious.  First, it is
    predicated on the notion that lookup time is somehow consequential,
    which is pretty off-the-mark as I noted above.  Second, the data we
    have suggests that there are *fewer* port choices required to handle
    remote collisions with Alg. 2 when compared to Alg. 1.  While that
    is not exactly what is being discussed in this section it seems like
    we can use remote collisions as a proxy here for what the local
    collision rate likely looks like.  (See table 3 in [Allman].)  And,
    in fact, going a step further nearly all the collisions are taken
    care of by two port choices.  So, it's hard to believe that this
    playing up of the possibility of some arduous search through the
    port space is really warranted.

  - Perhaps I missed it, but it would be nice to define the "traditional
    BSD algorithm" early in the document.

  - Section 3.3.4: "that even a" --> "that a"

  - Section 3.3.4: "systems recommend" --> "systems we recommend"

  - Section 3.3.4: It'd be nice to see some reasoning about why a table
    size of 1024 is being recommended.  It seems like this is
    arbitrarily pulled out of the air.  I think it'd be fine to leave
    this to the implementer.  I.e., note that [Allman] shows 10 entries
    will work fine for collision avoidance and increasing the table size
    increases the obfuscation and leave it at that.  If you pick a
    number then at least motivate it somehow.

  - 3.3.5: "introduced yet another" --> "introduced another"

  - 3.3.5: The algorithm is wrong.  This is my fault.  The paper is not
    clear.  Alas.  When a collision is found we do not increment by one,
    but choose a new random increment.  This is just not discussed in
    the paper, but should have been.  However, I looked at the code for
    my simulator and that is how I did it.

  - 3.3.5: I'd like to see more prose about algorithm 5 to mirror the
    other sections.  I.e., to give the intuition behind it, etc.

  - 3.5: You should likely add the caveat to our results that goes "in
    the two environments assessed" or something like that.

  - I think it's slanted that you sort of report results for Algorithms
    1 and 2, but not for the remainder of the algorithms.  The "sort of"
    being that you just say "they were the worst".  We have numbers
    here.  Let the data do the talking.

  - And, if I had one request about data it would be to note that no
    matter what algorithm is chosen the fraction of hosts we saw having
    *any* collisions is exceedingly small.  I.e., the collision problem
    is really a non-problem for most cases.  And, so it might be
    reasonable to use something dumb and simple in the general case and
    a little more expensive in a less general case.  At least that's how
    I think the data really comes down.

  - [Allman] should be:

      Mark Allman.  Comments On Selecting Ephemeral Ports.  ACM Computer
      Communication Review, 39(2), April 2009.
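
To make the 3.3.5 correction above concrete, here is a minimal sketch of
the behaviour described there: on a collision the next candidate is
reached by drawing a new random increment rather than stepping by one.
This is an illustration only, not the draft's text and not the simulator
code.  The port range, the MAX_STEP bound, the function name, and the use
of a Python set for the in-use ports are all assumptions made for the
example.

```python
import random

MIN_PORT = 1024    # assumed lower bound of the ephemeral range
MAX_PORT = 65535   # assumed upper bound of the ephemeral range
MAX_STEP = 500     # assumed largest increment; a hypothetical tuning knob


def next_port_random_increment(last_port, in_use, rng=random, max_step=MAX_STEP):
    """Return a free port reached from last_port by random increments.

    Returns None if no free port is found within a bounded number of tries.
    """
    num_ports = MAX_PORT - MIN_PORT + 1
    port = last_port
    for _ in range(num_ports):  # bound the number of attempts
        # Draw a fresh random increment on every attempt, so a collision
        # is followed by a new random step, not by a step of one.
        step = rng.randint(1, max_step)
        # Advance and wrap around inside [MIN_PORT, MAX_PORT].
        port = MIN_PORT + (port - MIN_PORT + step) % num_ports
        if port not in in_use:  # set membership test for "already in use"
            return port
    return None
```

The in-use check here is a single hash lookup per candidate, which is the
same kind of lookup discussed in the comment on search time above.  With
max_step set to 1 the sketch degenerates to a plain sequential search.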

That's a pretty long list.  But, I don't think there is really anything
difficult in there.  In general I think the document is in pretty decent
shape, but I would like to see most of the above addressed before the
document moves forward.  FWIW.

allman