Re: Last Call: <draft-ietf-sidr-rpki-rtr-19.txt> (The RPKI/Router Protocol) to Proposed Standard

Terry Manderson <terry.manderson@icann.org> Thu, 22 December 2011 01:43 UTC

From: Terry Manderson <terry.manderson@icann.org>
To: "ietf@ietf.org" <ietf@ietf.org>
Date: Wed, 21 Dec 2011 17:43:23 -0800
Subject: Re: Last Call: <draft-ietf-sidr-rpki-rtr-19.txt> (The RPKI/Router Protocol) to Proposed Standard

Hi,

Apologies for my lack of attention to this topic to date. I am speaking only
for myself here.

At this stage I do not support this document's publication.

Starting with the document structure, I see no reference to a set of
requirements. The introduction is rather vague, and if anywhere, that is
where I would expect to see such a requirements description. This means that
for the rest of the document I found myself asking "why" on many levels.

When I got to the end of the document I felt that the protocol borders on a
wheel re-invention exercise. Think of the router as simply a client of a
cache that provides RIB access tokens for a route, using a mechanism that is
secure, stable, scalable, known (by both vendors and operators), and
extensible. For such a service I'm more likely to swing to RADIUS, with
nicely structured AV-Pairs and sane timers for reauth/retry etc. Even the
SMEs know RADIUS for their WPA enterprise kit.


Glossary:

Global RPKI: 
I disagree with this definition for two reasons. 1) I'm not aware of a
unified definition for 'distributed system' so this is all rather vague.
Perhaps you could say 'published at a disparate set of systems'. 2) Limiting
the servers to be "at" the "IANA, RIRs, NIRs, and ISPs" is also premature.
It's not clear to me that these entities will run their own repositories,
nor are they going to be the only repository operators in the lifecycle of
the RPKI.

Cache:
The words surrounding the fetch/refresh mechanisms of the RPKI are limiting.
Both draft-ietf-sidr-repos-struct and draft-ietf-sidr-res-certs allow for
other (future) retrieval mechanisms as defined by the repository operator
beyond RSYNC (loosely documented in RFC 5781).

Last sentence: "Trusting this cache further is a matter between the provider
of the cache and a relying party". In my mind the Relying Party is the one
that does the RPKI validation. Would this not be better stated as "Trusting
this cache further is a matter between the provider of the cache and the
router operator"?

Deployment Structure:

Why repeat the definition of "Global RPKI"? It's superfluous.

Local Cache: Again, 'Relying party' seems to be borrowed from the
CA/identity world. Unless you redefine that term here, it seems as if the
"router" is making RPKI validation decisions, which it is not. The router is
acting more like a NAS (see RADIUS, RFC 2865) when talking to a local cache.

The definition of "routers" seems to get this right, e.g. "a client of the
cache".

Operational Overview

When you first use "ROA", please expand the TLA and provide a reference.

Serial Query

I don't remember seeing a recommendation for how often a client (router)
sends a serial query. Is there a Min/Max? Surely doing it every second would
be excessive.
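
A minimal sketch of the kind of bound being asked about here. The names and
both numbers are hypothetical placeholders, not values taken from the draft;
the only point is that a stated Min/Max lets an implementation clamp whatever
interval an operator configures.

```python
# Hypothetical bounds: the draft under review does not state these values.
MIN_POLL_SECONDS = 60
MAX_POLL_SECONDS = 3600


def clamp_poll_interval(requested_seconds: int) -> int:
    """Keep a router's serial query interval inside the stated bounds."""
    return max(MIN_POLL_SECONDS, min(requested_seconds, MAX_POLL_SECONDS))
```

With bounds like these, a configured interval of one second would be raised
to the minimum rather than honoured.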

IPv4 Prefix:

"and nothing prohibits the existence of two identical
   route: or route6: objects in the IRR."

Why even mention the IRR here? It just doesn't seem at all relevant (and it
isn't defined).

"IPvX PDUs": expand to IPv4 or IPv6. Lumping both into one is misleading
under a heading of 'IPv4 Prefix'.

IPv6 Prefix

Some text here to say that the IPv6 data structure follows the same
semantics as the IPv4 data structure would be good. Alternatively,
restructure the document to describe the semantics first, then the IPv4 and
IPv6 data structures as subheadings under Prefix PDUs.

Error Report

What is "excessive length" for a PDU? At what point do you say "OK, now I
can truncate"?

Fields of a PDU

For all types, instead of using "ordinal", can you use the exact description
of the number, e.g. unsigned integer? For me, I always relate ordinals to set
theory.


PDU type: the e.g. is incomplete. Shouldn't it be "IPv4 Prefix = 4", with a
forward reference to the IANA Considerations section?
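
To illustrate why "unsigned integer" is the more useful wording, here is a
hedged sketch of packing an IPv4 Prefix PDU. The field order and widths are
an assumption based on a reading of the draft, and the function name is
hypothetical; only the type value 4 comes from the comment above.

```python
import ipaddress
import struct

PDU_TYPE_IPV4_PREFIX = 4  # the value the review asks to have spelled out


def pack_ipv4_prefix_pdu(flags: int, prefix_len: int, max_len: int,
                         prefix: str, asn: int) -> bytes:
    """Pack one IPv4 Prefix PDU; every field is an unsigned integer.

    Assumed layout, network byte order, 20 octets in total:
    version(8) type(8) reserved(16) length(32)
    flags(8) prefix_len(8) max_len(8) zero(8) prefix(32) asn(32)
    """
    prefix_bytes = ipaddress.IPv4Address(prefix).packed
    return struct.pack("!BBHIBBBB4sI",
                       0, PDU_TYPE_IPV4_PREFIX, 0, 20,
                       flags, prefix_len, max_len, 0,
                       prefix_bytes, asn)
```

Stating each field as an unsigned integer of a given width maps directly onto
format codes such as `B`, `H` and `I`, which "ordinal" does not.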

Serial Number: "for example via rcynic" is not defined and is implementation
specific! There is also a typo: "completing an rigorously validated". While
there, consider why you use the term 'rigorously'. Are there situations
when a validation is less rigorous? If so, explain.

Session ID

What is the risk of a cache server starting/restarting with the same session
ID and serial number as before, but with different cache contents? Is this
an entropy concern? I'm just thinking of a potential scenario where a router
is cache-wedged. Is this at all probable? If not, why not? Some words here to
cover this would be good.

Flags

Can you reword the binary choice here? Do you actually need to delve into
'right to announce'? This is really about RIB entry behaviors, yeah?

Expand "IPvX".


Start or Restart:

I think the terms for when a router needs to send a serial query or a reset
query need to be tighter. Saying MAY here is too loose. I would much prefer
to see a structure where, if the router does not have a recorded serial for a
cache from a previous session, the router MUST send a reset query. Logically
you assume that to be the case, so be specific.

Thereafter the router MAY send a reset query, and SHOULD send a serial
query. I suspect this is what the vendors (who have chimed in on the list)
have coded.

This then corroborates section 4 where you suggest the router only send
serial queries for efficiency.
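
The tighter rule proposed above can be sketched as follows. The function and
the returned labels are hypothetical names, not taken from the draft, and the
sketch always takes the SHOULD branch when a serial is on record.

```python
from typing import Optional


def choose_initial_query(recorded_serial: Optional[int]) -> str:
    """Pick the first query a router sends to a cache on start or restart."""
    if recorded_serial is None:
        # No serial recorded from a previous session: MUST send a reset query.
        return "reset_query"
    # A serial is on record: SHOULD send a serial query (a reset stays a MAY).
    return "serial_query"
```

Note the explicit `is None` test: a recorded serial of 0 is still a recorded
serial, so it must not be treated as "nothing recorded".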


Transport:

MiTM is Man in the middle as I and many others know it. 'Monkey/piggy/pickle
in the middle' is a child's ball game.

" Therefore, as of this document, there is no mandatory to
   implement transport which provides authentication and integrity
   protection."

If this is the case, then why? What is the gain? Why not then make the
router fetch the signed objects and do the validation internally? This again
seems to be the 'missing requirements' problem, and in such events there
are a million solutions.

SSH Transport

State up front that you MUST use SSHv2 (instead of hinting at it in the
third paragraph).

TLS Transport
"Man in The Middle (MiTM)" please.

Router Cache setup

"When a more preferred cache becomes available, if resources allow, it
   would be prudent for the client to start fetching from that cache."

How does the client (I assume router) know when to do this, as caches are
not synchronized? How does a router tell if any particular cache has more
current data than another cache? What if two caches contradict each other?

Error codes

6: Withdrawal of Unknown Record (fatal). Why drop the session (which
presumably causes a restart)? Assuming the cache is corrupt, the router
reconnects to a cache that will then send another Unknown Record, which is
fatal... (repeat).

Why not mark the cache as corrupt at the client?
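
A minimal sketch of that alternative, with client-side bookkeeping that
flags a cache instead of reconnecting to it in a loop. Only the error code
value 6 comes from the text above; the class, the functions and the recovery
policy are hypothetical.

```python
# Error code 6 as named above; every other name here is hypothetical.
WITHDRAWAL_OF_UNKNOWN_RECORD = 6


class CacheEntry:
    """Client-side bookkeeping for one configured cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.corrupt = False


def on_error(cache: CacheEntry, error_code: int) -> None:
    """Flag the cache as corrupt rather than blindly reconnecting to it."""
    if error_code == WITHDRAWAL_OF_UNKNOWN_RECORD:
        cache.corrupt = True


def usable_caches(caches: list) -> list:
    """Return the caches the router may still query, in configured order."""
    return [c for c in caches if not c.corrupt]
```

How and when a flagged cache becomes usable again (operator action, a timer,
a new session ID) is left open here, as it is in the question above.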


Security Considerations:

Transport Security: There are multiple valid options for a root trust anchor,
including the structure from the IAB aligning it to the IANA. Perhaps
instead of saying "the IANA root trust anchor", say "Global RPKI root trust
anchor". Otherwise you might accidentally find your validated cache only
covers unallocated and reserved blocks.

Cheers
Terry