Re: [v6ops] [GROW] Deaggregation by large organizations

Christopher Morrow <christopher.morrow@gmail.com> Thu, 16 October 2014 15:20 UTC


(apologies for again not reformatting apple-mail->gmail issues)

On Thu, Oct 16, 2014 at 5:43 AM, Iljitsch van Beijnum
<iljitsch@muada.com> wrote:
> Let me address a few points that were brought up by different people.
>
> Renumbering:

I think the renumbering ship sailed, and like the Titanic had a bad
voyage. You seem to agree.

> A prefix length limit for the IPv6 DFZ:
>
> Someone mentioned that this didn't work in IPv6. When Sprint decided to make that /18, that didn't really work. But there's a de facto /24 limit that everyone understands. With IPv6, that would translate into a /48. Obviously no router can hold 2^48 or 2^45

I suppose there's a question about: "Is a /24 a LAN (/64) or a SITE
(/48)" to be answered still. I believe SITE works better, but :)

> prefixes, so as a backstop against accidental/malicious IPv6 routing table explosion this doesn't help. Even exploding a /28 or so into individual /48s would kill the IPv6 DFZ.

mostly the protection here comes from motherhood + apple-pie bgp
neighbor configs though, right? (max-prefix in particular).
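
To put a number on the "/28 into individual /48s" case, here is a minimal arithmetic sketch. The function name and the max-prefix value are hypothetical, picked only for illustration; they are not taken from any real router config.

```python
def more_specifics(covering_len: int, specific_len: int) -> int:
    """Count the prefixes of length specific_len inside one prefix of length covering_len."""
    if specific_len < covering_len:
        raise ValueError("specific_len must be >= covering_len")
    return 2 ** (specific_len - covering_len)


# Hypothetical per-neighbor max-prefix limit (an assumption for this sketch).
MAX_PREFIX_LIMIT = 50_000

count = more_specifics(28, 48)
print(count)                     # 1048576
print(count > MAX_PREFIX_LIMIT)  # True: a max-prefix limit of this size would trip
```

So a single /28 exploded into /48s is about a million routes, which is the sort of event a per-neighbor max-prefix setting is meant to stop at the session level.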

> What COULD work is to have prefix length limits depending on the allocation size by the RIRs. Something like:
>
> 2100::/16 -> /48
> 2200::/16 -> /32
> 2200::/15 -> /29

This looks, to me, like the 'PA space is >= /32, PI >= /48' in each
region's PA/PI splits... which ends up being fairly stable and fairly
simple prefix-list/etc configs on devices, right? so that seems kind
of sane even. I had thought the Cymru folk had something like this in
their templates:

<https://www.team-cymru.org/ReadingRoom/Templates/IPv6Routers/junos.html>

isn't what I was looking for, sadly...and is the only sort of ipv6 /
route-filter related thing I could quickly find on their site... perhaps
some room for improvement in this area exists. Should that be an IETF
action or 'get the de facto standards/bcop folk to update their docs
based on the best shared wisdom' ?
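
A rough sketch of what a per-block length filter could look like, using the three example blocks from the quoted text. Two things here are assumptions of the sketch and not part of the proposal: when blocks overlap (2200::/16 sits inside 2200::/15) the most specific block decides, and anything outside the listed blocks is rejected.

```python
import ipaddress

# Block -> longest acceptable prefix length, copied from the quoted example.
LIMITS = {
    ipaddress.ip_network("2100::/16"): 48,
    ipaddress.ip_network("2200::/16"): 32,
    ipaddress.ip_network("2200::/15"): 29,
}


def accept(prefix: str) -> bool:
    """Return True if the announced prefix is no longer than its block allows."""
    net = ipaddress.ip_network(prefix)
    covering = [block for block in LIMITS if net.subnet_of(block)]
    if not covering:
        return False  # assumption: unknown space is dropped
    # Assumption: the most specific covering block sets the limit.
    block = max(covering, key=lambda b: b.prefixlen)
    return net.prefixlen <= LIMITS[block]


print(accept("2100:1234::/48"))   # True
print(accept("2200:1234::/32"))   # True
print(accept("2201:8000::/32"))   # False, only /29 or shorter allowed there
```

The table is small and static, which is the point made above about this being a simple prefix-list style config.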

>
> However, for this to work well the RIRs would have to group allocations of the same size into separate blocks, with the result that it would no longer be possible to reserve space to grow an allocation. (Things like allocating a /48 but reserving a /44 reduce the opportunities for prefix length filtering because now the strictest filter you can make allows 16 x as many prefixes worst case than average case. The worst and average case need to be as close together as possible.)
>

my impression (and I stopped paying close attention to ARIN at least
2+ yrs ago) was that RIRs were allocating PA from one large (/23?)
block per RIR and PI from a different...

For my work-location-provider:
  2620:0:1000::/40 - PI /40
  2001:4860::/32 - PA /32
  2607:F8B0::/32 - PA /32

<http://www.iana.org/assignments/ipv6-unicast-address-assignments/ipv6-unicast-address-assignments.xhtml>

doesn't delineate the difference between 2001:4800::/23 and 2600::/12
and 2620::/23 ... bummer. I suppose this though:
  <https://www.arin.net/knowledge/ip_blocks.html>

is meant to delineate the differences as 'allocation blocks' vs
'assignment blocks'... (I didn't search the same sort of stuff out on
ripe/apnic/etc...)
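
For what it's worth, a quick sketch with Python's ipaddress module that checks which of the three IANA-level blocks named above covers each of the three prefixes. It only confirms containment; it says nothing about whether a block is used for allocations or assignments, which is exactly the information the IANA table lacks.

```python
import ipaddress

# The three IANA-level blocks and the three prefixes mentioned in the message.
IANA_BLOCKS = [ipaddress.ip_network(b) for b in ("2001:4800::/23", "2600::/12", "2620::/23")]
PREFIXES = ["2620:0:1000::/40", "2001:4860::/32", "2607:F8B0::/32"]


def covering_block(prefix: str):
    """Return the most specific listed block containing prefix, or None."""
    net = ipaddress.ip_network(prefix)
    matches = [b for b in IANA_BLOCKS if net.subnet_of(b)]
    return max(matches, key=lambda b: b.prefixlen) if matches else None


for p in PREFIXES:
    print(p, "->", covering_block(p))
# 2620:0:1000::/40 -> 2620::/23
# 2001:4860::/32 -> 2001:4800::/23
# 2607:F8B0::/32 -> 2600::/12
```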

> I'd say that allowing two or three extra bits for traffic engineering for PA blocks would be good. So for the part of the IPv6 space where /29s are allocated, allow /31s or /32s. As traffic engineering incoming traffic by deaggregation requires that different parts of the aggregate all generate similar levels of incoming traffic, this wouldn't usually work for organizations using PI so I'd say don't allow deaggregating below /48.
>

sure, this gets into the above 'give me a default understanding of the
iana -> rir ranges + RIR purposes (alloc/assign) and add +N bits for
TE'. It's not quite answering the other part of your question about:
"What will the DE gov't do? how will they run their new shiny network
such that they don't have reachability concerns?"
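
A small sketch of the "+N bits for TE" idea: accept a more specific only if it sits inside the allocation and is at most N bits longer. The /29 used here is a made-up allocation for illustration, and the function names are hypothetical.

```python
import ipaddress


def te_allowed(allocation: str, announced: str, extra_bits: int = 2) -> bool:
    """True if announced is inside allocation and at most extra_bits longer."""
    alloc = ipaddress.ip_network(allocation)
    net = ipaddress.ip_network(announced)
    return net.subnet_of(alloc) and net.prefixlen <= alloc.prefixlen + extra_bits


def worst_case_routes(extra_bits: int) -> int:
    """Routes per allocation if every permitted length is announced: 1 + 2 + ... + 2^N."""
    return 2 ** (extra_bits + 1) - 1


ALLOC = "2001:db8::/29"  # hypothetical /29 allocation

print(te_allowed(ALLOC, "2001:dba::/31"))                 # True  (+2 bits)
print(te_allowed(ALLOC, "2001:db8::/32"))                 # False (+3 bits, limit is 2)
print(te_allowed(ALLOC, "2001:db8::/32", extra_bits=3))   # True
print(worst_case_routes(2), worst_case_routes(3))         # 7 15
```

With two extra bits the worst case is 7 routes per allocation, with three it is 15, so the choice of N bounds the table growth directly.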

> Geographic communities:
>
> I know this is controversial. "Topology ain't geography". Actually, most of the time there is a significant correlation. If all German cities inject a more specific, do you really need to hear those in Tokyo or Seattle? Just send the traffic to Europe as per the aggregate and let them figure it out there.
>

this argues for the DE gov't having 1 ISP for all their cities and
putting the aggregate into BGP as well, with the more specifics
no-export'd from the ISP, not for communities, which won't reliably
make it across AS boundaries.
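
As a toy model of the aggregate plus no-export'd more specifics: the ISP holds both routes but only the aggregate leaves the AS. The community value is the well-known NO_EXPORT from RFC 1997; the prefixes and the route structure are invented for the sketch.

```python
NO_EXPORT = (65535, 65281)  # well-known NO_EXPORT community (RFC 1997)

# Hypothetical routes held inside the single ISP.
routes = [
    {"prefix": "2001:db8::/32", "communities": set()},            # aggregate
    {"prefix": "2001:db8:1::/48", "communities": {NO_EXPORT}},    # city A
    {"prefix": "2001:db8:2::/48", "communities": {NO_EXPORT}},    # city B
]


def export_to_ebgp(rib):
    """Return the prefixes that may be advertised outside the AS."""
    return [r["prefix"] for r in rib if NO_EXPORT not in r["communities"]]


print(export_to_ebgp(routes))  # ['2001:db8::/32']
```

The rest of the world sees one route, and the per-city detail stays inside the ISP that can actually use it.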

> Compiling a list of communities that identify regions/countries/cities would allow for experimentation in this place without any downsides that I can see. Don't like this? Filter the communities. There's a handy list that you can copy and paste into your filter.
>
> Injecting an aggregate as a point of last resort:
>
> I think this can be done today and probably is done today. But a document describing how to do it would probably be helpful. I'm thinking along the following lines:
>

ok

> The AoLR (Aggregate of Last Resort) service would entail a service provider announcing the aggregate without necessarily providing connectivity towards all the places announcing more specifics covered by the aggregate. So if ISP A announces the AoLR and ISP B provides connectivity to a more specific, ISP C would send traffic to A as per the aggregate and then A would immediately hand it over to B.
>
> So as part of the AoLR service, a service provider would agree to accept all more specifics that fall under the aggregate (up to an agreed prefix length) from all the networks providing connectivity towards those more specifics. This would be an attractive service for tier-1s to provide, because presumably, they peer with everyone everywhere, so in the case where they receive the traffic over peering and need to deliver it to another service provider over peering, this could probably happen in the same city, so they wouldn't carry the traffic over long distances. But the (sub-)organization(s) in question still gets to buy connectivity from a wider range of smaller service providers.
>
> In practice an organization would contract two or more service providers to provide the AoLR service for redundancy.
>

this doesn't sound unreasonable, make txt pls appear in draft form,
then email that to grow@ and see what debate happens there.
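
The forwarding behaviour described for AoLR is plain longest-prefix match, which can be sketched as below. The tables, prefixes and next-hop labels are all hypothetical: ISP C only hears the aggregate from A, while A also carries the more specific learned from B.

```python
import ipaddress


def lookup(table: dict, dest: str):
    """Longest-prefix match: return the next hop of the most specific covering route."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in table.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical routing tables for the A/B/C example in the quoted text.
ISP_C = {"2001:db8::/32": "ISP A"}
ISP_A = {"2001:db8::/32": "local aggregate", "2001:db8:1::/48": "ISP B"}

dest = "2001:db8:1::1"
print(lookup(ISP_C, dest))  # ISP A
print(lookup(ISP_A, dest))  # ISP B
```

C follows the aggregate to A, and A's more specific hands the traffic to B, which matches the "A would immediately hand it over to B" step.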

> Wouldn't they just get PI:
>
> Yes. That's why I think it's important to find a way to give these organizations what they need in a way that keeps the IPv6 DFZ growth on a workable trajectory.
>
> AS numbers:
>
> BGP assumes that an AS always has internal connectivity. This can be accomplished using tunnels, but it's much better to simply have separate AS numbers for each subunit. Would it make sense to allocate ranges of AS numbers to enterprise LIRs? Certainly with 32-bit AS numbers there's no lack of numbers, and this would allow tools to be developed to work on CIDR-like AS number ranges in the future.
>

it seems like the AS here is really not relevant, unless you were
thinking that: "AS is a proxy for GEO data." (or AS == SITE) One
thought is that AS provides the mapping to 'authoritative entity for
the routing policy surrounding/using the prefixes originated by the AS
in question', so is it simpler to remember: "DE govt is AS65534" or
"DE gov't could be one of 160 AS numbers" (a range might help, but
ranges imply some idea about growth in the future. Would you have
planned for RU to add Crimea?)

The AoLR (or 'how to be an enterprise that participates in the global
routing system') seems like a good start though.

-chris