SIDR

IETF 85 meeting minutes

 

Introductions and overview

  • Minutes: Wes Hardaker
  • Note: the connection randomly dropped a few times and I lost text because of it...  grr....  Sorry for the holes.

    

Protocol Discussion -- Matt

  • Status overview given (see slides)
  • Updates to -06 came from reviews and from the last interim
  • If others think we should keep something like the additional_info field, please discuss on the list
  • Randy thinks we should keep it
  • Matt will try to create something that will future proof us against bad ideas from <name dropped for protection>
  • General Comments/Discussions
  • ?: I'd like to see implementation before we advance
  • Hans: Does that mean deployment?
  • No
  • Jeff: One problem with running code before we say it's a standard: should we go experimental?
  • Wes: we have almost completed a working implementation, and I think we've heard of others
  • ... missed something because I was standing/sitting ...
  • Can we consider state of what should be stored in the router?  I'm concerned about what it'll entail.  Can we mention somewhere what this might be?
  • Matt: we can consider adding such information, but it may be hard to determine.  There are some things in sidr-bgpsec-ops and I wonder if that's a better place to put it.
  • True, that's a fine place.  I just want to see it before 
  • Randy: to answer Danny, yes, this discussion has happened at length
  • Danny: hasn't seen discussion, please provide a pointer

Local TA Management -- Andrew

  • Summarizing changes and contents
  • We believe this document is ready for WG last call
  • no comments

 

BGPSEC router key roll over -- Brian

  • Status summary
  • no comments

 

Discussion and Analysis of Variants of Key Rollover Method for Replay Protection -- Sriram

  • Describes document and rollover suggestions
  • Shane: Are you modeling strictly the interdomain BGP updates (ebgp only)?
  • yes
  • The analysis doesn't account for magnification for explosion internally to those ASes
  • route reflector is not included
  • So this is the best case scenario
  • yes, slide 10 shows the scenario...  if you want to change it we can talk about it
  • I'm suggesting that we need to account for the large number of paths within those ASes to get a better analysis of what the large group does to the system at the end of the day.
  • I'd be happy to work with you to come up with other scenarios
  • Danny M: I think this is really useful work to look at this thing.  Thanks.
  • Brian W: I assumed that not-valid certs would be invalid after an update via CRLs
  • In PKR you only roll the transit certs when a topology change happens
  • If event methods are based on certs in the cache...
  • In one instance, you issue a CRL.
  • That's normal PKR
  • Yes, it propagates all the way to the CRL to the router to the cache.  In the other option you use the notvalid time option.
  • Not valid after time is interesting: What you're suggesting is when the two objects are split, you should no longer use this key and you have to do a roll over before that expires.
  • You can propagate a few of those 'next certs' ahead of time
  • Doug: <missed it>
  • Shane: there is memory overhead and CPU on every router
  • <editor lost connection>
  • if your concern is there is additional memory overhead, your concern is the footprint tomorrow is going to be worse?  And you want to know what that is
  • Shane: correct.  I want to know what the overhead is going to be.  We know it will be, but I want to know what beaconing and other things will do to the number of updates, etc.  What additional load is placed based on BGPSEC with key rollover, etc.  I want to know what that difference is.
  • There is data on slide 8
  • (07:02:17) Randy Bush (all):
  • brian is right
  • ops doc says advance MUST be done
  • extra announcements do not take more memory, they replace old.  but indeed if we wanted to go down beaconing etc, much more real modeling should be done
  • Jeff: Assuming it's about that order of mag, every ebgp speaker is going to have to do more work.  That's a given.
  • I think the overload of crypto processing is on ebgp only
  • jeff: work happens at the edges, that's the primary pain point if we do validation primarily at the edges.  It would be a flood distribution rather than a convergence event.  The memory wouldn't change much, but the CPU power would.
  • Eric O: if I reintroduce an update in the system without knowing that complexity, then it could cost a lot of other people resources (re-signing).  You get amplification.
  • I appreciate the comment, but we should talk one on one to make sure what you want is included
  • that's fine
  • John S: one more point on ibgp: if his number in ibgp is 15x, then you could take that graph (8) and multiply by 15 and you'd be done.  Just the Y axis would change?
  • Yes, if 15 is the expected change then it would
  • John: note this is a log scale on that line
  • Danny: this is a fundamental change to BGP.  The amount of churn because of periodic updates means lots of churn.
  • Randy: but beacon discussion is whack-a-mole.  we decided against it.  done, gone, dead.
  • We are not redoing beacon; this is different
  • This is a historic discussion of previous analysis
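
John S's point about scaling the slide 8 graph can be checked with a little arithmetic. A minimal sketch in Python, assuming a purely illustrative set of eBGP-only update counts (the values below are hypothetical, not taken from the slides) and the 15x iBGP factor floated in the discussion: on a log-scale axis, multiplying every value by 15 shifts the whole curve up by a constant log10(15), about 1.18 decades, without changing its shape.

```python
import math

# Hypothetical eBGP-only update counts (placeholders, not data from slide 8).
ebgp_updates = [1_000, 10_000, 100_000]

# The 15x iBGP multiplier mentioned in the discussion.
IBGP_FACTOR = 15


def log_shift(values, factor):
    """Return the per-point vertical shift, in decades, on a log10 axis."""
    return [math.log10(v * factor) - math.log10(v) for v in values]


shifts = log_shift(ebgp_updates, IBGP_FACTOR)

# Every point moves by the same amount: log10(15).
for value, shift in zip(ebgp_updates, shifts):
    print(f"{value:>7} -> {value * IBGP_FACTOR:>9}  shift = {shift:.3f} decades")
```

A constant multiplier only holds if the internal fan-out is roughly the same for every AS in the model, which is the assumption Shane questioned.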

 

AS Migration -- Wes George

  • moved to 11:20 session

 

Multiple Publication Points -- Carl

  • Rüdiger: at least one thing seems wrong...  the threat is not that repositories present the data in an incomplete fashion, but that the CA can withdraw or insert information and there might be inconsistent data in the repositories.
  • i don't think this is a solution for this problem; it provides a comparison
  • the comparison is already taken care of by the manifest
  • only if you've been fetching all along; not if you're a newcomer
  • CA operations need transparency
  • it's definitely not a solution
  • new: if you have one copy in NA, and one in SA you can still get one or the other
  • Andrew: let's not talk about black helis.  Let's talk about high-avail.  What does this buy you that you can't get in other ways?
  • it allows you not to use DNS if you don't want to.  Right now you do round-robin DNS or load balancer.  It's an additional tool, not a solution.  Just a different way of high availability.
  • Tim: I don't think this is necessarily the only way to go for this.  I think having multiple publication points is useful, and this gives us flexibility.
  • Randy:  either the repositories are inconsistent or this does nothing for the helicopters.
  • I'm not claiming this is a solution for black helicopters.  Obviously validation will break, but you still have 
  • Sam: I'm unconvinced
  • Rob: I'm not a fan.  You have the same problems with setting up mirrors, the same problems with repositories.  you have to check multiple places, and decide which to use.  If you check them all, they'll all be happy.
  • It's not the idea you need to check them all.

 

 

PQ of the CP -- Andy Newton

  • Sean Turner: I've written some CPs.  Who's going to be looking at the pointer to go figure this out?  The router?  Web browsers are completely different.  If you're talking a managed service, do you not know who you're contracting with?  We should keep certs small.  This is built off of RFC 5280.  It should just be an OID.
  • Our requirement was we had to have a pointer to our CPS
  • Russ: he was way too polite.  I've been arguing for a must-not-use certificate policy qualifiers.  They're harmful.  Nothing here needs one of these.
  • Danny: ?
  • Nobody is going to process it, so you're not communicating anything to a relying party
  • I do know where there is legislation requiring this.  We didn't decide to do it, it was a legal requirement to do it.
  • Steven: Sean's right

 

Validation of RPKI -- Tim

  • Randy (who typed it without caps, of course): I do not think anyone has problems with not using objects not in manifest.  The requirement to do so just comes from the document, which can be changed.  This has little to do with the transport.
  • I'm not sure I understand the comment
  • Rob: I think Randy meant there is no technical reason to search outside the manifest.
  • Randy: I was agreeing with you Tim
  • Andrew: I also agree it would simplify implementation if we said that RPs aren't required to look at everything in the manifest.  It doesn't mean you can't use it to look in.  You can use it for fallback.  Can you use manifests as your discovery mechanism?
  • Carlos: I like this idea.  Using the manifest in this way lets us break free of rsync recursive fetch
  • Rob: as an RP and CA implementor, this would make my life simpler.
  • Andrew: at the moment, I believe the SKI and AKI are SHA-1 hashes.  They were not meant to be a security benefit.  They were a locator.  Is that a problem?
  • Tim and Rob: no, it's just for discovery.  you still need to check the signatures.
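
The idea of using the manifest as the discovery mechanism can be sketched as follows. This is a minimal illustration in Python, assuming the manifest has already been parsed into a mapping of file name to SHA-256 digest; the function name `select_listed_objects` and the sample data are hypothetical. Objects not listed in the manifest are ignored, and, as Tim and Rob note, anything accepted here would still need its signatures checked.

```python
import hashlib


def select_listed_objects(manifest_entries, fetched_objects):
    """Keep only fetched objects that the manifest lists with a matching hash.

    manifest_entries: dict mapping file name -> expected SHA-256 hex digest
    fetched_objects:  dict mapping file name -> raw bytes retrieved
    Returns (accepted, missing): accepted maps name -> bytes, missing is a
    sorted list of names that were listed but absent or mismatched.
    """
    accepted, missing = {}, []
    for name, expected in manifest_entries.items():
        data = fetched_objects.get(name)
        if data is not None and hashlib.sha256(data).hexdigest() == expected.lower():
            accepted[name] = data
        else:
            missing.append(name)
    return accepted, sorted(missing)


# Hypothetical repository contents, for illustration only.
roa = b"example ROA bytes"
manifest = {"a.roa": hashlib.sha256(roa).hexdigest(), "b.cer": "00" * 32}
fetched = {"a.roa": roa, "stray.roa": b"not listed in the manifest"}

accepted, missing = select_listed_objects(manifest, fetched)
print(sorted(accepted))  # objects to pass on to signature validation
print(missing)           # listed objects that could not be confirmed
```

Because the relying party only asks for names it found in the manifest, it never needs a recursive directory listing, which is the property Carlos points to.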

 

Updating on RPKI Validator Testing -- Andrew

  • Danny: have you done anything on modeling today's routing system and modelled with routing keys?  I.e., a couple of million objects.  What would a fully loaded system look like?
  • We haven't yet.

 

RPKI origin validation in real life -- Carlos

  • Shane: what was the fetching interval?
  • It's the validator's default behavior.  By default it's every 4 hours.
  • Russ H: we're running validation here, and I'm looking forward to a presentation [on the results].  The challenge I told them here was to have origin validation up and running for Orlando, but they found it easy and got it up and running immediately.  But we couldn't get a certificate because they only run PA space and we have PI space.
  • Rüdiger: ?
  • it didn't make much of a difference.  everything gets routed by default anyway.  With IPv6 it was different because we dropped about 800 routes and there was no default so they didn't get out.
  • if aggregates exist, the traffic could flow to the correct destination.  But more specific routes would be given for traffic validation, etc, and it would be interesting to see if it had impacts on latency, etc, because some routes were dropped.
  • True, it should be researched
  • Tim: highest priority should be certifying more space.  We try to help the users in our system by comparing ROAs but it's up to the user to do this.  Data quality is a concern.
  • Sandy: at the ARIN meeting the RIPE experience with their users and their attempt to assist their users was suggested as a possible enhancement to their tool
  • Doug: first order simple measurement would be to look at the paths
  • Randy: note that the ietf noc did not have the guts to drop invalid.  @shane, do you intentionally break your running network?
  • Wes: happy eyeballs states that if the v6 routes were dropped the v4 would be the fallback and that had a default route

 

Origin validation at the IETF -- Warren K.

  • About two weeks before the IETF we were asked to do it, so we started immediately
  • [slides]
  • We were talking about this during the break and you mentioned something about traffic going to invalid routes.
  • We were wondering, if we dropped invalid routes would it affect anything?  Figuring that out was a bit tricky.  We thought about running netflow, but had disk space issues.  We thought about building a firewall out of the results and then having the counters increment based on the rules.  From what we saw, the huge majority of the traffic comes from unknown prefixes.   We did get some traffic from valid (4m/s) and there was some invalid (100k/s).  We concluded that if we had dropped invalids we would have affected users.
  • Carlos: what were you using?
  • The rpki.net software.  It was being run out of cron.
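
The valid / invalid / unknown split Warren describes follows the usual origin validation classification. A minimal sketch in Python, assuming ROAs are available as simple (prefix, max length, origin AS) tuples; the `classify` function and the sample data (documentation prefixes and AS numbers) are hypothetical and are not the rpki.net software or the firewall counters used at the meeting.

```python
import ipaddress
from collections import Counter


def classify(route_prefix, origin_as, roas):
    """Classify a route as 'valid', 'invalid', or 'unknown'.

    roas: iterable of (prefix, max_length, asn) tuples.
    A ROA covers the route when the route prefix falls inside the ROA prefix.
    """
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for prefix, max_length, asn in roas:
        roa_net = ipaddress.ip_network(prefix)
        # Skip ROAs of the other address family or that do not cover the route.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if asn == origin_as and route.prefixlen <= max_length:
            return "valid"
    # Covered by some ROA but never matched: invalid. No covering ROA: unknown.
    return "invalid" if covered else "unknown"


# Hypothetical ROAs and routes, for illustration only.
roas = [("192.0.2.0/24", 24, 64496), ("2001:db8::/32", 48, 64497)]
routes = [
    ("192.0.2.0/24", 64496),     # matching ROA
    ("192.0.2.0/25", 64496),     # longer than max length
    ("192.0.2.0/24", 64499),     # wrong origin AS
    ("198.51.100.0/24", 64496),  # no covering ROA
    ("2001:db8:1::/48", 64497),  # matching ROA within max length
]

counts = Counter(classify(prefix, asn, roas) for prefix, asn in routes)
print(dict(counts))
```

Counting traffic rather than routes, as was done at the meeting, would mean attaching each flow to the state of its best-matching route; that step is not shown here.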

 

AS Migration -- Wes George

  • Wes: Caveat: I'm not the expert on this.
  • Randy: But does not the solution Sandy presented in Amsterdam solve this?  Do you know of problems with Sandy's approach?
  • I think it does, but I need people to confirm that
  • Shane: for modeling done it's useful to consider effects of having multiple ASes persist indef. within carrier networks.  The world is not as clean as a single AS.  For larger operators, you're looking at a done of these ASes persisting within a carrier indefinitely going forward.  That should be noted in the key-rollover
  • Wes: it's basically there.  We can't recommend finishing a migration as quickly as possible. [miss]
  • Shane: I agree.  It would be useful to expand a little bit on that path beyond the obvious two that we put in there, so look for another update before the next IETF

 

Migratory ASs -- Sandy

  • Note: This is an approach that is an amalgamation of various discussions
  • Jon: Right now the draft says informational not standards track.  Does it matter to you which?  It seems like the tricks and the knobs work no matter what
  • Sandy: No opinion

 

BGP L3VPN Origin validation 

  • Chair comment: don't freak out about signing with shared private keys
  • I'd really encourage this WG to decide this is a problem and whether these situations are common.
  • Sandy: I'd like to comment that this work is not being brought to this WG to become a sidr item, it was brought because we might provide useful advice.
  • This problem does exist and can happen.  As an L3 service provider you can definitely increase your reach by using an L3VPN.
  • Brian: you're suggesting using the ? for the keyid.  That's long lived, I suspect you want to use something different for key-rollover purposes
  • I agree
  • Mike: This may have big impacts on the way providers connect to large providers.
  • How?  [discussion cut where understanding didn't happen] Let's pursue it offline.
  • Randy: this is not the wg to decide if l3vpn folk need this or if it makes sense & the complex scenarios were because a frequent poster asked what if 300 times.  we have measured update unpacking impact in bgpsec.  it has effect, of course, but not extreme
  • I get concerned when we use lots of different things from previous tools.  Then we'll need to provide interworking documentation between this and BGPSEC, for example.
  • John S: You've been careful to talk about misconfiguration.  Is that the only threat you want to address, or are you concerned about deliberate attacks too?  This is vulnerable to replay.
  • Yes; that's noted in the draft as well.

 

Interjection due to time about something -- Rob

  • Huge change after a flat repo to hierarchical model.  Huge difference in sync time.

 

Relying party agreements -- Sandy

  • Randy: Does what John says today or tomorrow have effect on the signed paper?  And when he says the opposite next week?
  • John C: Basically, we do not want the ARIN RPA to be in the way of any reporting, summary, or statistical purposes, presuming that the incorporated data is not readily machine extractable in the resulting works.
  • Sandy: it would be easy to include and approve
  • Warren: I think what it says is that it should be easy but please ask us first.
  • Joel: I appreciate what John is saying, but John's messages aren't legally binding.
  • Sandy: I'd also be comfortable signing something that says the message here is that there is a potential for impact on the architecture if agreement doesn't come.
  • What's interesting: rpki consists of many CAs.  Once we have one trust anchor, it's not clear how this statement holds.
  • John Curran: We can increment the RPA accordingly.
  • Doug M: do you have any idea is simply a validation that requires.  is that validating the X.509 or using the output to validate ROAs.
  • Sandy presumes validating the information you receive
  • Doug: as an example if I publish a list of routes but didn't check the X.509 validation...
  • Sandy: the global trust anchor is still under consideration by NRO and ICANN
  • John Curran: This is precisely why it needs to be in writing, to make sure we are very clear on what is in scope vs what is not in scope of such republication restrictions.
  • Sandy: stay tuned