IPSECME Meeting Notes

Note taker: Jim Schaad. All remaining errors by Paul and Yaron.

Agenda bash & Status
- IPsec is due to change from MUST to SHOULD in the new IPv6 node requirements doc.
- Ongoing discussion on using CGA for authentication, some aspects relevant to IPsec.

HA Recap slides
   People should read McGrew's msec counter mode document, now in IETF last call. [Pointer sent separately on the list.]

   RFC5297 - Auth encryption mode is another solution (a different block encryption mode).

HA Solutions slides

   Steve Kent - cluster member taking over shoves done  and peer response as needed.
   Paul - Each SA is synced up independently.
   Steve - Should be indication of knowledge of more SAs to be synced later
   Raj - "More" flag indicates that more SAs are coming
   Paul - Nothing about what happens if "more coming" flag is not set
   Raj - peer can initiate for secondary SAs (?)
   Dave McGrew - How is a replay attack possible? [offline commentary: ESP replay is in principle equivalent to replication of packets by the network, and thus not an attack; however the draft attempts to preserve existing ESP security guarantees.]
   Raj - Lost IKE SA so counters can be reused - must fit in IKE window
   DM - ESP and AH - possible attack here?
   R - After failover happens can safely stop traffic
   DM - expand
   R - Active member syncs w/ standby at counter = 1700; active counter is now 1800 - after failover standby starts from 1700 (last sync), which looks like a replay attack to peer
   Yoav - Sync counters every 1000 must - peer can now resend message 1001 again after failover since does not know that it exists.
   DM - situation where cluster is not following RFC [each gateway is compliant, but the cluster as a whole is not].
   SK - cannot go 100% HA - could toss on floor or accept possibility of replay
   Yoav - could send one message to skip forward on all IPsec SAs, then peer just jumps forward and can do all SAs at the same time.
   R - different data traffic on SAs  - so possibly different traffic rates
   Paul - just not full IKEv2 - similar to concept of your other side might have forgotten something and you need to re-sync - not currently specified.
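
   The counter scenario discussed above (last sync at 1700, active member at 1800 when it fails) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the ReplayWindow class is a simplified sliding-window check, not the draft's mechanism, and the window size of 64 and the skip of 1000 are hypothetical values chosen for illustration.

```python
class ReplayWindow:
    """Simplified sliding-window anti-replay check (illustrative only)."""

    def __init__(self, size=64):
        self.size = size          # hypothetical window size
        self.highest = 0          # highest sequence number accepted so far
        self.seen = set()         # accepted numbers still inside the window

    def accept(self, seq):
        if seq > self.highest:
            # New highest number: slide the window forward.
            self.highest = seq
            self.seen.add(seq)
            self.seen = {s for s in self.seen if s > self.highest - self.size}
            return True
        if seq <= self.highest - self.size or seq in self.seen:
            # Too old, or already received: treated as a replay.
            return False
        self.seen.add(seq)
        return True


def simulate_failover(last_synced=1700, last_sent=1800, skip=1000):
    peer = ReplayWindow()
    # The active member sends 1..last_sent before it fails.
    for seq in range(1, last_sent + 1):
        peer.accept(seq)
    # The standby resumes from the last synced value, so it reuses numbers.
    dropped = sum(
        1 for seq in range(last_synced + 1, last_sent + 1)
        if not peer.accept(seq)
    )
    # Skipping forward (assumed large enough) avoids the reused range.
    accepted_after_skip = peer.accept(last_synced + skip + 1)
    return dropped, accepted_after_skip


if __name__ == "__main__":
    dropped, ok = simulate_failover()
    print("packets rejected as replays:", dropped)
    print("first packet after skip accepted:", ok)
```

   With these numbers, every packet the standby sends from the stale counter is rejected by the peer, while a packet sent after skipping forward is accepted.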

   Tero – there is a real replay attack remaining, rate limiting does not help at all - replay counters were set back anyway - peer should know better as window is not matching well. This is my counter - peer should send my feeling of counters – and then send back larger numbers - change in design.
   Yaron - Rate limiting does not solve problem - should solve at the transport and message level - not semantics of the payload – one option is to have a “failover counter” that's incremented on failover events.
   Dan - What is the protection against bad packets from an attacker – can this cause a resync
   R - Must be preceded by a failover event detection
   R - addressing rate limit w/ special Message ID 0
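
   Yaron's "failover counter" option could look roughly like the following sketch. Everything here is an assumption made for illustration: the PeerState type, the accept function, and the rule of ordering messages by the pair (failover counter, sequence number) are hypothetical and are not taken from the draft. The sketch also assumes messages are authenticated before this check and uses strict ordering rather than a window.

```python
from dataclasses import dataclass


@dataclass
class PeerState:
    failover_count: int = 0   # highest failover counter seen (hypothetical)
    highest_seq: int = 0      # highest sequence number within that epoch


def accept(state, failover_count, seq):
    """Accept a message only if (failover_count, seq) moves forward."""
    if failover_count < state.failover_count:
        # Sent before a failover the peer already knows about.
        return False
    if failover_count > state.failover_count:
        # A failover happened: start a fresh sequence space.
        state.failover_count = failover_count
        state.highest_seq = seq
        return True
    if seq <= state.highest_seq:
        # Same epoch, old or repeated number.
        return False
    state.highest_seq = seq
    return True
```

   The cost Raj raises later in the discussion is visible here: both the cluster and the peer have to keep the extra failover_count state.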

   Tero - Protocol should be split into two parts - IKE SA and IPsec SA counter syncs - different ideas of how they should be working.
   Problem IKE SA syncing - assume we missed some message - just need to sync counters - need to look at what those messages are really doing – need all changes to active SAs. Need text to say what happens for each different type of message. Easier to sync the SA counters when they occur.
   Paul - If two parts of cluster are not synced perfectly - the wrong picture on the failover server coming up. Risk is high and better to do a full re-key when states are wrong.
   Tero - don't need message id 0, just a normal IKE message and continue.
   R - Not possible to send a normal IKE message because of out of window
   T - Want to kill IKE SA if out of window. If two are out of sync - fall-back comes up - normal message ids - everything but a couple are out of sync - those few can be re-built from scratch.
   Yaron - quantitative question - DPD wants to happen a lot in some cases - IKE brittleness [dependence on exact counter sync] will be a problem and go out of sync - could be a high percentage of SAs out of sync. If only for real traffic that causes a sync failure - then maybe don't need all of this work.
   Paul - ? - cluster member knows what type of syncing is occurring – if Tero style SA [few DPDs] then don't need to do this type of work. Based on the type of message then this is needed?
   Yaron - Yes -
   Paul - could make this a runtime situation decision.
   Tero - Why would you want to send lots of DPD messages? - If no reason to be doing this then changes the issue for this document.
   Paul - Not relevant - cluster member has lots of things to look out for.
   Tero - Protocol is useful if IKE messages much more frequently than sync messages. Otherwise this is not useful.
   Paul - Cluster member should have a feeling [re: traffic rates and composition].
   Tero - no just the IKE part is not needed.
   Paul - missed that - in splitting the IKE part can be removed
   Yaron - DPD issue is not a morality question - protocol allows for them, they will be aggressively used and thus we need to deal with this issue.
   Paul – aggressive DPD seen “in the wild” in existing products
   R - different peers can be sending DPDs as well - including clients - this is a cluster - other messages that occur - could miss a rekey earlier than needed?
   Tero - no that does not help -
   R - arguing why IKEv2 part is needed - enables DPD - childless SA  - rekey may occur earlier and then try

   Paul - sense of the room -
   who read it - 10 people
   Splitting request - how many people think this makes sense - more people think splitting makes sense. Let's hold on the kill of IKEv2 until after the split discussion is finished.

   Tero - Does not solve all security problems - IKE part - sends biggest message id sent - forgets everything in the current window - sends biggest replay to date.
   R - protocol does not say ignore all messages
   T - doc says forget all messages
   R - allows for rekey questions
   Yaron – security should not depend on higher level semantics - need to not allow for replay of messages - could make it a multiple (3 or 4) message exchange

   T - does not clarify what is going on - Peer thinks an on-going is processing and could get the replay prior to an SA response coming back (via different and slower routing)
   P - Please send to list - assume split occurs and send multiple sets.
   Yaron - separate security infrastructure
   Brian Weis - Solving replay problem is extremely important - w/ nonces will be requirement - how much do we save over the rekey if we go to multiple messages? - different between child SAs and IKE
   Yaron - if splitting then the multiple message game is just for the IKE SA part
   T - IPsec SA syncing part. - why we need to the maximum.  - goes away if message id 0 goes away - currently document says - when failure sends out counters - says this is the biggest counter I have seen to-date - Peer sends back to and may have a lower counter and thus replay back to the peer.
   R - not to have a crash counter because one more state is required to have and must be maintained by the peer as well as the cluster.
   T - solving replay is more important than making the optimization in the peer.
   P - design team has over optimized - may need to be removed
   T - Text saying counters are incremented [should say, “incremented by a large number”]
   P - wording issues
   T - if not large enough number incremented - then peer knows it is higher - thus a replay window has been opened.
   P - Section 8 needs to be re-written. Was surprised by some of the text in section 8 on first read
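
   One reading of Tero's point about incrementing "by a large number" is that the jump applied after failover has to cover every counter value the failed member may have used since the last sync. A rough sketch of that arithmetic follows; the function names and the idea of bounding unsynced messages by the sync interval are assumptions for illustration, not text from the draft.

```python
def resume_counter(last_synced, max_unsynced, margin=0):
    """Counter value to resume from after failover.

    max_unsynced is an assumed upper bound on how many messages the failed
    member could have sent since the last sync (e.g. the sync interval).
    """
    return last_synced + max_unsynced + margin


def replay_exposure(last_synced, actually_sent, jump):
    """Number of counter values that would be reused if the jump is too small."""
    resumed = last_synced + jump
    return max(0, actually_sent - resumed)


if __name__ == "__main__":
    # Last sync at 1700, failed member had reached 1800.
    print(replay_exposure(1700, 1800, jump=50))    # jump too small
    print(replay_exposure(1700, 1800, jump=1000))  # jump covers the gap
    print(resume_counter(1700, max_unsynced=1000))
```

   Any non-zero exposure corresponds to the replay window Tero describes: values the peer has already seen would be presented again as if they were new.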

   Yaron - questions -
       In favor of making it a working group draft - some hums
       no one against

Failure Detection Decision Process

   Raj - If we can detect the crash and do retransmissions - most of the job is in the gateways - crash detection can be important - offloads some work to the client or peer - helps load balancing w/ loose clusters – see applicability of this.
   Paul - do you see this valuable for which end of the cluster
   R - see value in both sides of the connection
   T - quick crash recovery is desired - but good implementations can do this - don't need protocol support for this - QCD does offer something - but SIR offers nothing.  Don't agree same MITM attack level in SIR as in normal IKE. Looking at slide 4 - none of these proposals addresses the second bullet item. Bob does not know anything about Alice anymore.

   Yoav - QCD and SIR have been in draft form for 3 years. The idea of “birth certificates” was mentioned ten years ago on the list but never materialized as a document. Crashing on clusters is not as big of an issue. Could be an issue for single gateways however.
   R - Different types of clusters - load balance can be an issue as well.

   Yoav - w/ IKEv1 we could solve this in the protocol - set up the delete message and stored it on disk. Could then pull up the message and send the delete messages when things looked weird.
   Would be needed because people are trying to do it.
   Basil - QCD would be better
   Yaron - does this imply it is needed
   Basil - yes I think so - cannot be addressed simply -

   Paul - short round on the list - but crash detection effort will probably die w/o much support showing up.


SEED in IPsec [short presentation on the new document].

Labeled IPsec - Jarrett Lu - new document will now be published.

        Paul - Please let us know at the start and the finish of this type of work so that we can do some review - but the discussion is not on the mailing list.

PAKE - Sean Turner
     Will go forward with PAKE as IETF stream Experimental [possibly multiple different solutions].

     Will push as it comes up
     Dan - willing to start screaming bloody murder now [over the decision to drop PAKE from the working group]
     This working group is generically quiet - three different reactions to different documents
     EAP only got no comments - and is moving forward
     Liveness [crash detection] - concerted decision to cheer-lead
     PAKE - lack of reaction except from author - caused work item to be eliminated
     Three outcomes from basically same response of the working group
     Paul - did cheer-lead for PAKE as well.  will end up with similar results - correct on pushing some with too little discussion out the door – now considered a mistake.
     Sean - some may be naivety on my part - newness on the IESG.
     Paul - think PAKE is much bigger than the EAP mutual
     Tero - biggest PAKE problem is that most people not able to select between the different docs - need to have a single document to choose. Force the authors to pick.
     Paul - good response [by the authors] on the list - best response except for asking some to die.
     T - see if there is now a common single document that the authors can agree on. Think authors are currently the best ones to do the selection
     Paul - will see if this can work
     Yaron - think if one proposal can have more support?
     Paul - Question if have a single PAKE - will you spend more time on this document?  3 new people.
     Radia - Do we understand the IPR situation? Suspect that is some of the disinterest in this topic.
     Dave - tend to use strong secrets for authentication [in storage applications] - understand that may be stronger derived from weaker - will look at a single draft.
     Radia - Might be a good idea given the EKE patent might expire soon - that this might be the best way.
     Yaron - what if the authors say it is good [as far as IPR]
     Radia - not good enough unless some type of legal protection involved.

     Yoav - failure of the working group system if the WG cannot decide between different proposals. WG is no longer bringing value into the process.
     Paul - Does not mean anything on the experimental vs standards tagging.

     Imachin (?) - PAKE author - Think it might be better for the authors to present proposals at the next IETF.
     Paul - tired of wasting time and we have done that in the past and it did not seem to move forward.

     Yoav - how do we avoid this situation for the QCD vs SIR vs the virtual birth certs ???