IFARE BoF minutes
Wed March 21, 2007
Scribe: Paul Hoffman, taken from the MP3

Agenda: requirements for the WG
	Started with the requirements doc, went elsewhere
Non-agenda: protocols
	Those will be discussed at a future meeting

Overview of requirements
	Question: is the relationship with the MIP cloud appropriate? Yes
	How we got here
	Operators didn't want to use IPsec
		Takes too long to set up the IPsec SA after reboot or network failure
		Wanted different home agents after network failure
		Didn't want static keys: too much overhead for sharing everywhere
	We want a better security solution than what MIPv6 has chosen
	This could be used in other places; MIPv6 is just the trigger point
	Had a dinner in San Diego with IPsec folks
		More use of IPsec if there was an interoperable failover protocol
		Got us going on the problem statement

Charter discussion
	Question: not clear on the problem that needs to be solved
	Feeling in the room that we should be talking about the problem statement first

Problem statement document
	Yaron Sheffer started, lots of questions ensued
	Many people participated in the document
	Main issue: re-establishing IPsec SAs for peers who had one established earlier
		Could be with same gateway or different gateway
		Reasons for needing to re-establish
			Network failure
			Gateway goes away but comes back
			Gateway goes away and doesn't come back
			MIP application gateway failing
		Re-establishment expensive
			Diffie-Hellman and other expensive operations
			Lots of messages back and forth for EAP
			User might be required to re-authenticate
		MIP is the main motivation, but also normal IPsec VPN uses
			Particularly for MIP home agent
			Need to fail-over the MIP signalling and the IPsec protecting it
		Proprietary IPsec failover solutions exist
		Proprietary MIP failover solutions exist
		Mostly talking tunnel mode
	Question about number of clients to a typical gateway
		Tens of thousands
		But hardware crypto acceleration can get rid of all crypto speed issues
		We can take crypto speed off the table: most minor of the issues
	If MIP can do all assignment in one round trip, but IKE takes eight more, that's unacceptable
	It takes minutes for the client to see that the gateway is dead
		Therefore adding a bunch of round trips isn't so bad
	Issue of all clients trying to re-establish at the same time
	There is an alternate solution that is not secure
		That might be enough reason to do this securely
	If the client powers off and on, and goes to a new HA, it needs to re-establish
	Question: is failing over a client after reboot consistent with RFC 4301?
	Question: in MIP6, if I fail over to a new HA, don't I need a new SA?
		Not if the HAs are in the same security domain
	Might need to renegotiate Phase 2 and/or EAP, might not
	At the end, Sam needs a trust model that he can evaluate
		Particularly true if you can fail over to a different gateway
	Many IPsec vendors have proprietary solutions in this space
		There may be IPR issues; we must look at that
		Already allow for client-transparent failovers of IPsec
	If the MIP client needs to move to a new home agent during failover, it cannot be transparent
		The SA needs to be tied to the home agent when moving
		If we make the solution transparent-only, it will limit this too much
	Request to have the client failing-over be out of scope
	Need to differentiate between clusters of servers and geographically separated servers
	The trust model is that the failed-to gateway gets a secure assurance from the failed-from gateway
	If the failover is transparent to the client, there is no work to be done in the BoF
		Need to be clearer if transparency is needed / prohibited / allowed
	Need to be clear on which resources would be optimized by the requirements
	Waking up the client after a failure is a big problem.
		MIP6 WG has a failover draft now
		Not a problem in MIPv4 because they don't use IPsec
		Therefore the failover should not involve the client
	Why is this an interesting problem to solve given the perceived length of time for failure detection?
	The dormancy ratio is around 90%
		Important to what we are trying to solve here
	3G is using IPsec
		GAN
		Wireless interworking
	Need to size the network for failover because of quick startup
		Accelerator cards are the cheapest part of a large network box
	Goal of geographic distribution of gateways
		May have different internal and external IP addresses
		May have different routes
	Goal of low-latency failover
	Need to support application use
		Might need to carry application data in state blobs
	Need to have interop between client and gateway
		Open issue is how much interop between gateways
			Nothing, just SAs, full interop
	Want to support stateless and stateful failover
		Stateless: all state is maintained on the client
		Stateful: state is kept in the network without the client having to remember
		Not meant to be transparent to the client
		Transparency is not a requirement, but it might become a possibility
	Need to support both transport mode and tunnel mode in IPsec
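
Illustration of the stateless / stateful distinction above (not presented at the BoF): a minimal Python sketch, assuming a hypothetical key shared by gateways in one security domain. The names DOMAIN_KEY, seal_state, open_state and NetworkStore are invented for this sketch and come from no draft. The blob is only integrity-protected here; a real design would also need confidentiality and replay protection.

```python
import hashlib
import hmac
import json

# Hypothetical key shared by every gateway in one security domain (assumption).
DOMAIN_KEY = b"example-domain-key"


def seal_state(state):
    """Stateless model: the gateway hands the client an integrity-protected blob."""
    body = json.dumps(state, sort_keys=True).encode()
    tag = hmac.new(DOMAIN_KEY, body, hashlib.sha256).hexdigest().encode()
    return body + b"." + tag


def open_state(blob):
    """Any gateway holding DOMAIN_KEY can verify and restore the client's blob."""
    body, _, tag = blob.rpartition(b".")
    expected = hmac.new(DOMAIN_KEY, body, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("state blob failed integrity check")
    return json.loads(body)


class NetworkStore:
    """Stateful model: the network keeps the state, keyed by client identity."""

    def __init__(self):
        self._sessions = {}

    def save(self, client_id, state):
        self._sessions[client_id] = dict(state)

    def resume(self, client_id):
        return dict(self._sessions[client_id])
```

In the stateless case the client presents the blob to whichever gateway it reaches; in the stateful case the new gateway looks the client up in the shared store, so the client has nothing to remember.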

Open issue discussion
	Client involvement or noninvolvement
		Many people have solutions with non-involvement today
		Non-involvement prevents geographically distributed gateways
		For MIP, there are some solutions with standbys that are geographically distributed
	Must the HA be on the same IP?
		For MIP and MIPv6, the HA must have the same external IP address
	Transparent to the client or not
		Proposal to not address transparent because there are lots of proprietary solutions already
		Proposal to focus on distributed site with different IP address scenario
			Prohibitive to use gateway-to-gateway syncing across large areas
			Work on client-to-gateway interop first, then later do gateway-to-gateway
	Client involvement is always required even if user involvement is not
		Without client involvement, cannot keep all of IPsec SA state such as counters
			Unless the two gateways are next to each other talking quickly all the time.
			Probably can never interoperate because different gateway vendors do different things inside
		One speaker doesn't think it is the gateways that will tell the clients that they rebooted
			Always the client who will notice the gateway is gone or has rebooted
			Clients will be spread out in time as they try to send things to the gateway
	Is there a need for perfect forward secrecy? Can we give it up?
		It could be a goal but needs to be balanced against other goals
	Proposal that the "machine goes down" scenario for local failover is a non-issue
	Proposal that we can't move enough state back and forth to be secure
	Question about MIPv6 with geographic redundancy
		Is there high availability behind them?
		Proposal that it is better to be able to fail over to another active gateway than a passive one
	Need to decide which state we are failing over: Phase 1, Phase 2, application
	Need to have interop between different servers to different clients
	Failover has to happen much faster than five minutes
	Proposal: not needing accelerator cards reduces cost of boxes used as these gateways
	Proposal: only interesting when trying to maintain real-time communication
	Proposal: geographically distributed gateways with rapid reconnect needing client/server interop
		Do the gateways need to be able to be from different vendors?
	Problem: relation to MOBIKE
		Want to have it work with MOBIKE if moving at the same time as failure
		But MOBIKE only works with tunnel mode now
	Gregory Lebovitz put up a slide to help
		Fixing just the left side failure is solved today in non-interoperable ways
		Only focusing on being able to fail to a different gateway site
	Proposal: also being able to fail over to original gateway coming back up
	Question: is there a fast link between gateways? The design is different if there is not than if there is
	Question: is this for gateways or for home agents?
		The motivation is a HA that supports IKEv2
		In MIPv6, IPsec only protects the control traffic, not the data
		Solution should be able to cover data traffic as well
	How the solution interacts with the application is out of scope
		We will provide guidance for applications that can use this
		Sam says that for MIPv6 this approach needs closely-related triggers for signalling
		Disagreement from an implementor

Consensus calls
	Led by Sam
	People want to work on cross-gateway-vendor solutions
	There was disagreement about whether or not inter-vendor could work
	Need to be clearer about whether this is gateway-client interop or gateway-gateway interop
	A few people say that if the client is involved, gateway-to-gateway interop is possible
	RFC 4301 has a minimum set of things that all vendors need to do
		Therefore we might be able to move them around
	People want to work on cross-gateway-vendor solutions that preserve IPsec SA state
	Question: what value are we adding by fast failover?
		This is getting faster over time, so picking a number now is wrong
		Some apps require faster recovery than others, regardless of whether they run under IPsec
		Request: don't make IPsec the stumbling block
	We need to have MIPv6 failover people in the discussion
	We need to make the scope clearer
	List will discuss a statement about in-scope, and a statement of out-of-scope
	Need to get the MIPv6 WG to agree that our work is interesting and that they are willing to work with it.
		If not, we need to restart from a different charter.
	We need to state the use-cases better to get more focus
	We will have a second BoF in Chicago regardless