2.7.3 Interim Meeting - Site Multihoming by IPv6 Intermediation (shim6)

NOTE: This charter is a snapshot of the 64th IETF Meeting in Vancouver, British Columbia, Canada. It may now be out-of-date.

Last Modified: 2005-10-03

Chair(s):

Kurt Lindqvist <kurtis@kurtis.pp.se>
Geoff Huston <gih@apnic.net>

Internet Area Director(s):

Mark Townsley <townsley@cisco.com>
Margaret Wasserman <margaret@thingmagic.com>

Internet Area Advisor:

Margaret Wasserman <margaret@thingmagic.com>

Technical Advisor(s):

Thomas Narten <narten@us.ibm.com>
Dave Meyer <dmm@1-4-5.net>

Mailing Lists:

General Discussion: shim6@psg.com
To Subscribe: shim6-request@psg.com
Archive: http://ops.ietf.org/lists/shim6/

Description of Working Group:

For the purposes of redundancy, load sharing, operational policy or
cost, a site may be multi-homed, with the site's network having
connections to multiple IP service providers. The current Internet
routing infrastructure permits multi-homing using provider independent
addressing, and adapts to changes in the availability of these
connections. However if the site uses multiple provider-assigned
address prefixes for every host within the site, host application
associations cannot use alternate paths, such as for surviving the
changes or for creating new associations, when one or more of the
site's address prefixes becomes unreachable. This working group will produce
specifications for an IPv6-based site multi-homing solution that
inserts a new sub-layer (shim) into the IP stack of end-system hosts.
It will enable hosts on multi-homed sites to use a set of provider-
assigned IP address prefixes and switch between them without upsetting
transport protocols or applications.

The work will be based on the architecture developed by the IETF multi6
working group. The shim6 working group is to complete the required
protocol developments and the architecture and security analysis of the
required protocols.

Requirements for the solution are:

o The approach must handle re-homing both existing communication and
being able to establish new communication when one or more of the
addresses is unreachable.

o IPv6 NAT devices are assumed not to exist, or not to present an
obstacle about which the shim6 solution needs to be concerned.

o Only IPv6 is considered.

o Changes in the addresses that are used below the shim will be
invisible to the upper layers, which will see a fixed address (termed
Upper Layer Identifier or ULID).

o ULIDs will be actual IP addresses, permitting existing applications
to continue to work unchanged, and permitting application referrals to
work, as long as the IP Addresses are available.

o The solution should assume ingress filtering may be applied at
network boundaries.

o The solution must allow the global routing system to scale even if
there is a very large number of multi-homed sites. This implies that
re-homing not be visible to the routing system.

o Compatibility will remain for existing mobility mechanisms. It will
be possible to use Mobile IPv6 on a node that also supports Shim6.
However, any optimizations or advanced configurations are out of
scope for shim6.

o The approach is to provide an optimized way to handle a static set of
addresses, while also providing a way to securely handle dynamic
changes in the set of addresses. The dynamic changes might be useful
for future combinations of multi-homing and IP mobility, but the
working group will not take on such mobility capabilities directly.

o The specifications must specifically refer to all applicable threats
and describe how they are handled, with the requirement being that the
resulting solution not introduce any threats that make the security any
less than in today's Internet.

The background documents to be considered by the WG include:

RFC 3582
draft-ietf-multi6-architecture-04.txt
draft-ietf-multi6-things-to-think-about-01.txt
draft-ietf-multi6-multihoming-threats-03.txt

The input documents that the WG will use as the basis for its design
are:

draft-huston-l3shim-arch-00.txt
draft-ietf-multi6-functional-dec-00.txt
draft-ietf-multi6-l3shim-00.txt
draft-ietf-multi6-failure-detection-00.txt
draft-ietf-multi6-hba-00.txt
draft-ietf-multi6-app-refer-00.txt

In addition to the network layer shim solution, the shim6 WG is
specifically chartered to work on:

o Solutions for site exit router selection that work when each ISP
uses ingress filtering, i.e. when the chosen site exit needs to
be related to the source address chosen by the host. This site
exit router selection and the associated address selection
process should work whether or not the peer site supports
the shim6 protocol.

o Solutions to establish new communications after an outage has
occurred that do not require shim support from the
non-multihomed end of the communication. The Working Group will
explore whether such solutions are also useful when both ends
support the shim.

o The possible impact of the use of multiple locators at both ends
on congestion control, traffic engineering, and QoS will be analysed
in conjunction with the Transport Area.

o The relationships between Upper Layer Identifiers (ULIDs)
and unique local addresses.

o ICMP error demuxing for locator failure discovery.

o If necessary, develop and specify formats and structure for:

- Cryptographically protected locators

- Carrying the flow label across the shim layer
defined in the multi6 architecture.

The shim6 WG is to publish, as standards track RFCs, specifications
with enough details to allow fully interoperable implementations.

Goals and Milestones:

Done		First draft of architectural document
Done		First draft of protocol document
Done		First draft on cryptographic locators, if required
Done		First draft on multi-homing triggers description
Done		First draft on applicability statement document
Oct 2005		WG last-call on architectural document
Oct 2005		WG last-call on applicability statement document
Feb 2006		WG last-call on protocol document
Feb 2006		WG last-call on cryptographic locators, if required
Feb 2006		Submit completed architectural document to IESG
Feb 2006		Submit applicability statement document to IESG
Apr 2006		WG last-call on multihoming triggers description
Apr 2006		Submit document on cryptographic locators to the IESG, if required
Apr 2006		Submit protocol document to the IESG
Jun 2006		Submit draft on multihoming triggers description to the IESG

Internet-Drafts:

draft-ietf-shim6-arch-00.txt

draft-ietf-shim6-app-refer-00.txt

draft-ietf-shim6-l3shim-00.txt

draft-ietf-shim6-failure-detection-02.txt

draft-ietf-shim6-applicability-00.txt

draft-ietf-shim6-hba-01.txt

draft-ietf-shim6-reach-detect-01.txt

draft-ietf-shim6-functional-dec-00.txt

draft-ietf-shim6-proto-02.txt

No Request For Comments

Interim Meeting Report

Minutes - SHIM6 Interim Working Group Meeting

SHIM6 Interim Working Group Meeting

Hotel Krasnapolsky, Amsterdam, The Netherlands
8th - 9th October 2005

The meeting logistics were generously supported by the RIPE NCC, and the SHIM6 co-chairs thank the RIPE NCC for their support.

In accordance with an announcement to the SHIM6 Working Group Mailing List, an interim meeting of the SHIM6 Working Group was held at the Hotel Krasnapolsky on the 8th and 9th October, 2005.

Agenda

The agenda for the meeting was as follows:

Review of current status
Presentation by lead authors on working documents:
Issue identification
Functional decomposition
Next steps (deliverables for Vancouver)
AOB

Participants

Minute Takers

Illitsch van Beijnum, Spencer Dawkins and Geoff Huston took notes of the meeting. The minutes were assembled by Geoff Huston.

1. Review of Current Status

Protocol Specification
draft-ietf-shim6-proto-00

This is intended to be the core specification document for SHIM6. draft-ietf-shim6-l3shim-00.txt will not be further revised, and the introductory text will be moved to the shim6 architecture document. Section 18 ("Design Alternatives") of the -00 protocol draft will be placed into an appendix to the document. It is undecided at this point whether to keep this appendix in the final version of the WG protocol specification document, or whether to publish the appendix as a separate informational document at the same time as the protocol specification document. The l3shim-00 document is effectively replaced by this document.

The lead author of this document, Erik Nordmark, has requested some assistance with the message diagrams and associated protocol interaction descriptions.

Functional Decomposition
draft-ietf-shim6-functional-dec-00

The questions about this document concerned the specific purpose of maintaining it as a standalone document, and whether parts of it should be folded into the protocol specification and architecture documents. The current version of the document concentrates on consideration of various design alternatives. At this stage it is proposed that the documentation of design alternatives and specific design decisions taken within the SHIM6 specification be included in the design alternatives appendix of the protocol specification document, and that material related to the architectural description be folded into the architecture draft.

Hash Based Addresses (HBA)
draft-ietf-shim6-hba-00

This document is ready for WG Last Call, and is used by the protocol specification (which is based on HBA). The HBA draft describes how the hash algorithm works, and it was noted that the WG can Last Call the current draft and then bundle it with the rest of the SHIM6 output for IESG review and IETF Last Call. It was proposed to consult the ADs regarding the cross-area review needs of this document, specifically including security community review.

Action: Chairs to perform a Working Group Last Call on the HBA Draft

Action: Chairs to refer the document to the ADs for cross-area review, with specific request for security review.

Ingress Filtering
draft-huitema-shim6-ingress-filtering-00

It was commented that ingress filtering is operational practice (BCP), not a particular protocol standard.

RFC2827 "Network Ingress Filtering: Defeating Denial of Service Attacks which employ IP Source Address Spoofing"

There are many ways that the issue of potential ingress filter-based packet drop based on a source address filter match at the site boundary interface could be addressed. It is also noted that the issue here is not entirely limited to packet discard through ingress filter, but that selection of a specific source address by the host may be used as a mechanism for a host to select a specific site egress path where there are multiple egress paths offering equivalent or overlapping destination reachability. The question of whether source address selection can be used for egress path selection is an open one, and the mechanisms proposed in the draft are not in common use at present.
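The idea that a host's choice of source address could determine the site egress path can be illustrated with a minimal Python sketch. It is not taken from the draft: the function name, the prefix-to-exit table and the router labels are hypothetical, and the longest-prefix rule is only one of several ways a site could map source addresses to exits.

```python
from ipaddress import IPv6Address, IPv6Network
from typing import Dict, Optional


def select_site_exit(source: str, exits: Dict[str, str]) -> Optional[str]:
    """Pick the exit router whose provider prefix covers the source address.

    `exits` maps a provider-assigned prefix to a (hypothetical) exit router
    name. Returns None when no prefix matches, in which case the packet
    would risk being dropped by an ingress filter at any exit.
    """
    addr = IPv6Address(source)
    best_router = None
    best_len = -1
    for prefix, router in exits.items():
        net = IPv6Network(prefix)
        # Prefer the most specific matching prefix.
        if addr in net and net.prefixlen > best_len:
            best_router, best_len = router, net.prefixlen
    return best_router


if __name__ == "__main__":
    # Documentation prefixes (2001:db8::/32) stand in for provider prefixes.
    table = {"2001:db8:a::/48": "exit-A", "2001:db8:b::/48": "exit-B"}
    print(select_site_exit("2001:db8:b::10", table))
```

A site with full BGP at a single edge router would not need such a table, since every source prefix leaves through the same router.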

The question considered was whether the draft contained material that was critical for the protocol draft. It was noted that the decision to use source address selection as the signalling mechanism between the host and the site's packet forwarding framework is not one that the WG has made. If it is a precondition for SHIM6 that source addresses matter, the WG needs to sanity-check this decision very soon, because the impact on routing and forwarding systems within sites is very significant. It was also noted that a site-internal source-address forwarding mechanism is not required in all cases (if you have full BGP at a single site edge router, for example). Multihoming may not be comprehensive in terms of traffic "survivability": for example, active SHIM6 traffic may fail over but default-routed non-SHIM6 traffic would not. If detection and repair aren't unidirectional, the "other end" gets hints that things need to change, but the new source address doesn't "steer" the other end's traffic to take a particular incoming path. While it was considered that these concerns are relevant to SHIM6, they can be considered orthogonal to the protocol specification.

The question of adoption of this draft as a WG document was considered. It was noted that there are a lot of good ideas discussed in this draft, but it is unclear whether this document needs to be part of the SHIM6 document collection. The central SHIM6 concern is that of hosts selecting among site exit paths, rather than the general ingress filtering problem. One conventional approach would be to consider and possibly document the requirements first, and then solicit proposals that meet the requirements. It was also noted that the SHIM6 approach is to propose a minimal set of changes here. It was felt that more information would assist the WG in considering whether to adopt this document as a WG document.

Action: Marcelo Bagnulo to draft a matrix of scenarios and solutions relating to ingress filtering and site exit path selection, intended for presentation at Vancouver.

Address Selection
draft-bagnulo-shim6-addr-selection-00

This draft notes shortcomings in RFC3484 in terms of trying all source/destination combinations, not just all destinations, in a multi-addressed local context, irrespective of local SHIM6 capability. The context in SHIM6 is the proposed operation of initial contact, where the current specification refers to RFC3484. It was considered that if the major difference between the current version of RFC3484 and this draft is the consideration of source address ordering, as distinct from just source address selection, then this would be a relatively small textual amendment to RFC3484. Implementing this change may be challenging, however. It was noted that this proposed change to RFC3484 would address the SHIM6 charter item that refers to "Solutions to establish new communications after an outage has occurred that do not require shim support from the non-multihomed end of the communication." The discussion in this draft will be used to support the case for proposing changes to RFC3484, but it was considered that as this was the output of the IPv6 WG, further changes to this specification should be considered by the IPv6 WG. Determination of this question was considered to be a decision for the ADs.

Design Note: The specification for initial contact will use RFC3484, as modified by this draft in terms of source address ordering.

Action: Marcelo Bagnulo to draft a note outlining a set of specific text changes to RFC 3484 to support the approach proposed in the address selection draft.

Action: Chairs to pass this note and the address selection draft to the ADs, with the recommendation that changes to RFC3484 be considered by a WG in the Internet Area.

Application Referral
draft-ietf-shim6-app-refer-00

It was noted that this draft is not on the critical path in terms of completing the initial core protocol specification. It was noted that there are a large number of issues with application referrals with Site-Local addresses, and this has the potential for similar characteristics. The draft notes that if a FQDN is not being used as the referral object, the options include passing a ULID as the referral object or passing the entire locator set as the referral object. The locator set object could be useful to assist in terms of initial contact, but this has some API implications for upper level protocols. The intent with the locator set was to use an object whose semantics applications are not expected to understand.

Applicability
draft-ietf-shim6-applicability-00

The applicability document is currently a placeholder document, which will be revised once the core protocol specification has been stabilised.

Failure Detection and Reachability Exploration
draft-ietf-shim6-failure-detection-00
draft-ietf-shim6-reach-detect-00

As the topics of failure and reachability are very closely related, the immediate step proposed is to merge these two drafts into a single document that describes the state conditions where a failure condition will be triggered, and also describes the exploration procedure that attempts to recover connectivity through a structured search of the locator pair space to discover locator pairs that offer host-to-host reachability. The decision whether to further fold this combined failure detection and reachability exploration specification into the core SHIM6 protocol document will be considered at the SHIM6 WG meeting at IETF-64 (November 2005).

It was noted that there is reference to IPv4 and RFC1918/NAT conditions in the current document, and it was proposed that this text be removed from the SHIM6 documents on the basis that the WG has no charter beyond specification of IPv6 mechanisms. However it was noted that the HIP WG and HIP RG may find the extension of this approach to IPv4 and RFC1918/NAT contexts useful in terms of avoiding reinvention in this area.
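The structured search of the locator pair space described above can be illustrated with a minimal Python sketch. It is not taken from either draft: the function name, the probe callback and the plain Cartesian search order are assumptions made for illustration, and a real exploration procedure would also need timers and a probe message exchange.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple

LocatorPair = Tuple[str, str]  # (local locator, remote locator)


def explore_locator_pairs(
    local_locators: Iterable[str],
    remote_locators: Iterable[str],
    probe: Callable[[str, str], bool],
    failed_pair: Optional[LocatorPair] = None,
) -> Optional[LocatorPair]:
    """Return the first (local, remote) pair for which probe() succeeds.

    `probe` is a hypothetical reachability test supplied by the caller.
    `failed_pair` is the pair that triggered the failure condition and is
    skipped. Returns None if no pair offers reachability.
    """
    for pair in product(local_locators, remote_locators):
        if pair == failed_pair:
            continue
        if probe(*pair):
            return pair
    return None


if __name__ == "__main__":
    # Hypothetical reachability oracle: exactly one pair still works.
    working = {("2001:db8:b::1", "2001:db8:2::9")}
    found = explore_locator_pairs(
        ["2001:db8:a::1", "2001:db8:b::1"],
        ["2001:db8:1::9", "2001:db8:2::9"],
        probe=lambda src, dst: (src, dst) in working,
        failed_pair=("2001:db8:a::1", "2001:db8:1::9"),
    )
    print(found)
```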

L3-SHIM
draft-ietf-shim6-l3shim-00

As this document is effectively replaced by the shim6-proto document, it will not be further revised by the WG.

Architecture
draft-ietf-shim6-arch-00

As discussed in the previous WG meeting in August 2005, this document will be revised once the core specification is stabilized.

Upper level Protocol API

A document describing the forms of interaction with upper layer protocols has yet to be drafted. Pekka Nikander has sent some thoughts to the mailing list on this topic, but these have not been submitted as a draft as yet. A January / February 2006 submission time is anticipated. The perspective proposed is to consider this from a SHIM6-centric perspective ("what signalling from the upper protocol layers would SHIM6 consider a helpful indication?"). The larger topic of locator pair selection as an indirect method of host-based traffic engineering and performance / service quality selection would be held over for now, and the initial API effort would concentrate on an API that supports signalling to a core SHIM6 specification. The WG considered that when a richer API is being considered by the SHIM6 WG, some assistance from TSV would be of significant benefit to the consideration of this topic.

2. Presentations by Lead Authors on Working Documents
3. Issue Identification
4. Functional Decomposition

These three items were considered jointly by the WG.

Protocol Specification

Erik Nordmark

The specification has made a number of arbitrary design choices based on alternatives described in the prior l3shim document in order to make progress.
The approach of placing the context tag in the flow label field was considered - the MTU problem isn't really a big problem, so why are we hacking in the flow label to solve it?
Do uncoordinated state removal with error message when peer removed/lost state is detected.
Use of Flow Label and modified Protocol numbers as an encoded SHIM6 header in the IPv6 packet header. New protocol number for control protocol (similar to ICMP), overloading of port number semantics for TCP, UDP, etc. Need to tell receiver "this packet needs special handling". Other protocols are also reusing flow label (NSIS, etc.).
Related topic - ordering of processing for multiple items in headers.
Is this a candidate for optimization (after we get the base protocol working)? If most conversations don't fail, we usually don't need this optimization anyway.
We're concerned about possible "flow label collisions" between protocol mechanisms that are trying to reuse the same header field.
We're concerned about bouncing back and forth between IP code and TCP code in some implementations. Why are we trying to avoid using a small number (8) of packet header bytes? Upper layer protocols already have to deal with changes in MTU size due to routing changes, etc.
The architectural intent for IPv6 is to use a specific header for such signalling between endpoints. Can we optimize later? Would need to be a stronger case to optimize later. If we think people are going to use SHIM6 for something like traffic engineering, we'll do this more often than just in failure modes. Are we still using MPLS for traffic engineering, too? If so, we would have an end-to-end signaling mechanism to negotiate stuff like this anyway. Draft doesn't consider exhaustion of flow labels, using multiple flow labels, etc. which would add a lot more complexity. Could have an extension protocol that says "these upper-layer identifiers are being SHIM6ed", so packets don't have to carry any information at all - there's a whole capability negotiation mechanism that would be useful. Indications of "willingness to SHIM6" are unidirectional, so don't have to coordinate between endpoints. Can we start out with an explicit SHIM6 header and return to this later? Is this research now anyway? We could use the approach of an experimental extension protocol to deal with a number of things that we don't understand now.
Sense of the room is to use the explicit headers in the base protocol specification and go on. This also allows us to use a larger/better context identifier (we had picked 20 bits because it fits in the flow label, in the previous draft). It also means that SHIM6 control packets and data packets will tend to pass or be blocked by firewalls together (instead of passing control protocol packets and then failing data packets).

Design Note: Use an 8 byte IP SHIM6 header in the base protocol specification for packets that require specific SHIM6 processing by the receiver, and allow optimizations on this, including that of a zero-length header, to be an experimental protocol extension.

Do we need a checksum in the control packets? Certainly not in every SHIM6 packet, but checksums are more valuable in the control packets.

Design Note: Use a header checksum on the SHIM6 control packets but not in the SHIM6 packet header.

How "unique" are context tags? We are thinking about context tags as a way of resisting injection attacks, so longer is more attractive. We will end up with a 32-bit context field and about 15 bits as "reserved".

Design Note: Use a 32 bit context field with no checksum, and 15 reserved bits and a 1 bit flag to indicate control / payload. Note potential DOS risks.
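The header fields named in these Design Notes can be illustrated with a minimal Python sketch. The minutes give only the field sizes (an 8 byte header containing a 32 bit context field, 15 reserved bits and a 1 bit flag); the field order, the use of the remaining 16 bits as a next header octet and a header length octet, and the choice of 1 to mean "control" are all assumptions made here for illustration.

```python
import struct
from typing import Tuple

# Assumed layout (the field order is NOT specified in the minutes):
#    8 bits  next header       (assumption, as in IPv6 extension headers)
#    8 bits  header ext length (assumption, always 0 in this sketch)
#    1 bit   control/payload flag (assumption: 1 = control)
#   15 bits  reserved, sent as zero
#   32 bits  context tag
_HEADER = struct.Struct("!BBHI")  # network byte order, 8 bytes in total
_CONTROL_FLAG = 0x8000


def pack_shim_header(next_header: int, is_control: bool, context_tag: int) -> bytes:
    """Build the 8 byte header for the assumed layout."""
    if not 0 <= next_header <= 0xFF:
        raise ValueError("next header must fit in 8 bits")
    if not 0 <= context_tag <= 0xFFFFFFFF:
        raise ValueError("context tag must fit in 32 bits")
    flags = _CONTROL_FLAG if is_control else 0
    return _HEADER.pack(next_header, 0, flags, context_tag)


def unpack_shim_header(data: bytes) -> Tuple[int, bool, int]:
    """Return (next_header, is_control, context_tag); reserved bits are ignored."""
    next_header, _ext_len, flags, context_tag = _HEADER.unpack(data[:8])
    return next_header, bool(flags & _CONTROL_FLAG), context_tag


if __name__ == "__main__":
    header = pack_shim_header(next_header=6, is_control=False, context_tag=0xDEADBEEF)
    print(header.hex(), unpack_shim_header(header))
```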

Failure in initial contact? The direction here is to refer to an updated 3484 (source address set), and may ask that a bidirectional ULP (such as TCP) retry multiple initial SYNs with various source/destination locators. Is this sufficient? One other possibility - "is this locator already part of another locator set that has been previously established?" And how do we cache this? Is this an API construct (ULP sends down the locator set and gets back a ULID from SHIM6 if a SHIM6 context already exists for this locator set)? Building on existing SHIM6 context state is similar to TCP sharing congestion information across TCP connections, so we can anticipate that SHIM6 API support at initial contact time would get used by upper layers.
What about exchanging SHIM6 information that allows SHIM6 peers to exchange ULIDs before initial SHIM6 context is established?

Design Note: For initial contact use RFC3484 (bis)

Design Note: Possible experimental ULP API extensions to initial contact:

Enhanced Contact would result in searching existing SHIM state based on initial locator set. (This may return a ULID pair that was not in the ULP's locator set - is this a problem?)
SHIM6 Contact would perform the contact step above, and where there was no SHIM6 context, then trigger SHIM6 state initialization and return to the ULP the ULID pair with SHIM6 state set up.
Ordered Locator Set ("getaddr_pair_set_info()") returns an RFC3484 ordering of locators based on local SHIM6 state information. This could be used to construct a connect_by_name() approach

We're not using ICMP messaging to indicate loss of state because of our concerns about ICMP filtering in today's Internet. If we use the same protocol for messaging and error notification, that protocol either makes it through firewalls or it doesn't, instead of failing during error notification, for instance.
Can a SHIM6 peer decline SOME of the locators offered by another SHIM6 peer? HIP experience was that a general update/ack protocol was useful to update locators, SPIs, etc. MOBIKE also has a similar underlying protocol as well. May include locators and preferences between locators.
How many offers/counteroffers do we need to support?

Design Note: For SHIM6 control messages use a unidirectional acknowledged information transfer UPDATE and ACK message transaction as the base protocol, and then specify control messages in terms of control message types and attributes.
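The UPDATE and ACK transaction in this Design Note can be illustrated with a minimal Python sketch of the sending side. Nothing here comes from the protocol draft: the class name, the per-UPDATE identifier, the retry limit and the "LOCATOR_LIST" message type are hypothetical, and timers and packet encoding are left out.

```python
from typing import Dict, List, Tuple

Update = Tuple[str, int, str, dict]  # ("UPDATE", id, message type, attributes)


class UpdateSender:
    """Tracks UPDATE messages until the peer acknowledges them."""

    def __init__(self, max_retries: int = 3) -> None:
        self.max_retries = max_retries  # illustrative value only
        self._next_id = 0
        self._pending: Dict[int, Tuple[str, dict, int]] = {}

    def send_update(self, msg_type: str, attributes: dict) -> Update:
        """Record a new UPDATE as unacknowledged and return it for sending."""
        update_id = self._next_id
        self._next_id += 1
        self._pending[update_id] = (msg_type, attributes, 0)
        return ("UPDATE", update_id, msg_type, attributes)

    def on_ack(self, update_id: int) -> bool:
        """Handle an ACK; returns False for unknown or duplicate ACKs."""
        return self._pending.pop(update_id, None) is not None

    def on_timeout(self) -> List[Update]:
        """Return UPDATEs to retransmit, dropping those out of retries."""
        resend: List[Update] = []
        for update_id, (msg_type, attrs, tries) in list(self._pending.items()):
            if tries >= self.max_retries:
                del self._pending[update_id]
                continue
            self._pending[update_id] = (msg_type, attrs, tries + 1)
            resend.append(("UPDATE", update_id, msg_type, attrs))
        return resend


if __name__ == "__main__":
    sender = UpdateSender()
    msg = sender.send_update("LOCATOR_LIST", {"locators": ["2001:db8:a::1"]})
    print(msg, sender.on_ack(msg[1]))
```

Because each transaction is unidirectional, the peer's own locator changes would travel in separate UPDATE messages that it sends and this side acknowledges.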

Open Issue: Do we need a simple NOTIFY (un ACK'ed UPDATE) message type as well?

Placeholders in the specification:

CUD/FBD? Locator pair test/reply (Erik suggests that we drop this), and context explore messages.
Locator pair test/reply (need to be independent of ULIDs, and note that there may be multiple ULID pairs associated with the same host pair)
Reachability exploration for a working locator pair
What are privacy requirements for locator lists? Also integrity - this protocol is currently "in the clear".
"Forking the context state" for preferring different locator pairs for different ULP communications? How close is this to policy? Is this unidirectional or bidirectional? How much information does SHIM6 get from the ULPs that would go into this decision? This is similar to MONAMI6 work previously done. Preference for viewing this as a unidirectional rather than bidirectional search, although in a destination-based hop-by-hop forwarding environment without source-address routing considerations a pair of source-address locators in each direction is functionally equivalent to a single bidirectional pair.

Design Note: Locator pairs are considered as unidirectional locator pairs, and there is no assumption that these must map into a bi-directional pair.

Locator list option has all locators, but HBA parameter set has prefixes that reflect all locators, too. Is this needed in both places?
Do we need a generation/version number for the locator list? This isn't the same as transport sequence numbers that are used for reliability. Could we recover more rapidly if we know what version number we are current with? But we're sending entire sets now, not sending deltas. Don't want to list entire IPv6 address locators in order to change preferences? But it is simpler to send the addresses than to send the addresses and then send preferences by index.

Design Note: Do not use locator ordering and index references in SHIM6 control messages in the initial base spec

Detecting loss of context doesn't work while the ULID pair works as the locator pair, so the peer may have garbage-collected the context and you didn't notice until there is a failure.
What do you do if you receive contexts that you don't understand? Send an error, if it's a control packet, or silently discard it, if it's a data packet? Eventually you notice because of reachability detection anyway - do we need to notice more quickly than that?

Design Note: We need to indicate which LLU locators should be verified with HBA, CGA, or some future mechanism.

32-bit contexts could be DOSed - do we need more bits?
Which SHIM6 control messages need sequence numbers?
Remaining Design Alternatives...
Need to make a decision on state cleanup, choosing uncoordinated cleanup.

Sharing base packet format with HIP for SHIM6 Control messages

Pekka Nikander

One perspective on HIP and SHIM6 is that SHIM6 is a semantic subset of the HIP approach (Assertion - SHIM6 is a subset of the problems HIP is trying to solve)
This is not thinking about "same state machine and same semantics".
But a common packet format would help with areas of potential experimental protocol extension
Current HIP packet layout is pretty different from SHIM6 packet layout, but (ignoring HITs) the contents are pretty similar.
Option format - is 256 bytes enough for CGA signatures? If not, we have 16-bit length, so having 16-bit type makes more sense, and we may end up with something that is a close approximation of the HIP parameter format in any case
Not proposing a single shared parameter space until we know a lot more about HIP than we know today.
Why did we use 8-bit options, and would 16-bit options be a problem?
Our biggest expressed concern was a perception problem that SHIM6 is intending to generate a proposed standard, while HIP is experimental. That would imply a position that any resemblance to the current HIP packet format is entirely coincidental, but useful in various experimental contexts.

Design Note: The SHIM6 packet formats have been updated to

have a 32 bit context tag

checksum in the same place as in the HIP header

a 1/0 bit to distinguish the payload vs. control messages

have a 16 bit option type and length

For most control messages this results in 7+16 reserved bits. Most of the fields are 32 bit so they can't fit in here.

Adopted HIP parameter format for options; HIP parameter format defines length in bytes but guarantees 64-bit alignment.
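The option format in this Design Note (a 16 bit type and a 16 bit length in bytes, with each option padded to 64-bit alignment) can be illustrated with a minimal Python sketch. The function names are hypothetical, and the sketch assumes that the length field counts only the option contents, excluding the type and length fields and the padding.

```python
import struct
from typing import List, Tuple


def encode_option(opt_type: int, value: bytes) -> bytes:
    """Encode one option: 16 bit type, 16 bit length, value, zero padding.

    Assumption: the length field counts the value bytes only, and the whole
    option is padded to a multiple of 8 bytes.
    """
    if not 0 <= opt_type <= 0xFFFF:
        raise ValueError("option type must fit in 16 bits")
    if len(value) > 0xFFFF:
        raise ValueError("option value too long for a 16 bit length")
    body = struct.pack("!HH", opt_type, len(value)) + value
    padding = (-len(body)) % 8
    return body + b"\x00" * padding


def decode_options(data: bytes) -> List[Tuple[int, bytes]]:
    """Decode a sequence of padded options into (type, value) tuples."""
    options = []
    offset = 0
    while offset + 4 <= len(data):
        opt_type, length = struct.unpack_from("!HH", data, offset)
        start = offset + 4
        if start + length > len(data):
            raise ValueError("truncated option")
        options.append((opt_type, data[start:start + length]))
        total = 4 + length
        offset += total + (-total) % 8  # skip padding to the next boundary
    return options


if __name__ == "__main__":
    blob = encode_option(1, b"abcde") + encode_option(2, b"abcd")
    print(len(blob), decode_options(blob))
```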

[Meeting adjourned for dinner, restarted on Sunday morning.]

Protocol Specification Placeholders (review)

Locator pair test and response

Design Note: Proposed to drop specific mechanism for locator test and response

Reachability exploration: what locator pairs are working after a failure? (actually "find me the first locator pair that works") - refer to the failure and reachability work.
Locator list option has all locators, but HBA parameter set has prefixes that reflect all locators, too. Is this needed in both places?: We think it's OK to duplicate locators and prefixes in our messages.

Design Note: Allow both locator set enumeration and HBA parameter set in an UPDATE message

What are privacy requirements for locator lists? Also integrity - this protocol is currently "in the clear".

Design Note: Place this topic into the larger item of possible areas of protocol extension, and note in the Security Considerations of the protocol specification that "we have considered this and are advising that this falls into an area of potential protocol extension activity".

Action: Pekka Nikander and Marcelo Bagnulo to work on a draft of "Guidelines for potential protocol extensions for SHIM6", including (but not limited to):

flow label use / header compression,

privacy,

hash chains and security,

initial contactless SHIM6 context establishment,

API interaction for initial contactless SHIM6 context establishment

Locator pair selection based on signalled preferences

Return path locator preference signalling

Forking of context state - is this unidirectional or bidirectional? Strong preference voiced for a unidirectional forked state. Two goals here - traffic engineering for a site, and different traffic types between the same two hosts. Traffic engineering seems closer to what we know about at IP level - "different traffic" may be a lot more open-ended. Mobile IP and HIP have similar issues. One proposal advanced was to schedule a joint working session in IETF-65 with TSV and RTG. We won't know enough to meet on this subject in IETF-64 in November 2005. Can we require that this be done at ULID selection time? SCTP has similar problems (but SCTP is closer to the application than SHIM6). SHIM6 is providing a hook for something finer than host-to-host granularity, without trying to solve all conceivable problems. Bidirectional context state forking is seen as a ULP signalled outcome.

Design Note: View forking as a unidirectional context state fork (based on a ULP signal) that assumes that the forked context state may then use a different outgoing locator pair.

Run with a version number for a locator set?
Detecting context problems while the original ULID pair works as a locator pair? Need to detect the problem before a failure happens. Ping periodically? If we send R1 as a context error message, we're already starting to re-establish the context state. Why would any host that was SHIMming decide to stop doing so? We need to make sure that we don't require continuing packet exchanges without advancing to context established state. The R1 values are slightly different (we don't have an initiator context tag from a request, we are using the context tag that we believe the peer thinks we have). We think that trying to return to non-SHIMmed operation when a host garbage-collects context is probably a mistake - we'll just "die".
What happens if the A end garbage-collects its state and later reuses the same context number with the same B end host? Should the B end have the new context replace old B end context state and just go on? There is a race condition if the remote end is trying to reestablish the context that has already been locally garbage-collected, and the remote end is trying to send using the old context. There's a concern with forged packets that try to reestablish the context resulting in a DOS. Can we include in the context tag generation algorithm some bits from the sender of the packet, as well as the receiver of the packet who chooses (most of) the context tag value, so the context tag has bits from both ends and we can tell context 3.1 from context 3.2? Context numbers that are pseudo-random would help, but we can't prevent collisions completely. If applications can provide hints to SHIM6 that the application is still alive ("so don't garbage-collect"), that would help. A usage counter can tell you if garbage collection of the context state at this point in time would be a bad idea (as it's still active), but not if it's a good idea. If we can get unwedged, that's the important thing - being wedged less often is an optimization.
What do you do if you receive contexts that you don't understand? Send an error, if it's a control packet, or silently discard it, if it's a data packet? Eventually you notice because of reachability detection anyway - do we need to notice more quickly than that?

Design Note: On receipt of a SHIM6 payload packet where there is no current SHIM6 context at the receiver, the receiver is to respond with an R1* packet in order to re-establish SHIM6 context. The R1* packet differs from the R1 packet in that an R1 packet echoes the I1 fields, while this R1* offers state back to the sender. Either way the next control packet is an I2 in response. The sender's previous context state is to be flushed on receipt of the R2 packet following the R1*, I2 exchange.
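
The receive-side behaviour in this design note can be sketched roughly as follows. This is an illustration only: the Packet class, the string labels for message types and the dictionary of contexts keyed by context tag are assumptions made for the example, not protocol encodings.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Packet:
    kind: str         # "payload", "I1", "I2", ... (illustrative labels only)
    context_tag: int


def respond_to(packet: Packet, contexts: Dict[int, object]) -> Optional[str]:
    """Return the control message to send in reply, or None if no reply is due."""
    if packet.kind == "payload" and packet.context_tag not in contexts:
        # Payload arrived for a context we no longer hold:
        # offer state back to the sender with R1*.
        return "R1*"
    if packet.kind == "I1":
        # Normal establishment: R1 echoes the I1 fields.
        return "R1"
    if packet.kind == "I2":
        # After either R1 or R1*, the I2 is answered with R2.
        return "R2"
    return None
```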

Action Item: Marcelo Bagnulo to review this and consider possible issues with this form of SHIM6 protocol response.

Action Item: Erik Nordmark to document, for WG consideration, the alternative SHIM6 context setup where each side offers one half of the context value, so that unnecessary context destruction is avoided.
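
One way to picture a context value built from two halves is sketched below. The 16-bit half width and the function names are hypothetical, since the minutes do not fix a tag size or layout. The point is only that a tag carrying bits from both ends lets the B end distinguish "context 3.1" from "context 3.2" when the A end reuses a number.

```python
import secrets
from typing import Tuple

HALF_BITS = 16  # hypothetical width; the minutes do not specify a tag size
_MASK = (1 << HALF_BITS) - 1


def new_half() -> int:
    """Pick a pseudo-random half of a context tag."""
    return secrets.randbits(HALF_BITS)


def combine(receiver_half: int, sender_half: int) -> int:
    """Build a full tag with bits contributed by both ends."""
    return ((receiver_half & _MASK) << HALF_BITS) | (sender_half & _MASK)


def split(tag: int) -> Tuple[int, int]:
    """Recover (receiver_half, sender_half) from a full tag."""
    return (tag >> HALF_BITS) & _MASK, tag & _MASK
```

Pseudo-random halves reduce collisions but, as noted in the discussion, cannot rule them out completely.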

Are four packets really necessary in the SHIM6 context establishment? IKEv2 doesn't require a cookie to be present in all packets, only when we suspect we're under attack. But this could be an experimental extension. SYN flooding is still incredibly difficult to deal with operationally (because each packet is just a normal packet). We are in better shape than IKEv2 because packets are still flowing "normally" while we are setting up SHIM6 context. This could be a potential experimental protocol extension.

Action Item: Marcelo Bagnulo to document a shorter context establishment protocol exchange based on the IKEv2 approach (as a potential experimental protocol extension).

Which SHIM6 control messages need sequence numbers?

Design Note: SHIM6 control message sequence numbers are not needed here.

Reachability and Failure Detection

Jari Arkko
Iljitsch van Beijnum

Failure

"Primary" isn't quite the right term (it's mostly "the locator we started with")
We won't reinvent DHCP, and we will believe what ND tells us.
SHIM6 is only expected to be used in failover scenarios; it works only as a failover mechanism
i.e. different hosts may have different locator sets for the same remote host
i.e. a pair of communicating hosts can have multiple contexts with independent locator sets.
Right now the 'hint' is the ULID pair differentiation
Different contexts do not necessarily imply different ULID pairs
FBD is chosen for simplicity

Design Note: Use FBD as the reachability algorithm.

Sender chooses outgoing address pair (independently of the choices made by the remote host)
Failure Detection:
1. If you receive anything when you are sending packets, assume that all is well.
2. If you aren't sending or receiving packets, assume that all is well.
3. If you are receiving packets and don't need to send payload packets, send some form of keepalive.
4. If you are sending payload packets and aren't receiving anything (payload or keepalives), assume that something is broken after time interval T.
We need a time base in order to send keepalives, and an associated timebase for non reception of in-coming packets within the SHIM6 context.
Peers need to have a shared understanding of how long this timeslot is. We need to understand the relationship between timeslots and RTTs (and need to keep from reinventing TCP within SHIM6 with focus on RTTs). Would prefer not to initiate an exhaustive locator exploration just because SHIM6 is confused about the peer's timeslot choice. We need to think about how aggressive we want to be about failure detection. Exploring this further, it was observed that 10 seconds is fast as compared to BGP4 current practice (1.5-3 minutes). There is a startup transient that is also critical here. Should the initial specification use a statically defined time interval, or does SHIM6 adaptively learn? Is there a difference between symmetric idle and asymmetric idle? We have some concerns about interaction with higher-level protocols that may also be trying to do recovery asynchronously (and applications that may differ in the goal for failover). Should this detection and recovery mechanism be faster than a TCP ULP? Should the routing state timers in OSPF and BGP be a factor here? TCP timeout is an upper limit. Within that constraint, we have three choices for the timeout: slower than BGP (so we give BGP the chance to repair the failure: > 90 (RFC) or 180 (Cisco) seconds), between BGP and OSPF (give OSPF the chance to repair: > 40 < 90 or 180 seconds) or faster than OSPF: < 40 seconds.

Design Note: Use a statically specified time interval of (10) seconds in the initial protocol specification.

The idle keepalive trigger is statically specified to be 3 seconds. This value may be negotiated at SHIM6 context startup as an experimental protocol extension.
This value may be dynamically altered during the SHIM6 context as an experimental extension.
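
The four failure detection rules, with the 10 second interval and 3 second keepalive trigger, can be sketched as a small state tracker. This is a rough illustration under stated assumptions: the class and method names are invented for the example, and measuring the 3 second trigger from the first unanswered payload packet received is one possible reading of "idle keepalive trigger". Received keepalives do not start the keepalive timer, so keepalives are not answered with keepalives.

```python
from dataclasses import dataclass
from typing import Optional

SEND_TIMEOUT = 10.0    # seconds; the statically specified interval T
KEEPALIVE_IDLE = 3.0   # seconds; the idle keepalive trigger


@dataclass
class FbdState:
    # Time of the first payload we sent with nothing heard back since.
    awaiting_reply_since: Optional[float] = None
    # Time of the first payload we received with nothing sent back since.
    owe_reply_since: Optional[float] = None

    def on_send(self, now: float, payload: bool = True) -> None:
        # Anything we send (payload or keepalive) shows the peer we are alive.
        self.owe_reply_since = None
        if payload and self.awaiting_reply_since is None:
            self.awaiting_reply_since = now

    def on_receive(self, now: float, payload: bool = True) -> None:
        # Rule 1: anything received means all is well.
        self.awaiting_reply_since = None
        if payload and self.owe_reply_since is None:
            self.owe_reply_since = now

    def poll(self, now: float) -> str:
        """Return 'failed', 'send_keepalive' or 'ok' for the current time."""
        if (self.awaiting_reply_since is not None
                and now - self.awaiting_reply_since >= SEND_TIMEOUT):
            return "failed"            # rule 4
        if (self.owe_reply_since is not None
                and now - self.owe_reply_since >= KEEPALIVE_IDLE):
            return "send_keepalive"    # rule 3
        return "ok"                    # rules 1 and 2
```

A caller would invoke poll() periodically, send a keepalive when asked to, and start locator exploration when the result is 'failed'.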

The meeting noted other candidate timers, including setting the value between 24 and 36 seconds.

Reachability Exploration

Exploration may be a uni-directional discovery, but a bi-directional shared computation
Exploration uses an attempt to synchronize on a state, using a format where each sent probe carries information relating to all received probes so far.
Exploration also makes use of timers in terms of assumptions of failed probes
In exploration for a viable locator pair it is noted that only one end may know there's a problem, and knowing when to STOP exploring is really hard.
Note FBD only detects failures in the incoming path.
Consideration of the use of a "quick check" as the initial response before launching into a full exploration.
Must SHIM6 recognize a keepalive as a keepalive? This is not strictly required in FBD, as it's SHIM6 packets rather than specific packet content or type that count here, but we have to be able to recognize keepalives as keepalives to avoid sending keepalives in response to keepalives.
Also note that it's an issue to determine when to STOP sending keepalives when neither peer has traffic to send.
How firewalls react to keepalives is also a relevant consideration: they probably react badly to IP packets with no payload whatsoever (header only), so keepalives should probably use a SHIM header with a keepalive option.
The concept of a "host-id" was considered as a way of identifying a "host" across multiple ULIDs. Need an algorithm to make sure all hosts choose a unique host ID (same theory as router IDs). How do we change host IDs if the chosen locator was deprecated? Alternative is to work with sets of locators (instead of host IDs) - "this is the set of locators I think you have".
How dynamic are locator sets (with CGA, etc.)? ULIDs don't change as long as there is any session active.
A common probe data structure is proposed to be reused in several packet layouts.
Quick check request/reply mechanism (we think this path will work, we're just making sure), plus full exploration with context reference. Some concerns about DOSability of including context information as part of reachability (reflection attacks, etc.). Including "last few" probes in each direction (allows you to detect relatively slow locator paths). Does this extend into sending complete metrics for all locators on each probe? Could send last 3 successful probes, last 3 failed probes, etc. Balancing amount of hints on path selection with amount of information sent. Unsuccessful probe information could be really useful ("move these locators to the end of the list"). 30-40 second-old information is ancient history.
Start full exploration when you timeout on quick check. "Exponential backoff" - sending probes to more locators over greater intervals. Some discussion about choosing "best" or "first as good as previous" paths based on RTT vs simply choosing a working path - concerns about minimizing RTT versus other QoS values (jitter, etc.). Moving beyond "works" as a discriminator should be an experimental protocol extension. Moving "back" to the original locator pair should be an experimental protocol extension. The subtext is "getting off my GPRS backup ASAP", and that's really hard to generalize.
Propose to use this probe structure in all SHIM6 packets (would give us better RTT measurements)?
When should exploration stop? When you have any candidate locator pair? Or continue to see if there is a 'better' candidate pair?

Design Note: Continued exploration to see if a 'better' locator pair is available following identification of a viable locator is considered to be an experimental protocol extension. The exploration in the base protocol specification will terminate once a viable (reachability confirmed) locator pair has been discovered.
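
The base-protocol exploration described here (stop at the first viable pair, back off between probes, try previously failed pairs last) might look roughly like the sketch below. The function names, the probe callback and the 0.5 and 8.0 second backoff values are hypothetical; the meeting did not specify a probe schedule.

```python
import time
from itertools import product
from typing import Callable, Iterable, List, Optional, Tuple

Pair = Tuple[str, str]  # (local locator, remote locator)


def candidate_pairs(local: List[str], remote: List[str],
                    previously_failed: Iterable[Pair] = ()) -> List[Pair]:
    """All M x N locator pairs, with pairs that failed before moved to the end."""
    failed = set(previously_failed)
    # sorted() is stable, so the original order is kept within each group.
    return sorted(product(local, remote), key=lambda pair: pair in failed)


def explore(local: List[str], remote: List[str],
            probe: Callable[[Pair], bool],
            previously_failed: Iterable[Pair] = (),
            initial_wait: float = 0.5,   # hypothetical value
            max_wait: float = 8.0,       # hypothetical value
            sleep: Callable[[float], None] = time.sleep) -> Optional[Pair]:
    """Return the first locator pair whose probe succeeds, or None."""
    wait = initial_wait
    for pair in candidate_pairs(local, remote, previously_failed):
        if probe(pair):
            return pair                  # base protocol: stop at first viable pair
        sleep(wait)
        wait = min(wait * 2, max_wait)   # exponential backoff between probes
    return None
```

Continuing past the first success to look for a 'better' pair would be the experimental extension, and is deliberately left out.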

Reachability (v2)

Reachability, version 2 (REAP) - using the same messages for four complementary functions (direct reachability, reverse-path reachability, checking different return paths, return routability checking). Including a mechanism for not having to probe in both directions simultaneously.
How long do we continue to probe? Keep context state around and wait for upper layer apps to "try again". We may wish to remember paths that previously failed so we try them LAST during the next failure.
Some discussion of how close REAP comes to being STUN protocol when we have unidirectional path failures... is this ALSO an experimental extension?
Some discussion of how long we have to deliver SHIM6 baseline functionality (time and energy), and how our charter maps to "providing IPv4- equivalent multihoming in IPv6".
Some discussion of unidirectional path failures - isn't this usually due to ingress filtering? Can we assume that we don't have to recover from unidirectional path failures in SHIM6? But we have to detect this condition, even if we don't recover from it. We think we can assemble two unidirectional paths into a single bidirectional path, but don't know all the implications. We need to be able to steer traffic based on source addresses to accomplish this. Should continue exploration until we identify a working bidirectional value. RFC 3484bis and SHIM6 setup and recovery will all require bidirectional paths - don't do SHIM6 setup over unidirectional paths (because RFC 3484bis has a working path already, no reason to try to improve it in the face of failures). Should we allow setting up a SHIM before there is a context? Need I1bis that says "send this back using a different locator" to make this work. Concern about creating state, but maybe allowing state on the INITIATOR is OK, as long as we don't require state on the RESPONDER. May have to do M x N scan to find something that works. May need "API on steroids" to initiate this process.
REAP - Four functions:
Another way to look at exploration is that of an exploration of a matrix of locator pairs, with sender using locator pair (a,b) and the receiver using the locator pair (a',b'), and each cell of the matrix is itself a 3x3 set of possible information states, whether traffic has been sent on the sending locator pair, whether traffic has been received on this pair and what the local (sender) end thinks it knows about the other (receiver) end.
Related issues: how many probes before the algorithm can consider the locator pair unusable? How many passes across the full exploration space before the algorithm terminates with complete failure?
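
As a loose illustration of the matrix view, the sketch below keeps one record per locator pair with the three kinds of information mentioned: what was sent, what was received, and what the local end believes about the peer. The field names and the MAX_PROBES limit are assumptions; the meeting left the probe count and the number of passes as open questions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # (local locator, remote locator)
MAX_PROBES = 3          # hypothetical limit; left open in the discussion


@dataclass
class Cell:
    probes_sent: int = 0                  # traffic we have sent on this pair
    received: bool = False                # traffic we have received on this pair
    peer_saw_us: Optional[bool] = None    # our belief about the peer; None = unknown

    def unusable(self) -> bool:
        """Give up on a pair after MAX_PROBES probes with no sign of life."""
        return (self.probes_sent >= MAX_PROBES
                and not self.received
                and not self.peer_saw_us)


def build_matrix(local: List[str], remote: List[str]) -> Dict[Pair, Cell]:
    """One cell per (local, remote) locator pair."""
    return {(a, b): Cell() for a in local for b in remote}
```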

Design Note: Where we are:

Initial contact: 3484 (bis) is bidirectional
Shim setup is bidirectional based on initial locator set
Recovery from failure is potentially unidirectional

Design Note: Questions:

Should shim setup allow unidirectional? No point per se unless you have shim6 setup WITHOUT context

Should 3484 be extended to allow unidirectional? No

Should shim6 be allowed to setup without initial established context? If yes, should it include unidirectional discovery?

Should failure recovery continue to see if there is a bidirectional locator pair even though there was already a unidirectional path?

Note that the modified I1 in this case would include a STUN-like request to pass a packet back with a different locator pair than the received I1 packet

Action: Document the simple cases

Action: Document this concept of shim6 context without bidirectional initial contact (i.e. shim6 initial context passes into an initial walkthrough) and API considerations

Action: Initial reachability detection - aim to get a unidirectional support version drafted by November. Pekka Nikander and Iljitsch van Beijnum to do an individual submission for working group review.

5. Next Steps

Produce -01 of the protocol draft immediately following this interim WG meeting
Perform a WG Last call of the HBA draft
Reachability detection - aim to get a unidirectional support version drafted by Vancouver
3484bis - pass documented requirement to ADs to see where it's actually reviewed.
There may be impacts on the HBA draft if there is an option structure added to CGA. We think Marcelo can handle this, before IETF-64.
Want to make sure everyone agrees that the base protocol extension represents a conservative, but workable approach and at this stage consider further refinements to be experimental extensions. Check with the working group to see if unidirectional support goes in the base protocol. (Does the WG agree that STUN-like response redirection is a good thing to include in the base protocol? We're saying that the API must allow applications to specify ULIDs, we're saying that I and R packets must support response redirection - maybe this is a reply redirection AND a reply that says what the original addresses were.)

Slides

Shim6 protocol
The Marketing Story
Same packet format for HIP and SHIM6?
Reachability and Failure Detection
(un)reachability detection