Routing Area Working Group (rtgwg) Charter

2.5.11 Routing Area Working Group (rtgwg)

NOTE: This charter is a snapshot of the 61st IETF Meeting in Washington, DC USA. It may now be out-of-date.

In addition to this official charter maintained by the IETF Secretariat, there is additional information about this working group on the Web at:

Additional RTGWG Web Page

Last Modified: 2004-10-14

Chair(s):

Bill Fenner <fenner@research.att.com>
Alex Zinin <zinin@psg.com>

Routing Area Director(s):

Bill Fenner <fenner@research.att.com>
Alex Zinin <zinin@psg.com>

Routing Area Advisor:

Alex Zinin <zinin@psg.com>

Mailing Lists:

General Discussion: rtgwg@ietf.org
To Subscribe: rtgwg-request@ietf.org
In Body: subscribe email_address
Archive: http://www.ietf.org/mail-archive/web/rtgwg/index.html

Description of Working Group:

The Routing area receives occasional proposals for the development and
publication of RFCs dealing with routing topics, but for which the
required work does not rise to the level where a new working group is
justified, yet the topic does not fit with an existing working group,
and a single BOF would not provide the time to ensure a mature
proposal. The rtgwg will serve as the forum for developing these types
of proposals.

The rtgwg mailing list will be used to discuss the proposals as they
arise. The working group will meet if there are one or more active
proposals that require discussion.

The working group milestones will be updated as needed to reflect the
proposals currently being worked on and the target dates for their
completion. New milestones will be first reviewed by the IESG. The
working group will be on-going as long as the ADs believe it serves a
useful purpose.

Goals and Milestones:

Done		Submit draft on calculation of IGP routes over TE tunnels to IESG for publication as Informational RFC
Done		Submit initial Internet Draft on IP Fast Reroute Framework
Jun 04		Submit initial Internet Draft on Basic IP Fast Reroute mechanism
Aug 04		Review various mechanisms for Advanced IP Fast Reroute
Oct 04		Submit IP Fast Reroute Framework to IESG for publication as Informational RFC
Oct 04		Submit specification on Basic IP Fast Reroute mechanism to IESG for publication as Proposed Standard
Nov 04		Select the Advanced IP Fast Reroute mechanism
May 05		Submit specification on Advanced IP Fast Reroute mechanism to IESG for publication as Proposed Standard

Internet-Drafts:

draft-ietf-rtgwg-rfc3682bis-04.txt

draft-ietf-rtgwg-ipfrr-framework-02.txt

draft-ietf-rtgwg-ipfrr-spec-base-01.txt

Request For Comments:

RFC	Status	Title
RFC3682	E	The Generalized TTL Security Mechanism (GTSM)
RFC3906	I	Calculating IGP Routes Over Traffic Engineering Tunnels

Current Meeting Report

RTGWG IETF 61
-------------

1. Agenda bashing, administrivia (chairs) [5m] 00:05
2. Document status (chairs) [5m] 00:10

RFC 3906 published (informational)

GTSM -- more comments need to be integrated, last call before Minneapolis. Implementors please inform mailing list/authors

Framework, loopfree, MIB well along.

uloop prevention design team constituted (names already sent to list). Desire to keep membership small (already not small, so maybe "less big"). Goal: total coverage if possible, extensible if not. Design team to report back by December '04.

3. Basic IP FRR spec update (Alia) [15m] 00:25

Document revved to be more of a spec and less of a survey. Need to read framework too because that's where definitions section is!

To do: Multihomed prefixes, link selection, SRLG.

Need more people to read & comment. Comments to list please.

4. IPFRR MIB (Alia) [20m] 00:45

draft-atlas-ip-local-protect-loopfree-00.txt
Only the first of many MIBs.
Doesn't cover SRLGs (yet?)
Includes protected route table (with NH, alternate NH including alt NH type)
Includes unprotected route table (just route and why)
Global routing stats (various kinds of route counts)
Interface table.
Not covered: IGP (IPFRR enabled? local holddown time?), LDP (protected/unprotected FECs, alt NH info including alt label). Other (small) MIBs will probably be needed for these.
Please comment on: is this grouping of MIBs appropriate?
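
Illustration (not presented at the meeting): a minimal sketch of how the protected and unprotected route tables described above might relate to the ordinary routing table. All class names, field names, addresses, and reason strings are hypothetical and are not taken from the draft; the only points carried over from the discussion are that both tables are indexed like the IP routing table and that every route appears in exactly one of them.

```python
from dataclasses import dataclass
from enum import Enum


class AltType(Enum):
    # Hypothetical alternate next-hop types; the draft's real enumeration may differ.
    LOOP_FREE = "loop-free"
    U_TURN = "u-turn"


@dataclass(frozen=True)
class ProtectedRoute:
    next_hop: str       # primary next hop (NH)
    alt_next_hop: str   # alternate NH used for repair
    alt_type: AltType   # kind of alternate


@dataclass(frozen=True)
class UnprotectedRoute:
    reason: str         # why no alternate is available


# Ordinary routing table: prefix -> primary next hop.
routing_table = {
    "192.0.2.0/24": "10.0.0.1",
    "198.51.100.0/24": "10.0.0.2",
}

# Both tables reuse the routing table's index (the prefix).
protected = {
    "192.0.2.0/24": ProtectedRoute("10.0.0.1", "10.0.0.5", AltType.LOOP_FREE),
}
unprotected = {
    "198.51.100.0/24": UnprotectedRoute("no alternate found"),
}


def route_counts(routes, prot, unprot):
    """Return simple global statistics of the kind a stats group might expose."""
    return {
        "total": len(routes),
        "protected": len(prot),
        "unprotected": len(unprot),
    }


print(route_counts(routing_table, protected, unprotected))
```

A manager would look up the same prefix in the routing table and then in one of the two protection tables, which is the sense in which the indexes are duplicated.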

Alex Zinin: re protected/unprotected route tables, why use a different table instead of augmenting an existing table?
Alia: I don't know how to do that, my understanding is you can't really extend a MIB, this one is indexed the same as an IP routing table MIB which I think is as good as it gets.
Alex: so how do I use these tables?
Joel Halpern: please remember that a MIB is a MIB, it's just used for management purposes, it doesn't drive the implementation.
Alex: Are these different sets of routes, will it be recorded twice, once in normal routing table and once in unprotected table?
Alia: Yep.
Bill Fenner: I will sometimes admit to being MIB-literate. This is the right thing to do. Indexes are dup'd but info isn't.
Stewart Bryant: How do we report dynamic info like "repair attempted but failed"? No doubt there will be other dynamic info.
Alia: Q is what level to detect, what level to report at. Probably will be in IGP MIB and not this one.
Stewart: I think this is really important for O&M, because these faults are transient so we must be very attentive to this issue.
Alia: Yep. We need to make sure that we can actually detect the errors we put in the MIB!
Stewart: Need to go to ipfix? Maybe doesn't even need to be in MIB.
Alia: We should talk about it.
Stewart: We'll try to write a draft up about it.
Don Fedyk: We did consider that. Error reason is in there but there's no history associated with it. Take a look at what we have and see what needs to be improved on.
Stewart: An example of what I'm talking about is we think we have a protection path but when we try to send a packet on it, it fails.

Is the MIB grouping sensible? Are the MIBs sensible? Please read and comment, or you will get what you deserve. Right now the draft has u-turn alternates in it; should it include other candidate alternate types?

Stewart: First MIB should include basic, where there is common ground, then have a different MIB for advanced.
Alia: All I mean is that there is type defined for "u-turn" for alternate type, and a row in interface for "can I break u-turns".
Alex: Maybe we should just rename u-turn to "reserved"?

Comments to list please. Very few admit to having read it.

Alex: We'll ask on the list about making draft a WG doc.
David Ward: Who will do IGP MIBs?
Alia: Are you volunteering?
David: No. Someone from this WG should do the work and then present it to the IGP WGs.
Alia: Yep.

5. Micro-loop prevent DT report (Alia, Mike) [20m] 01:05
Discussion [20m] 01:25

draft-bryant-shand-lf-conv-frmwk-00.txt
draft-zinin-microloop-analysis-00.txt
Mike presenting.

Trying to bring order to chaos, we have too many partial solutions right now. Trying to explain, divide solution space into types, consider types, summarize.

Basic problem: Microloops resulting from conventional IGP converge-as-fast-as-you-can behavior lose traffic, undoing IPFRR goodness.

Reason for uloops: Independent/asynchronous decisions. Loops are temporary! Duration can be much longer than IPFRR time though. Duration driven by relative time to update FIBs (i.e., degree of asynchrony). No way to guarantee two routers will take similar length of time to update FIBs (from one router's PoV the network change may cause just a few routes to change -> fast download, from another PoV many routes may change -> slow download).
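
Illustration (not presented at the meeting): a minimal sketch of how asynchronous FIB updates can produce a microloop. The four-router topology, the link costs, and the function names are assumptions chosen for the example. Link B-D fails; router B installs its new FIB while router A is still using the old one.

```python
import heapq


def next_hops(graph, dest):
    """Return {router: next hop toward dest}.

    Runs Dijkstra from dest; with symmetric link costs, a router's parent
    in the resulting tree is its next hop toward dest.
    """
    dist = {dest: 0}
    fib = {}
    heap = [(0, dest)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                fib[v] = u
                heapq.heappush(heap, (d + cost, v))
    return fib


def forward(fib, src, dest):
    """Follow next hops from src; return (path, looped)."""
    path = [src]
    node = src
    while node != dest:
        node = fib[node]
        if node in path:
            return path + [node], True  # revisited a router: loop
        path.append(node)
    return path, False


# Assumed topology before and after the failure of link B-D.
before = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "D": 1},
    "C": {"A": 1, "D": 5},
    "D": {"B": 1, "C": 5},
}
after = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1},
    "C": {"A": 1, "D": 5},
    "D": {"C": 5},
}

old_fib = next_hops(before, "D")
new_fib = next_hops(after, "D")

# B has converged, A and C have not: the degree of asynchrony decides
# how long this mixed state lasts.
mixed = dict(old_fib, B=new_fib["B"])

print(forward(mixed, "B", "D"))
print(forward(new_fib, "B", "D"))
```

In the mixed state a packet from B goes to A and straight back to B; once every router has installed its new FIB the path is B, A, C, D and the loop is gone.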

Solution: Controlled convergence. Inevitably makes convergence slower, but this is OK because IPFRR repair covers failure allowing leisurely convergence. But: still want to keep traditional method as fallback in case of multiple failures.

Solution taxonomy:
- Controlled information flow (incremental cost change)
- Controlled distributed behavior (synchronized FIB installation, ordered FIB changes, path locking)

(See slides for full comparison matrix, highlights follow)
- Incremental cost change -- can take hours
- Synchronized FIB install -- seems simple, but isn't, and dependency on NTP
- Ordered SPFs -- no changes in forwarding plane. Doesn't deal with SRLG (only single failure is supported). Need to extend the algorithm to a per-destination basis. Long delays if large network diameter; worst case can be pretty long.
- Path locking -- cons: complete coverage requires additional forwarding mechanisms. pros: small delay in RIB/FIB installation.

Detailed description of the above four methods

Ordering by signalling
Alex: Is node failure a SRLG case?
Mike, Alia: No. Node failure can be handled by any of these techniques.

Ordering by delay
"Lollipop topology" (for example) can make delay-ordered SPF slower than needed (known techniques are more pessimistic than needed).

Can combine delay and signalling (optimization of delay-based version, point is that signalling doesn't need to be reliable since delay backs it up)
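
Illustration (not presented at the meeting): a minimal sketch of delay-based ordering for a single link failure and a single destination. It assumes one particular rule, namely that each router waits in proportion to the depth of the subtree of routers that forwarded through it toward the failed link, so that the farthest routers update first and the router adjacent to the failure updates last. The rule, the function name, and the max_fib_time parameter are assumptions for the example, not the drafts' specification.

```python
from collections import defaultdict


def ordered_delays(old_next_hop, near_end, max_fib_time=1.0):
    """Return {router: delay before it may install its new FIB}.

    old_next_hop: pre-failure {router: next hop} toward the destination.
    near_end:     router adjacent to the failed link on the old path.
    max_fib_time: assumed upper bound on one router's FIB update time.
    """
    # Invert the next-hop map: who forwarded through whom before the failure.
    children = defaultdict(list)
    for router, nh in old_next_hop.items():
        children[nh].append(router)

    delays = {}

    def rank(router):
        # Depth of the subtree of routers that forward through this one.
        r = 0
        for child in children[router]:
            r = max(r, 1 + rank(child))
        delays[router] = r * max_fib_time
        return r

    rank(near_end)
    return delays


# Assumed pre-failure next hops toward D: C -> A -> B -> D; link B-D fails.
old_next_hop = {"A": "B", "B": "D", "C": "A"}
print(ordered_delays(old_next_hop, near_end="B"))
```

Because the delay of the router next to the failure grows with the longest chain of routers behind it, the worst case scales with network diameter, which is the long-delay concern noted in the comparison.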

Backwards compatibility is a problem.

Alex: how much is it really a problem? Can't you just announce the capability in your IGP and only start using the method when all routers support it?

Mike: yes but that means if you infect your network with one router that doesn't support this, you've broken the scheme.
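
Illustration (not presented at the meeting): a minimal sketch of the capability gating Alex describes, assuming each router advertises a boolean support flag in the IGP. The function name and the input format are hypothetical.

```python
def controlled_convergence_enabled(advertised):
    """Return True only if every known router advertises support.

    advertised: {router_id: bool} as learned from the IGP (assumed format).
    """
    return bool(advertised) and all(advertised.values())


print(controlled_convergence_enabled({"A": True, "B": True, "C": True}))
print(controlled_convergence_enabled({"A": True, "B": False, "C": True}))
```

The second call shows Mike's point: a single router that does not support the method switches the whole network back to conventional convergence.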

Path Locking
Three epochs -- change discovery time, use transitional paths time,
lock to new topology time.
Potential transitional path types -- tunnels, safe neighbors, packet marking, u-turn
Sorting out the possibilities -- what are the criteria? Time to be converged (ballpark: 10 sec), simplicity, SRLG support (or really, unpredicted multiple failure coverage), no additional mechanisms beyond IP (may hurt coverage), common additional mechanisms for this and other advanced methods, also work for LDP.

Tentative conclusion:
- Incremental cost change impractical
- Sync'd FIB swap -- skeptical about practicality
- Ordered SPF -- long delay, poor SRLG support -- enough to be an issue?
- Path locking -- seems most promising, many possibilities (ed: but, maybe it's just that the newest toy is always the shiniest?)
- Haven't thought of any new methods this morning but we haven't been to the bar yet
- Need more brain power on this, more discussion

Danny: Is incremental deployability a hard requirement?
Alex: Yeah, and is 100% coverage required?
Danny: Sure but is incremental really a hard requirement?
Alia: Path locking can be done incremental. You can't have a flag day.
Danny: Well not a flag day, but it would be OK to require all routers to have same version of code before solution becomes viable.
Alia: But still need to worry about turning it all on
Andrew Lange: That's what maintenance windows are for.

Voch Kompella: Re sync'd FIB swap -- If requirements were externally provided (and included atomic clocks) problem would be easier. Are we making the problem harder than we have to because we are inventing our own requirements?
George Swallow: Are you only worried about clock skew during failure?
Mike: Skew isn't the problem, problem is skew in FIB install time.
George: So clock sync is not the biggest issue here actually.
Mike: Yes although I'm nervous about inter-layer dependencies.
Stewart: Well if you can detect that NTP isn't working then you can just disable the loopfree thingy.

David Ward: So we've asked for a collection of requirements but have no place to collect them.
Alex: Actually we haven't asked for requirements.
David: Wow.
David: How do you multicast?
Mike: General thinking is that you have to get the packet to the other side of the failure, can't just drop it off some place and use the unicast/downstream approaches because of RPF, etc.
Bill: Two halves to problem, other half is you need state to know where downstream neighbors are for mcast. So fast repair has to repair that state as well. You can get the packet to the other end of the failure OR get the join state down the repair path real fast.
Stewart: We're talking about for the repair, right? For the uloop convergence you have lots of time to fix up the mfib?
Everyone: Nope nope.
Bill: You're moving the tree around. PIM needs to get access to the new SPF topology before the new FIB is put into use, that might work.
Alia: At a minimum we have to not break mcast/make it worse! Secondary question is how to protect mcast too.

Alex: So getting back to uloop prevention...
David: Design team requested requirements, how are we going to provide them Alex?
Alex: Oh, thought you were asking about a requirements document
Alex: In particular SPs should try to respond to presenters' questions/strawman requirements. SRLGs? Less than full coverage? These are important because they will drive the selection of mechanism.
Danny: Where ARE we going to record the requirements?
Alex: The mailing list?
Alia: The taxonomy doc?
Alex: OK.

6. Update on draft-atlas-ip-local-protect-uturn (Alia) [20m] 01:45

Changes:
- Explicitly marked packet identification (well known label?). Makes ID'ing potential U-turn packets easier, etc.
- Example algorithm for how to look for U-turn alternates. (Worst case is 1 additional SPF per neighbor.)

Next?:
- Simplify alternate selection
- More detailed explanation considering link protection

Other suggestions?
Other comments?

Slides

Agenda
Loop-free alternates
IP FRR MIB
Micro-loop prevention methods
U-turn alternates