2006-03-20 RTGWG Meeting Notes

Agenda Bashing
- no bashing

Alia Atlas - IPFRR Status

draft-ietf-isis-link-attr & LFAs
- spec avoids using max'ed out links for LFA's
- new section on link attributes
  - link excluded from local protection path
  - local maint required
  - local protection available
- LFA calculation:
  - avoid links with "excluded" and "maintenance" set

Stewart Bryant:
- replay to beginning: convergence question
- we might want to use LFA in conjunction with an advanced technique, so shouldn't we specify that we do what is in the convergence draft rather than what is in PLSN?

Alia:
- there is an "IF" in there: if you have another convergence strategy, such as ordered FIB, that draft would specify the interaction with LFA.

===============

Pierre Francois on francois-ordered-fib-01.txt
- reminder on ordered FIB updates
- convergence time results
- comparison with PLSN
- conclusion

Principle: order FIB updates to avoid transient routes

each router computes a rank
- rSPF rooted at Y gives the shortest paths to Y
- during rSPF computation
  - R finds its rank, depth(R, rSPT(X-->Y))
    - max hop length among paths to R used to reach Y
  - R finds the set of neighbors that use it to reach (X-->Y)
    - waiting list of R (used to shortcut the rank)
- R's FIB update time is Rank(R)*MAX_FIB
- When a router updates its FIB
  - it sends a completion message to its old nexthops for X.
- When a router receives a completion message
  - it removes the sender from its Waiting List
- When R's Waiting List becomes empty
  - R can update its FIB and send its completion message
- Rank timer recovers from lost completion messages.

Simulation Results: measured the time to perform an ordered FIB update after a link-state change.
- The flooding of the link-state packet across the network
  - link delay
  - lsp processing in router (4 msec)
- the computation time of the rSPT (once LSP received) assumed to be 200msec
- time required to update the FIB (100 microsec on cisco gsr).
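
The rank and waiting-list principle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the draft's algorithm or packet handling: link costs are taken to be symmetric (so one Dijkstra run from Y stands in for the rSPF), "old nexthops" is read as the old next hops toward Y, and the names ofib_plan, completion_order, GRAPH and MAX_FIB are hypothetical.

```python
import heapq


def distances_to(graph, root):
    """Dijkstra from `root`; with symmetric costs this is each node's distance to `root`."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def ofib_plan(graph, x, y):
    """Return (rank, waiting) for routers whose shortest path to y crosses link x->y."""
    dist = distances_to(graph, y)
    by_distance = sorted(dist, key=dist.get)

    # A router is affected if one of its shortest paths to y uses x->y.
    affected = {x} if dist[x] == graph[x][y] else set()
    for n in by_distance:
        if n != y and any(r in affected and dist[n] == w + dist[r]
                          for r, w in graph[n].items()):
            affected.add(n)

    # Waiting list of r: affected neighbours that use r as a next hop toward y.
    waiting = {r: {n for n in graph[r]
                   if n in affected and dist[n] == graph[n][r] + dist[r]}
               for r in affected}

    # Rank: 0 for routers nobody waits on, else 1 + highest rank in the waiting list.
    rank = {}
    for r in reversed(by_distance):  # farthest from y first
        if r in affected:
            rank[r] = 1 + max((rank[n] for n in waiting[r]), default=-1)
    return rank, waiting


def completion_order(waiting):
    """Replay completion messages: a router updates once its waiting list is empty."""
    pending = {r: set(ns) for r, ns in waiting.items()}
    ready = sorted(r for r, ns in pending.items() if not ns)
    order = []
    while ready:
        r = ready.pop(0)
        order.append(r)  # r updates its FIB, then notifies routers waiting on it
        for p, ns in pending.items():
            if r in ns:
                ns.remove(r)
                if not ns:
                    ready.append(p)
    return order


# Hypothetical topology: A - B - X - Y, with C also attached to X (unit costs).
GRAPH = {
    "A": {"B": 1},
    "B": {"A": 1, "X": 1},
    "C": {"X": 1},
    "X": {"B": 1, "C": 1, "Y": 1},
    "Y": {"X": 1},
}
MAX_FIB = 0.1  # hypothetical worst-case FIB update time, in seconds

rank, waiting = ofib_plan(GRAPH, "X", "Y")
for router in completion_order(waiting):
    print(router, "rank", rank[router], "fallback timer", rank[router] * MAX_FIB)
```

In this toy topology the routers farthest from the failing link (A and C) update first, and X updates last, once both B and C have sent their completion messages. The Rank(R)*MAX_FIB value is only the fallback if a completion message is lost.
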
First case study - GEANT research network
- 22 routers, 36 circuits (72 directed links), few prefixes, one asymmetric link.
--> chart showing Ordered FIB convergence time is similar to normal convergence time.

Alex: what is this showing?
Pierre: the 72 directed links.
Alex: what is on the x-axis?
Pierre: the id of the link that is going down (on x-axis)

Second Topology - ISP
- 208 routers, 391 links, 85 asymmetrical link metrics, large # of prefixes.
- similar chart to the GEANT topology.
- worst-case time is 861 ms
  - caused by 4 routers that have to do a full FIB update, each taking >100ms.

oFIB vs PLSN: replacement or next-step?
- PLSN (path locking via safe neighbors)
  - basic solution to provide loop-free convergence
  - does not provide 100% coverage
  - router updates its FIB after 0, 2 or 4 seconds depending on PLSN type of the rerouted prefixes.
- Ordered FIB updates
  - complete solution
  - 100% coverage
  - a router updates its FIB after (rank * max_fib), in one shot
  - worst-case rank is the longest (in hops) path in the network.
  - sub-second convergence times can be achieved with completion messages; the rank timer only applies if CM's are lost.

Conclusion:
- ordered FIB provides sub-second loop-free convergence in ISIS and OSPF
- would like this to be adopted as a wg doc.

** Open Mike **

Ross: have you thought about how this would work with 1000 routers in one area? How well does this behave at the extremes?
Pierre: If you have a huge topo, it depends on the shape of the topo. If there is a huge topo that is highly meshed, then things remain short. Doesn't depend on the number of routers, but rather the length of the path across the network.
Stewart: We've presented data in the past with max network diameter of various networks. The relevant metric is the network radius from the failure. In that earlier data it was a max of 8 steps. What Pierre has shown is that we can get this under a second. This removes the issue we had about high convergence times.
Only time this doesn't help is if you lose a CM. This lets you set the times to a better value, so you don't have to use worst-case completion times.
Alex: what is the complexity of the completion messages?
Pierre: when a link goes down the messages scale as the number of neighbors.
Alex: scales based on single or multiple failures? # of topo changes?
Stewart: each router can only receive one message from each neighbor. On the question of multiple concurrent failures, that is not something that we're solving here.
Pierre: if you have multiple failures then you have # of failures x # of neighbors.
Alex: not saying that we need to design for arbitrary failures, just saying we should examine each mechanism in this context, so we understand its behavior.
Stewart: If you have >2-3 failures you should abort. But anyway the message overhead is going to be minimal.
Alex: trying to see the properties
Stewart: n failures x k messages in a few 100 ms
Alex: In the past, there have been problems with message storms. Is this something that you see as an issue? For a router with a lot of neighbors, you'll have a burst of messages, what happens when some get lost?
Stewart: if a message gets lost, it falls back to its rank timer. Also, the router will have to handle BFD, etc, which are much higher load.
Alex: BFD might be processed on line cards, this will have to go to control plane.
Stefano: Don't see this as much of a scaling problem. In today's platforms you can have higher transmission rates of hello's. Should we make the CM's reliable? Can we afford to allow message loss? CM's are a plus, but not absolutely required.
Alex: I agree that the algorithm is robust enough to handle CM message loss. Rather, I'm trying to understand how often that will occur. How fast? How many in a particular interval?
Stefano: This ought to become a wg doc. Ordered-FIB convergence is reliable and offers 100% coverage.
Alex: this is unrelated to the acceptance of a wg doc.
Joel Halpern: ???
Alia: CM's help a lot, so having CM's is a key part of the proposal. Would like to see what the message processing, transmission and processing #'s look like.
Pierre: if you have other suggestions, please send me the data and I'll run them.
Stewart: we measured this on a real router.
Alia: wondering about things like 4ms for processing, that seems to not include stack passing times, etc.
Pierre: receives packet, passes things out, tried to add everything in.
Stewart: real router, with detailed measurements.
Alia: receiving on control plane.
Stewart: receiving on wire.
Alia: okay, that is what I was looking for, so there may be some other delays.
Alia: How about SRLG's? Ranks based on # of members for SRLG's.
Pierre: Worst-case is one message for each part of the SRLG
Alia: how do you describe which link in the SRLG you're looking at?
Pierre: some of that is in the draft with the packet format.
Russ White: in the future, please list the info about the measurements in the next draft to head off these questions.
Pierre: an implementation would be better, no?
Russ: An implementation test would be even better.
Ross Callon: as a recently-anointed routing area director, I have some concerns since we have had issues in the past. Would like to get IESG expert review early on this. Perhaps making it a wg doc would help. We don't have to answer all the questions in order to make this a wg doc. And it seems like the authors are headed in the right direction.
Alex: Where do the authors want to take this?
Stewart: When we know more about this and PLSN, we can make a decision between them.
Alex: I guess, but since we're close to finishing the Basic IPFRR draft, we'll have to decide if we want to progress it along with some sort of microloop prevention draft. We were looking at Base IPFRR progressing with PLSN. Do we want to reopen that?
Stewart: we should do both or substitute.
This would suggest we get better service for the destinations we cannot cover with IPFRR, with oFIB than with PLSN.
Alex: What is your proposal?
Stewart: we make this a wg doc and then let the wg make a decision.
Alex: If you agree that this should become a wg doc, raise hand: @20 yes. If not ready, raise hand: 1. Consensus in the room to make it a wg doc, will take to the mailing list.
Scott Bradner: the consensus in the room is "don't give a hoot"
Ross: There are always lots of people who haven't read the draft, and don't feel comfortable making a stand.
Alex: Okay, I'll discuss with the AD's and get back to the list.

Multiple Routing Configurations: IP Fault-Tolerance using MT Routing - Tarik Cicic

This is straight from academia, not really for standardization, unless there is a lot of interest. There is a patent application in conjunction with this proposal. There is no draft, but there is a link to the paper and presentation that was sent to the list.

MRC:
- Has guaranteed single-fault tolerance for both links and nodes
- Supports near-instantaneous local recovery
- Needs no information on whether a link or node has failed.

Related to Multi-Topology routing. We have a second topology with certain links associated with them. For example, an mt with infinite cost would represent a failure. Details in paper appearing in INFOCOM 2006.

Isolated link concept introduced:
- Isolated link: has infinite weight
- Isolated node: all adjacent links have weights higher than any other links in the topology. Can be routed to, but never used as part of an SPT.

Different MT's can be used to cover failure cases. Research shows that only 3-6 backup configurations are required to cover all failures.

Research is being done on how to extend this to look at load balancing, multi-fault tolerance (SRLG), multicast node-fault tolerance, incremental topo changes and loop-free convergence.

Joel: you mentioned this "you mark the packet". How do you mark the packet??
Tarik: I'm from university, this is less of a problem :)
Joel: How do you mark the packet? MPLS? Not-via header? Is this interesting, new and different? Something we've already looked at? Something we cannot do?
Tarik: good question: one way would be to use a parallel address space, for example using 1918 space, similar to not-via, but a bit easier. If you only need 6, you'd only need a few bits in the header for marking.
Rudiger: How many topology versions would you need if you were covering SRLG's? It is a practical problem.
Tarik: In general, it is tough from a scientific point of view, to answer that question. You may get SRLG-protection from some backup topos.
Mike Shand: Are you proposing doing this mt computation offline or in each router?
Tarik: To do that in the router, you would have to have a common topology and id information. Could be done. Computational complexity is not high, it can be solved in polynomial time. Multiple of nodes and links in worst case.
Pekka Savola: you had a slide about forwarding, have you looked to see if existing forwarding hardware can be adopted?
Tarik: would love to get input from industry to see how this is done.
Alex: Do you assume that the output of these calculations needs to be maintained in the line cards and available for forwarding?
Tarik: Not sure, imagine it would be mapped to the FIB somehow.

Next Steps for the WG:
- Base IPFRR
  - specs stable
  - pending implementations (have 2 of LFA: cisco, avici)
  - applicability statement?
- Possible next-steps
  - multicast IP FRR
  - Advanced IPFRR (towards 100% convergence)
  - controlled convergence.

Alex: Open for comments
* pause *
Alex: Okay, we'll take it to the list, or better...
Alex: ** Walking down the row **:
Jean-Louis: IPFRR for Mcast - important topic for tv-broadcast. Recovery needs to be good.
Joel: Why? The TV glitches once in a while, so what is the application's need? We have glitches today, no big deal.
Rudiger: doesn't matter what the application is, we need to deal with it. More important to get the first 80% solved on the multicast problem than the last 15% on the unicast problem.
Joel: understand the problem that the base IPFRR draft is solving. There is unicast traffic which needs FRR, what is the multicast application that is so sensitive to delay in convergence?
Jean-Louis: Multipoint video conferencing is one.
Alex: IPTV is the biggest application. We need to do this, but the question is when? We have some techniques today, but the scaling needs to be better.
Stewart: We can work on things in parallel. Different applications have different requirements. Pseudowires require the advanced IP reroute method, IPTV requires Multicast FRR.
Alia: should start thinking about multicast IPFRR. Not sure if the advanced methods are ready yet. Need to work on both.
Alex: How would you prioritize?
Alia: Multicast will be more new work. Advanced: a deeper analysis of what we have done.
Mohan: Type of packet you lost is relevant, if you lose an I-Frame in the MPEG encoding the impact would be higher than if you lose
Kireeti: Should we work on MPEG?
Loa: I think we should wrap up the Advanced techniques first.
Matt Meyer: we need to do both, they might be paired in the solutions space.
Andrew Lange: We should do multicast FRR first, we have nothing in this space. We can get more experience with Advanced FRR, and then revisit it. This experience will give us more perspective on the details of the advanced FRR methods.
JP Vasseur: Would like more IPFRR experience before jumping into multicast.
Ross: If we get into multicast FRR, is this engineering or research?
Arman ???: Trying to choose between problems of different scale. Multicast FRR seems like a lot more work. With advanced, there are proposals which could get a solution done sooner.

Some chatting about IPv6 vs IPv4, understanding that these techniques are ip-version agnostic.
Stewart: solving multicast might require tunnels, which pushes us to resolve advanced first.
Pekka Savola: Can you get coverage percentages easily?
Alex and Stewart: yes, either on the router or the management station.
Alex: From what I understood from Pekka: if you give me a tool so I know the coverage, then coverage % is less important?
Pekka: I want to know, with LFA, what can I do to give myself 100% coverage?
Alex: Don't have that tool today, but it is something that could be built.

** Open mike **

Pekka: GTSM?
Alex: after we got your message, I sent a poll out to get an answer. Suggest you and Dave take this together and bring it back to the wg.
Pekka: Willing to work on it, but want to know if anyone has implemented GTSM TCP-RST handling.
Alex: It would be good to have a draft by the next meeting.
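
On the LFA coverage question Pekka raised above, a management-station tool of the kind Alex mentions could start from something like the sketch below. It is an illustration only, assuming symmetric link costs, a connected topology, link protection only, and the basic loop-free condition dist(N, D) < dist(N, S) + dist(S, D) for a neighbor N of source S. The function names and the two example topologies are hypothetical and were not presented at the meeting.

```python
import heapq


def shortest_distances(graph, src):
    """Dijkstra from `src` over a dict-of-dicts graph with positive link costs."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def lfa_coverage(graph):
    """Fraction of (source, destination) pairs that have a loop-free alternate."""
    dist = {n: shortest_distances(graph, n) for n in graph}
    covered = total = 0
    for s in graph:
        for d in graph:
            if d == s:
                continue
            total += 1
            # Primary next hop: first neighbour (by name) on a shortest path to d.
            primary = min(n for n, w in graph[s].items()
                          if w + dist[n][d] == dist[s][d])
            # Loop-free condition: dist(N, D) < dist(N, S) + dist(S, D).
            if any(n != primary and dist[n][d] < dist[n][s] + dist[s][d]
                   for n in graph[s]):
                covered += 1
    return covered / total


# Hypothetical unit-cost topologies: a triangle and a four-node ring.
TRIANGLE = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "C": 1},
    "C": {"A": 1, "B": 1},
}
RING = {
    "A": {"B": 1, "D": 1},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1, "A": 1},
}

print("triangle coverage:", lfa_coverage(TRIANGLE))
print("ring coverage:", lfa_coverage(RING))
```

In these two examples the triangle is fully covered, while in the unit-cost ring only the pairs with equal-cost paths in both directions have an alternate. This kind of per-topology answer is what an operator would need when deciding which links to add or re-cost.
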