2.8.27 Triggers for Transport (trigtran) BOF

Current Meeting Report

Spencer, for Spencer and Carl

------------------------------------

Triggers for Transport BOF (trigtran)

Thursday, March 20 at 0900-1130
===============================

CHAIRS:	Carl Williams <carlw@mcsr-labs.org>
	Spencer Dawkins <spencer_dawkins@yahoo.com>

Thanks to our minutes-takers:
	Kevin Fall <kevin.fall@intel.com>
	Tom Hiller <tomhiller@lucent.com>

Revisionism - Carl and Spencer were experimenting with the terms 
"connectivity interrupted" and "connectivity restored" in the 
framework draft and in the BoF, because "link down" and "link up" didn't map 
cleanly onto multi-hop subnetwork technologies, but we're going back to 
"link down" and "link up" as "clear enough" - and these terms will be used in 
the BoF minutes.

BoF Coordinates:

General Discussion: trigtran@ietf.org
To Subscribe: trigtran-request@ietf.org
In Body: subscribe
Archives:

www.ietf.org/mail-archive/working-groups/trigtran/current/maillist.html

BOF agenda and description:

http://www.ietf.org/ietf/03mar/trigtran.txt

Final Agenda:

- Introductions, Scribes, Jabber Scribes, Agenda-Bashing

- Status Update and Strawperson Statement of Work

- Issues identified in Framework draft

- Questions leading to a WG charter

- Next Steps to a TRIGTRAN working group?

Status Update and Strawperson Statement of Work
-----------------------------------------------

There are two current Internet-Drafts produced by Carl and Spencer as 
straw-person proposals on a problem statement and a framework. These 
drafts are


	draft-dawkins-trigtran-framework-00.txt
	draft-dawkins-trigtran-probstmt-01.txt

This BoF is a "second BoF". The first TRIGTRAN BoF was held at IETF 55, and 
focused on the problem statement; this BoF focused on what we learned from 
doing a framework proposal and on next steps to chartering TRIGTRAN as a 
working group.

Spencer made the canonical TRIGTRAN disclaimers:

- TRIGTRAN isn't L2 triggers

- TRIGTRAN trigger delivery not guaranteed

- No notification ACKs; partial deployment scenarios

The framework draft considers 3 canonical triggers:

-	Link up
-	Link down 
-	Packet Discarded by subnetwork, not due to congestion

"Link Up" seems easier in the short term.  There is some previous 
thinking on it: see Phil Karn's posting on "Kicking TCP" on the PILC list, 
3/2000, available at 
http://pilc.lerc.nasa.gov/pilc/list/archive/0691.html

As for "Link Down", this notification "should" make sense. It does make 
sense intuitively. The problem is that the response might be 
application-specific - some TCP senders might wait longer to close a TCP 
connection, while others might give up sooner. Is there a canonical 
response? We need to understand the applications better before we can 
posit how to respond to this notification.

As for "Packet Discarded", we've realized that in a cellular 
environment this notification may result from a handoff - if so, the TCP 
sender would already have received a Link Up indication from the TCP 
receiver over its new connection path.  Conceptually, a TCP sender should 
slow-start when path bandwidth changes, but if the mobile has merely moved 
from a lightly loaded cell to another lightly loaded cell, TCP could just 
continue in its current state without going to slow start.  The new cell is 
as likely to be less congested as it is likely to be more congested, so 
ignoring the handoff, as TCPs do today, is still a reasonable 
strategy. Would telling the transport really help?

Spencer summarized a private conversation with Mark Allman as, "Gee, maybe 
TCP does pretty well often times on its own.  You may be able to find 
cases where you could do better with notifications, but by the time you 
make sure the response to a notification doesn't have undesirable side 
effects in other cases, TCP doesn't look so bad"

Co-chair summary: Link Up seems ready for prime-time; Link Down may be in the 
near future; Packet Discarded seems less ready.

Spencer (and Carl) proposed a strawperson Statement of Work:

- spin up working group and do 'link up'
	Specify response for dup pkts during RTO
	Safe, good consensus

- investigate 'link down' semantics
	What should transport do here?  

- investigate explicit mechanisms
	Can we send OOB signals in the Internet in 2005?

- build consensus on wish list for IAB/IRTF investigation?
	Horizontal handoff, non-congestion loss, corruption
	Send wish list to Leslie and Vern and ask for help (this is something
	mobileIP and others are doing)

-- Q and A --

Sally Floyd: terminology: why "connectivity restored" vs "link up"? One 
link up might not have anything to do w/e2e connectivity

Spencer: You're right — the previous 'link up' term is more 
appropriate

Bernard Aboba: need to be clear about the meaning of 
notifications. There's a distinction between link 'up' vs 'fully 
functional' - a link can be "up" but may not provide very useful 
connectivity. It can be quite tricky to decide just when you send the 
trigger.

Spencer: Would notifications about nominal BW changes be useful?

Bernard: I was just thinking re link up; but if I was told of a rate 
change that doesn't seem so obviously relevant

Kevin Fall: you may have cases in which the link is 'up' but error rate is 
rather too bad to be very useful.  This has been explored some time ago in 
SCPS.

Philippe Gentric: knowing a rate change could be quite helpful for 
streaming media

Spencer: Maybe a bandwidth change would be some form of maximum [a speed 
limit through a link], so the indicator is essentially saying you might as 
well not go faster than that to the transport.

Allison Mankin: maybe we are out of scope here; we don't really have links 
that tell us their max bandwidth at the moment

Sally: It's not obvious what scope should be; some things have been 
investigated but others haven't been.  At a minimum, maybe TRIGTRAN could 
write a problem statement. This group could define the trigger; sending the 
entire problem to a research group may not be enough, but a problem 
statement might well be appropriate.  

E.g., the DCCP partial checksum issue came up.  Should you allow corrupted 
data to be delivered to apps [or to lower layers]?  Or should it be marked 
somehow as corrupted?  Where would this sort of issue be addressed? 
Perhaps here, or perhaps somewhere in IRTF.  

Another: as we build transport protocols robust to re-ordering [and there is 
no reason they couldn't be], would you want a way for the transport to 
tell some lower layer that it is robust to re-ordering, so feel free to 
re-order?  So, I don't know.  It's not clear where speculative, 
fleshing-out activities should take place.

Spencer: We have been trying to maximize the amount of sanity we appear to 
have.  We were trying to stay grounded in minimal notifications, not to 
spend time thinking about what to usefully speculate about. Now that we've 
been through a strawperson framework, maybe it would make sense now to add 
some of these speculative things to the discussion topics.

Mark Handley: Various forms of these issues have arisen over the years.  We 
don't have good institutional memory.  If we investigate why *not* to do 
something, we should write this down somewhere, so we can remember why we 
*don't* do things.

Spencer: Yes, I found some things like this looking over older 
discussions.

Allison Mankin: We have a group of community members in the IETF who are 
largely not researchers; there are some interesting questions (e.g. how 
long to wait, what to do, is it really up or down, etc).  Just getting this 
right is useful – there is engineering to do with issues like "link 
flapping".  We aren't inventing a research group here.  I'm trying to not 
sound too 'top down' here.  We would not want to include the researchy 
stuff in the charter.  Probably having a wish list possibly for another 
group might be good, but we should not be doing work such as 
congestion handling here.

Bernard: The more I learn, the more I would like to have a document that 
describes the general characteristics of various links.  A link 
tutorial for the group would be useful.  Of the existing literature, what 
are the results and conclusions?  These are not static 
characteristics but dynamic ones.  I take this to mean a survey 
document.  This would be the first item of order to complete. It's not 
exactly the same stuff that PILC did.  Maybe different than PILC because 
here we are concerned with signaling, which is more dynamic.

Allison: You are suggesting a sort of survey doc?  That would be part of the 
charter[?] hmmm.

Spencer: TCP-SAT WG produced a doc that summarized the research topics.

Allison: yeah, maybe ok.  We would like to have a well and narrowly 
focused group.  This is my answer to Sally.  Now we have Bernard's 
suggestion.

Spencer: Maybe we could do this in parallel [both the research and 
engineering-ish documents]

Aaron Falk: [former chair of TCP-SAT] I'm not sure that the research doc was 
so useful.  I wouldn't be opposed to seeing it again, but wouldn't want to 
take WG resources away from other tasks.  There are all sorts of 
subtleties.  The charter must be narrow and well focused. I would rather go 
deep than broad.  I don't see that a TRIGTRAN research mechanisms 
document is a great investment of resources.

Gorry Fairhurst: there are many things we could signal, but we don't know 
what we would do.  I think doing a big survey would unearth lots of stuff we 
probably don't want to deal with.

Spencer: Doing the framework revealed a fairly rich set of things that come 
up.

Bernard: Given a medium, how reliable is an indication?  This sort of 
documentation could be quite useful.  So, a [control] system observing an 
indication has to have some understanding of what to believe.  This 
applies to rates also.  The question is how reliable is it, how much 
skepticism should be applied when receiving a trigger.

Spencer: Carl and I were assuming it would be helpful to know what 
different subnet technologies had the capability to send.  Bernard is 
perhaps stating that the mapping of link condition to trigger could be more 
formal than we had been thinking.  Hmmm.

Allison: Maybe an appendix could include examples that could be 
cautionary or provide examples.  Gathering those would be good. The 
working group could contribute these.

Vince Park: I agree w/Bernard.  It is difficult to decide what to take as an 
indication and how to interpret it.  Probably a good idea.  Has come up in 
mobileIP, MANET, and others.  Ultimately you have to decide how to map L2 
type information into a higher-layer indicator. May be bigger scope than 
TSV-only.

Ethan Blanton: for some triggers we aren't sure we need a 
*protocol*.  Not sure it is worth putting that much energy into a 
protocol until we are sure we need one.

Aaron Falk: We're not just talking about the format of "bits on the 
wire". We need to define the semantics of what notifications you get and how 
you respond.  None of that stuff has been hashed out.  I believe some of 
these things may be valuable, but maybe there are cases where there are 
problems.  There is work to do there.

Ethan: I didn't mean to indicate that we shouldn't do this work.  Just 
maybe a first-order item may be not to define the protocol, but whether or 
not a small subset of triggers are needed given certain links.

Spencer: I would like to agree.  The mechanism appropriate for link up is 
different from every other notification we've talked about.  If all we have 
to do is link up, we don't need the whole framework.  Is this 
responsive to your question?

Ethan: yeah, I think so.  If you carve out a space that doesn't need a 
protocol, you could still be concerned with what to do.

Rick: from Mobile IP, we have been playing w/L2 triggers.  Getting their 
semantics sorted out is important.  We may need them for all sorts of 
things.

Mark Allman: I'm not so concerned about 'link up' per se, but exactly what 
you should do with it should be taken with some caution (e.g. DCCP with 
this indication would do what exactly?)

Spencer: We've been encouraged to look at other transports (e.g. SCTP).  Now 
that DCCP seems to exist, we should be looking there too.

Sally: I agree the question is open as to implicit vs explicit signals and 
that we need definitions for explicit ones and descriptions for 
implicit ones.  Explicit signals might not be the right answer.

Mark Handley: Does the sender of the signal need to know about all future 
transports?  This is a complex space—interactions with lots of things 
[tunnels, security whatever].  Need to be careful.

Spencer: My experience is that transport area is cautious.

Randall Stewart: I also don't know what I would do on link-up in SCTP 
stack.  Other than (some) relationship to heartbeats, I'm not sure how 
useful this is.

Spencer: I had thought TCP people want link up and SCTP people want link 
down, based on conversations at IETF 55.

Randall: I feel somewhat uncomfortable with link down given the 
possible DoS issues.

--- Issues in the framework draft ---

Aaron Falk suggested that we focus this discussion on Link Up, because we 
haven't gotten consensus on another notification that's compelling

Spencer: link-up needs very little framework, and other stuff isn't baked 
yet. The framework document is not really a completed framework.  We are 
exploring an approach.

Aaron: Spencer, I'd like to suggest that we might have a consensus in the 
room that there is strong support for link-up and maybe some 
opposition to others.  If you accept that, the framework is somewhat 
superfluous in the context of just doing link-up.  Take a poll of the room to 
see.

--- Agenda and WG charter and scope issue ---

Spencer: I appreciate your suggestion re going forward on link-up.

Aaron: What might be an acceptable scope of work for this WG is the 
question.

Spencer: How about link-up?  Is there consensus that working on 
link-up would be just fine?

*** Room Consensus was that Link Up is just fine ***

Aaron: Does anyone think this would be a *bad* idea?

Spencer: Yes, and please tell us why?

Spencer: If we charter as a WG, I'm looking at whether this would be in 
scope.

Joe Zebarth: It appears to be premature if this will move beyond a BOF.

Spencer: What questions remain to be answered?

Joe: Are you proposing a new protocol or mod to existing one?  If I'm 
sending UDP say do I get one of these triggers?  Are we modifying TCP or 
doing a new protocol or what?  We need more info to agree or disagree for a 
WG formation.

William Ivancic: There are unanswered questions regarding the 
usefulness of link-up. This isn't IETF-ready yet.

Spencer: The criterion for having an item in the WG charter is not that we 
already know the answer, but that we are pretty sure we could come up with 
one.

Dick Knight: I'm still confused.  What is the clear problem 
statement?  Most of this link layer stuff will sort itself out.  I'm not 
sure why you are working this problem.  Is this going to work in an MPLS 
network?

Reiner Ludwig: To those people who said 'no', people aren't convinced we 
have a problem.  Various people don't know what to do with it.  I don't 
think people in this room are convinced there is a problem.

Aaron: I think part of the problem is that not all these people have been at 
the places where this has come up before and that the drafts don't fully 
capture these experiences.  E.g. the reference to Phil Karn's stuff in 
PILC.  The people here that are familiar with the PILC document are not the 
ones that have objected [except Reiner]. Responding to Reiner in 
particular, there are some questions regarding SCTP, DCCP, and TFRC.  One 
way to go is to take the suggestion from the PILC doc that you 
re-transmit packets at the routers [which doesn't require changes in end 
stations].  Part of the problem here is not everybody understands all the 
issues.  We may need to talk about this - there isn't a consistent sense as to 
what is on the table.  Two pieces of the problem: should the IETF be trying to 
solve this, and what is the form of the solution so people can see it is not 
dangerous.

Spencer: I do have a couple of slides here to present that might help.  
Phil Karn's proposal is best described in a posting on the PILC list from 
3/2000.

Mark Handley: Trying to summarize.  A general strategy used in many of our 
transport protocols is exponential backoff.  You would like to not have to 
wait for exponential backoff if you know the link has actually failed.

Spencer: So, like in TCP, this could be quite some time [to wait].
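
As a rough illustration of how long that wait can be, here is a minimal 
sketch of a doubling retransmission timer. The initial RTO of 1 second, the 
64-second cap, and the function names are assumptions chosen for 
illustration; they were not discussed in the BoF.

```python
def retransmit_times(initial_rto=1.0, max_rto=64.0, attempts=8):
    """Times (seconds after the original send) at which a sender with a
    doubling, capped RTO would retransmit. All defaults are illustrative."""
    times = []
    now = 0.0
    rto = initial_rto
    for _ in range(attempts):
        now += rto                     # wait for the current RTO to expire
        times.append(now)              # retransmission attempt
        rto = min(rto * 2, max_rto)    # exponential backoff, capped
    return times


def idle_after_link_up(link_up_at, times):
    """Seconds the sender stays silent after the link returns, if it only
    finds out at its next scheduled retransmission. None if no attempt is left."""
    for t in times:
        if t >= link_up_at:
            return t - link_up_at
    return None


if __name__ == "__main__":
    schedule = retransmit_times()
    print(schedule)
    print(idle_after_link_up(32.0, schedule))
```

Under these assumptions, a link that comes back 32 seconds into an outage 
leaves the sender idle until its next scheduled attempt at 63 seconds. That 
idle gap is what a Link Up notification would aim to shorten.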

Bernard: I think there is some problem getting the timer to the right 
point.  E.g., there can be rate changes between media [e.g. between 802.11 and 
GPRS].  The question is how long it takes to adjust.  Also, sometimes the 
adjustment can be 2-3 orders of magnitude.  The first thing you need is a 
well-defined problem; next is what does the literature say?  Does this 
literature solve the problem, or is something else needed?  If you go 
through this logically, you get to the end and figure out whether you have 
solved it [or not].  Then you have a logical problem statement and you 
figure out what is left (e.g. if the receiver changes rate 
dramatically, does the sender know, and if so, how?).

Spencer: First thing I'm going to show you now is generally about the 
framework; next is more specific to TCP {"kicking" TCP}.  Kicking TCP 
doesn't appear to be so difficult, but I'm afraid if we go beyond this we 
need much more organizational machinery.

Mark Handley: please don't talk only about TCP; the exponential backoff is 
ubiquitous for other transports too.

{Explanation of slides – lots of details here, see the slides, in 
particular the one titled "If we really 'kick TCP'". The reference to Phil 
Karn's "Kicking TCP" posting on the PILC list, March 2000, appears at the 
top of these minutes}

Sally Floyd: I would feel more comfortable with a problem statement 
before the framework.  Basically what Bernard has suggested.  I would feel 
much more grounded to start there.

Spencer: I agree completely.  We did some problem stmt work up front; it 
would be appropriate to continue from there.

Bernard: these indications have effects on things other than TCP.  For 
example, maybe I need to update my address lease.  So, that would 
certainly change things if I lost my address!

Allison: This is the third in a series of BOFs that seems to be trying to 
not generalize the overall triggers problem.  Is this part of the 
Internet area; seems that it isn't happening there either.  Nobody seems to 
want to own it.  I have a general question — I should ask "how pressing is 
this problem"?  How often do we get caught in this state?  Maybe this 
inconvenience isn't worth the effort [like building the WG].  Maybe we 
should take a poll as to whether you have this problem?

Spencer and Carl: Hand poll was maybe 30 people or so that felt they had 
this problem.

Reiner: I've been looking at TCP over wide-area wireless networks (cellular 
2.5G/3G, mostly GPRS and WCDMA; not SAT though) for a couple of years now, 
and I know of many real world measurements, not all of which have been made 
public. But, the problem of a TCP sender that is stuck with a 
backed-off RTO due to some transient link outage is rare in these 
environments. It does happen occasionally, though.

Bernard: it isn't a big problem on 802.11, and again you have to be 
really careful about how you treat the indications [what sort of filter you 
use]
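
One possible reading of "filter" here is a hold-down (debounce) on raw link 
state, so that a flapping link does not generate a stream of Link Up 
indications. The sketch below is only an assumption about what such a 
filter could look like; the class name, the 2-second hold time, and the 
sampling interface are hypothetical and were not proposed in the BoF.

```python
class LinkUpFilter:
    """Deliver a filtered Link Up only after the link has stayed up for
    hold_time seconds. Hypothetical sketch; hold_time is an arbitrary value."""

    def __init__(self, hold_time=2.0):
        self.hold_time = hold_time
        self.up_since = None      # time the current "up" period began
        self.reported = False     # whether this "up" period was already reported

    def observe(self, now, is_up):
        """Feed one raw link-state sample; return True when a filtered
        Link Up indication should be delivered."""
        if not is_up:
            # Any "down" sample restarts the hold-down period.
            self.up_since = None
            self.reported = False
            return False
        if self.up_since is None:
            self.up_since = now
        if not self.reported and now - self.up_since >= self.hold_time:
            self.reported = True
            return True
        return False


if __name__ == "__main__":
    f = LinkUpFilter(hold_time=2.0)
    samples = [(0.0, True), (0.5, False), (1.0, True),
               (2.0, True), (3.0, True), (4.0, True)]
    print([f.observe(t, up) for t, up in samples])
```

In this run the brief "up" at 0.0 is suppressed by the flap at 0.5, and a 
single indication is delivered at 3.0, once the link has been up 
continuously for the hold time. A longer hold time trades responsiveness 
for fewer spurious indications.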

Allison: so we might say don't do this for 802.11.  Now I think we take a 
charter proposal, unless there are objections?

Alex Audu: is there an option to do both link up and down? 

Spencer: Well, no, not yet

Alex: Are we going to ask this question?

Carl: Does anything other than link-up make sense for a charter?

Kevin: there is an observation that these indications can affect other 
layers of the stack.  If we charter narrowly, where will this be 
addressed?

Tom Hiller:  there is some advantage to link down in 3GPP

Spencer: We have been assuming these indications are pretty much 
un-authenticated.  So therefore what they can indicate is limited.  So we 
have been living under this constraint.

William: but can't I have similar issues with link up?

Spencer: I wasn't as concerned with getting a link up as a link down.  Link 
up doesn't seem as risky as link down - the network effect of link up is the 
same as an RTO timer popping.

Mark Allman: it's all in how you scope the response to an 
indication.  Link-up could be dangerous if you blast.  Not if you are 
conservative, I would think.
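
A minimal sketch of what a "conservative" response could look like, 
following Spencer's remark that the network effect of link up is the same as 
an RTO timer popping: the indication is handled exactly like an early timer 
expiry, so an unauthenticated or spurious Link Up can never cause more 
traffic than the timer itself would. The Sender class and its fields are 
hypothetical, and this models no real TCP stack.

```python
class Sender:
    """Toy sender state, for illustration only."""

    def __init__(self):
        self.unacked = []   # sequence numbers awaiting ACK, oldest first
        self.cwnd = 1       # congestion window, in segments
        self.sent = []      # log of (reason, seq) retransmissions

    def on_rto_expired(self):
        self._retransmit_oldest("rto")

    def on_link_up(self):
        # Conservative: do exactly what an RTO expiry would do, just sooner.
        # No burst of queued data ("blast") and no window increase.
        self._retransmit_oldest("link-up")

    def _retransmit_oldest(self, reason):
        if not self.unacked:
            return                      # nothing outstanding, nothing to do
        self.cwnd = 1                   # collapse the window, as after a timeout
        self.sent.append((reason, self.unacked[0]))


if __name__ == "__main__":
    s = Sender()
    s.unacked = [10, 11, 12]
    s.cwnd = 8
    s.on_link_up()
    print(s.sent, s.cwnd)
```

The design choice is that on_link_up adds no new behavior of its own. 
Whatever safety argument holds for the retransmission timer then also holds 
for the notification.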

Spencer: Comments on sending some of this to the IRTF?

Kevin: Who covers things covering various layers?  Is it possibly IRTF?

Aaron: Kevin, maybe this could be a result of just doing it.  The 
process can be adapted.  I'm an advocate of being closed and well 
defined.  I'm increasingly unhappy with how long the turn around time is on 
documents.

Spencer: Is it worth talking about "non-engineering" things here?

Sally: It seems like there are architectural issues here.  Having an IRTF 
home for this seems perfectly plausible to me.  Whether it is an IRTF 
group or just a bunch of people doesn't seem so important.

Spencer: What we are talking about is can we discuss things that might be 
outside the WG?

?? – I know we've talked about link up and link down.  What, in total, are we 
talking about?

Spencer: In my complete list, including researchy topics, I had 
described link up, link down, packet discard, possibly horizontal or 
vertical handoff, and maybe indication of corruption losses

Sally: I think a group of people writing about particulars could also 
write about more arch/speculative things too

Gorry: We touched on timescales; we've moved to link up/down, ok.  There is a 
semantic gap to what the link understands and what we are looking for.  But 
we need to understand the filtering better.  There is work to do that is 
technology dependent.

Allison: This WG is not going to spend lots of cycles trying to have 
architectural discussions.  It is going to spend its time working on the 
work.

Spencer: yes

Allison: It's going to do the work, right?  Don't spend too much time 
carefully placing the work [that we don't intend to do here] somewhere 
else. 

Spencer: Allison, can we go forward?

Allison: I'm going to wait for a charter proposal from you.  It should 
include milestones, etc.  I think we have gone as far as we can go so far.  
No further progress can be made without the proposal.  My sense of the room 
is that the work is compelling.  I don't need any more info. That was the 
last question I needed to answer to proceed.

Spencer: Are we going the right direction toward a charter?

Allison: You need to say this is a real-world problem.  Scope the work 
carefully.  Charters are contracts.  Want to make sure the folks here are 
committed to doing engineering work on the problem.  Mark Handley 
reminded me that in the PILC Advice to Internet Subnetwork Designers 
(LINK) Internet-Draft there is (almost) a BCP how routers might use link 
indications to hold packets when links go down.  Be careful about 
security considerations and scope.  There has been a lot of chatter in 
small groups regarding lightweight semi-secure mechanisms for 
protection w/out full crypto.  Then we show a WG charter proposal to 
IAB/IESG, chairs participate in this discussion, then to the full IETF.  
Then we have consensus of everybody, then we have a contract.  Then we hold 
to it very closely.  By then it has IAB review and IESG approval and is 
advertised to others.  This then might involve liaison with other 
standards bodies (IEEE in this case seems interested).

Spencer, after closing remarks:  Has anyone looked at the purpose-built 
keys (PBK) draft?


Slides

Agenda
Framework