Internet Engineering Steering Group Face-to-face Retreat

The IESG met in Half Moon Bay on May 1-2, 2007 for a face-to-face retreat.

All area directors participated in the retreat. Dave Thaler participated as substitute Internet Architecture Board (IAB) liaison, and also contributed a draft of his report to the IAB for use in preparing this report.

Bill Fenner participated in discussions about the IETF toolset, and Michelle Cotton (IANA) attended via teleconference to participate in IANA discussions. In addition, Ray Pelletier (IAD) and Barbara Fuller (Neustar Secretariat Services) attended for administrative topics, and Spencer Dawkins <spencer@mcsr-labs.org> attended as narrative scribe. A draft version of these minutes was reviewed with the IESG before they were published.

The high-level goals for the retreat were to bring new ADs into the IESG culture, provide opportunities for teambuilding, and make progress on agenda topics.

The agenda was as follows:

May 1
  morning session
         -- Tracker Orientation / Tracker State Goal Review (Russ)
         -- Ballot Positions and Consensus (Lisa)

  afternoon session
         -- Recusals and Conflicting Roles (Sam)
         -- Project List Review (Jon)
         -- NomCom Chair Recommendations (Russ)
         -- Metrics Documents (Lars & Magnus)

May 2
  morning session
         -- IESG Input to NSS Performance Review (Ray)
         -- Narrative Minutes: Meeting Community Needs? (Russ)
         -- IETF WG Liaison With Other SDOs (Magnus)
         -- IESG Input to Tools Management Committee (Cullen)
  afternoon session
         -- IANA Multicast Expert Assignment (Ron)
         -- IANA and RFC Editor Relationships (Russ)
         -- ROAP "Buckets" (Jari & Mark)

Tracker Orientation / Tracker State Goal Review (Russ)

Russ and the experienced ADs provided some training for new ADs on the toolset, and mentioned the wishlist mailing list for enhancements to the ID tracker and related tools.

Cullen noted that PROTO shepherd writeups vary widely in quality. Since the PROTO shepherd process is fairly new, this may be an EDU opportunity for training.

The IESG adopted document state change goals about two years ago that would give us a 60-day average, but the IETF is not meeting those goals (details at http://rtg.ietf.org/~fenner/iesg/perf/goals/). Document handling is averaging about 200 days, and this has been stable for several years. Some issues are AD, some are IESG, some are WG chair/author, but the combination gives us a 200-day average. Cullen reported his "best effort" value of 54 days, and doesn't think that any real document can be processed faster. One problem is that many drafts are sent back for revision from AD review (Sam says about 50 percent, and Jon says "higher").

The discussion also brought out differences in the way different ADs advance state (whether a document moves to "AD review" when the AD first sees the request to publish, or remains in "publication requested" until the AD actually starts to review the document, for example).

Sam noted that it's important for WGs to see significant changes to documents on the WG mailing list before documents are published.

Notes about IETF Toolset

As part of this training, we noticed the following bugs/features for the tracker:

Ballot Positions and Consensus (Lisa)

Lisa (and Darwin) led a discussion on how balloting really happens for documents.

IESG works to avoid "disconsolate DISCUSS" - "this document is wrong and there's no way you can fix it so that I won't ballot DISCUSS". ADs may, but are not required to, provide suggested text, but DISCUSSes need to be actionable - even if the AD doesn't provide text, the AD should provide guidance.

Not all the ADs read all of the documents all of the time, and ADs are not expert on all technologies. "No-Objection" says "this document had sufficient review", but some ADs ballot NO-OBJ based on who else has reviewed the document. The concern here is that a sponsoring AD votes YES, one or two DISCUSSers move to ABSTAIN, and everyone else ballots NO-OBJ - so the document is published, but the ADs would have objected if the document had not already been in DISCUSS. There are other ways to ballot extremely problematic documents - instead of moving to ABSTAIN, either remain at DISCUSS and request an override vote, or simply appeal the publication of the document.

Some ADs want to vote "YES" more, but ADs believe that balloting "YES" includes checking ID Nits and references, and this can be tedious. The ADs discussed moving to a worldview that "YES" only includes this work if you're the sponsoring AD.

An increasing number of ballots are being approved without positions from all ADs. This is not the same as "NO-OBJ".

The ADs don't have consensus on whether to review every document, don't have consensus on "empty votes" (no position entered), but do have agreement that ADs are responsible for ensuring cross-area review and do have some consensus on NO-OBJ. Jari mentioned the need to be better aware of WHY other people balloted their positions, and take this into account for drafts in unfamiliar territory.

Second session on Ballot Positions and Consensus

Most of the second session focused on whether IESG wishes to use full consensus for considering appeals. Sam does not believe that we have a good procedure for dealing with objections to anything EXCEPT document ballots, and that we've run roughshod over objectors when considering appeals - this is the only use of the override balloting procedure, for a PR-action.

While there's a point of diminishing returns, there's also concern that ADs aren't just resolving an issue, they're building a team, and having ADs in the rough on rough consensus doesn't help build the team.

The real problem is that all ADs sign appeal responses. If it was possible to enter a dissenting opinion, that might prevent deadlocks.

More than one AD expressed considerable frustration at the amount of time consumed in considering appeals in the past year or so, although things have gotten quieter now.

Recusals and Conflicting Roles (Sam)

The IESG spent some time on recusals and conflicting roles. There aren't any written guidelines on when to recuse and when not to, and there's a real tension between wanting to avoid even the appearance of conflict of interest and realizing that ADs from large companies are probably affecting at least one sponsoring company's product with MOST of their documents. A simple "does this affect your sponsor company or organization?" test is too simple, especially when a large company may have business units with conflicting interests on a specific technology. Recusing as an author is straightforward - an AD would recuse.

There are legal requirements here, so although IETF people participate as individuals, and not as representatives of a company, it's necessary to think about both reality and perceptions. If you would ballot differently with a different employer, you have a conflict of interest. If you would be embarrassed explaining an action at an IETF plenary open mike session, you may have a conflict of interest (this was referred to as "the red face test"). The RTG ADs mentioned specifically that they are bending over backwards to avoid the appearance of cronyism in WG chair selection.

Sam feels that technical issues are actually more straightforward than other issues, especially appeals. If an AD's action is being appealed, that's straightforward, but what about another AD who has participated in the discussion being appealed? Does that AD recuse from appeals? Recuse from discussion of appeal? Recuse from confirmation of the appeal?

Cullen has worked on conflict of interest policies, and they are amazingly difficult. If engineers work on this, they come up with algorithms. Algorithms don't work for ethical decisions, but a more general "code of conduct" might be useful. The IESG doesn't have time to work on this topic, but probably should find someone who does have time. One possibility discussed was asking the IAB to author a "code of conduct" document. It was noted that "doing what normal practice is" is helpful legally, since you're following the community's definition of what's appropriate.

Sam liked the suggestion that the ADs place management items on telechat agendas for decisions with possible conflicts of interest and ask for guidance from the rest of the IESG.

Project List Review (Jon)

This was short - the room consensus was that the project list has outlived its usefulness, and will be removed from future IESG telechat agendas.

NomCom Chair Recommendations (Russ)

This discussion was based on Andrew Lange's report from the current NomCom.

NomCom timelines in RFC 3777 are unrealistic - NomComs do not have enough time to do their jobs, and the timeline causes additional problems with end-of-year holidays.

Several NomCom chairs have suggested starting earlier - in this case, two months earlier, so that ISOC would select the NomCom chair and start the volunteer and selection process during the second IETF, with a call for nominations as soon as NomCom is selected. This would also allow the NomCom to speak with candidates at the third IETF.

Cullen (current IESG liaison to NomCom) also provided some feedback from his experience this year. Cullen noted that NomComs don't have good "boundaries" for liaison "participation", and there are no guidelines to ensure that liaisons from various organizations do not represent the same employer (actual voting membership is limited to two per sponsoring organization in RFC 3777).

Russ said he wished NomCom had selected the IETF chair FIRST ("the IETF chair should know he's the chair before he gets on the plane to the IETF"), but there were a lot of objections to this idea. The basic problem is that confirming bodies are trying to achieve a "global optimum with balance" for an entire slate, picking a team that will work well together, and each person confirmed in isolation reduces the flexibility the confirming body wants to keep. IESG's understanding is that IAB thinks RFC 3777 requires this.

The "short list" was discussed. RFC 3777 requires appropriate confidentiality for the list of candidates and for people providing feedback. It is good to seek input from people who are active in the IETF, but this includes a LOT of people. The short lists were sent to a total of about 1500 people, and even when non-candidate "ringers" are included, this leaks like a sieve. In theory, the candidate list could be open, given that it's already given to 1500 people, but sending it to 1500 people was a process violation of RFC 3777, and should have been appealed.

Cullen noted that people need to know they have funding to serve, and this takes time - NomComs need to pick a date when candidates confirm they will serve. The previous NomCom still had "maybe" candidates when it was time to send out the short list.

More than one AD reported that NomCom interviewers may inadvertently confirm previous feedback ("yeah, we've heard that from other people, too").

Incumbent statements: There was a discussion on whether incumbents should be asked/encouraged to declare whether they would re-up. Cullen said the NomCom members thought they'd made a mistake by asking ADs to confirm willingness to serve publicly this time - the concern is that saying "I'm willing to serve" gives the incumbent an unfair advantage.

It was suggested that making a public statement that a sitting AD would serve again is unlikely to have much effect other than discouraging other interested folks, which reduces the ability to get a sufficient pool of candidates, especially when there may be transfers between positions.  One incumbent suggested replying in response to anyone asking if they would re-up: “I haven’t decided yet, so go ahead and volunteer”, and people seemed to think that was a fine policy.

Russ noted that John Klensin had a draft on NomCom process changes, and Sam Hartman said there were many fine ideas in that draft (http://ietfreport.isoc.org/idref/draft-klensin-nomcom-term/, now expired).

After some discussion, Russ is leaning towards two approaches to NomCom process change - an individual draft that does nothing except alter the timeline so that work begins two months earlier, and a highly focused effort to fix one problem at a time after the timeline change is approved.

(Thanks to Dave Thaler for text included in this summary of the incumbent statements discussion.)

Metrics Documents (Lars & Magnus)

We have two WGs that are supposed to be using an IESG document for metrics guidance, but that document doesn't exist. Bradner/Mankin/Paxson have written a draft, but it was never published as an RFC. ADs are seeing BOF proposals for media quality metrics, so the work in this area isn't going away. It may be appropriate to publish this document as an ION.

There was some discussion about how to advance metrics specifications on the standards track, because (1) the implementations don't actually interoperate, and (2) metrics specifications and MIBs for a protocol may be much more stable than the protocol itself, but there's no agreement on whether it's OK to have a SIP metrics specification at draft standard when SIP is at proposed standard, and (3) not many specifications advance beyond proposed standard anyway, so it's not clear that we even need to know how to advance metrics specifications - implementers just want them to be "stable enough to implement".

Ron and David Ward expressed concern that a protocol model has to be complete - including a router model itself, a network model, and CLI load - because people use these metrics to evaluate products. David pointed out that we have 20 years of routing system dynamics papers, and no two are comparable.

GROW is being rechartered to look at control plane dynamics, but GROW always publishes Informational RFCs.

Control plane metrics are more complex - "time to converge" is not a box-by-box concept, and the current state of the art is to inject bogus routes into the Internet and watch them "ripple" using looking glasses. Results are irreproducible.

There are many ways to structure the work - as a WG, as a directorate, as a review team ...

Dan will take the lead on this topic (which covers four areas), and will discuss offline with other ADs.

IESG Input to NSS Performance Review (Ray)

This discussion was not minuted. Barbara and Jon (both from Neustar) went to the donut shop while this discussion took place.

Narrative Minutes: Meeting Community Needs? (Russ)

The IESG began producing narrative telechat minutes in the fall of 2006. It's time to checkpoint and see whether this is worth the significant effort it takes to produce narrative minutes.

ADs have not heard many questions about something that was documented in narrative minutes, but Cullen noted that usually when they do hear something, it's complaining about something that's not working, and also pointed out that accusations of "black helicopters" a few years ago have pretty much disappeared.

Some ADs use the narrative minutes to "catch up" if they missed a telechat, or to track action items, but others don't, and don't always check their own comments for correctness, either.
 
Barbara has the action to report the number of hits against the narrative minutes web pages, and then put something on a future survey. Russ wondered if we are advertising narrative minutes well.

IETF WG Liaison With Other SDOs (Magnus)

Distribution of liaisons with other SDOs varies widely between working groups and between areas. If a WG gets lots of liaisons, they probably know what to do with them, and working groups that never get liaisons don't have to worry about them, but WGs that get them occasionally can spend a lot of time for very little result trying to figure out what to do.

New documents describing liaison handling have been published recently, and this would be good to mention in WG chair training, the WG chairs wiki, and perhaps on the WG chairs mailing list.

Spencer Dawkins had the action to point this concern out to the EDU team (and did so in real time).

Russ pointed out that WGs need to ask themselves if the relationship is important enough to justify spending the time, and Magnus pointed out that this is why the IAB is restrictive on new relationships, because establishing them forces the IETF to respond to liaison statements (Dan also confirmed this).

IESG Input to Tools Management Committee (Cullen)

Bill returned to IESG for this portion of the retreat.

There is a list of proposed tools, but Cullen is looking for input. It's not easy to know what tools have already been requested - Bill says there are requests in RTs, a spreadsheet, and an appendix!
 
The working group management tool handles chartering, milestone updates, etc., and would handle the part of the process that doesn't require AD approval.

Bill mentioned that he has some algorithms for "good enough" workgroup slot scheduling. Of course, working group scheduling needs to work really well, without breaking the tools.

Finding all the e-mails sent as Last Call responses would be really nice. Bill has some tools to do this, and Henryk's tools would be a good basis for this. Sam suggested a look at the Debian bug tracking system, because it is incredibly good with e-mail.

The IESG also discussed Bill's personal tools (used for tasks like building the agenda for informal telechats), and noted that ADs can't put a draft that's not in the tracker on an agenda, and can't put general topics on the agenda. Other requests were:
  1. Magnus - could you put some separators in?
  2. Russ - could you generate the dates as well?
  3. Chris - a tool for management items?
  4. Sam - a tool for something that's not a document, but it even has a ballot
  5. Russ - assigning experts? does fall in the same class
One observation - something that's on an agenda may be discussed, but may not be balloted. Sam thinks that if we had ballots for WG rechartering, they'd probably get a lot more review.

David Ward is concerned about losing e-mails on WG last calls - people lose responses even on mailing lists with threaded archives. This could be handled similarly to IETF Last Call tracking.
 
Bill pointed out that there's also a wishlist request for a tool to send all the comments/discusses in one e-mail to people (would also help with followups).

Bill noted that when the ID Tracker appeared, DISCUSS feedback was always anonymous, but this has changed and names on DISCUSS/COMMENTs are in the tracker, so no need to anonymize these in e-mail.

We noted that the RFC Editor is talking about pointing to the IETF toolsite.

Dan made a request to open a comment when you enter an ABSTAIN ballot (like DISCUSS). If there are so many ABSTAINs that a document blocks, we should be giving the authors reasons why.

The IESG noted that Henryk is spending 50 percent of his company time on tools - as much as some ADs spend on being ADs - plus additional personal time...

There are two levels of prioritizing - across the IETF, and across the IESG. The tools team does the first, but not the second.

There were also discussions about the logistics for a "coding party", either at IETF 69 or IETF 70. Cullen says that Python design tools will make this easier, but the database back end is a separate topic. The tools team needs a couple of key people who know the architecture to make progress. Cullen has the action to talk to the tools team, etc. for next steps.

IANA Expert Reviewer Assignment

The IESG reviewed 14 IANA requests for expert reviewers, and named reviewers for almost all the requests.

The following open issues were discussed.
 
- IANA procedure for multiple experts for redundancy: is it OK to have two reviewers (for a base protocol and for an extension of that protocol)? Yes, as long as the reviewers work well together, and Michelle will escalate if they don't.
 
- EGLOP and multicast addresses - well-known addresses and application-specific address space, and requesters have to explain why they can't use admin-scoped, GLOP, etc. Not many requests, but there are some, and they are very painful (big per-request argument). We have good visibility for requests that happen, but not requests that are denied. Russ is interested in our MoU with IANA (that we don't touch address assignments) - thinks we can recommend an expert, but IANA would actually name the expert - but that's not what the advisory text is ("appointed by responsible AD").

IANA and RFC Editor Relationships (Russ)

The RFC Editor contract is expected to be fully executed in the next week.

IANA has been working toward a new Service Level Agreement. IANA performance has greatly improved since the SLA was put in place.

ROAP "Buckets" (Jari & Mark)

This discussion focused on what work will take place, and where it will take place.

Dave Ward on router discussion

Dave gave his perspective on what the problem isn’t (routing table scalability), and what the problem is (power) and why, along with actual statistics from “anonymous” router vendors. To summarize: for the next 10 years, we have no problems with routing scalability. This isn't the position from Amsterdam, but Dave wasn't in Amsterdam. (Most of Dave's slides were treated as background and not minuted.)
 
People should be sending their views to the Routing and Addressing Directorate, and make sure it makes it into the problem statement.

Mark pointed out that the perception is that FIB size equals cost - Dave's slides debunk this, but the perception is still there with people who buy routers.

Dave Ward on Routing Protocols/BGP

This was the "BGP is not that bad" presentation, on multiprotocol BGP.
 
VPNs would be a problem for routing scalability, but not all VPNs are present on all routers.

From Dave's perspective, even routing experts are confused about BGP dynamics, because BGP runs over TCP (intuition about TCP itself is usually wrong). Dave is seeing amazingly small BGP4 update packets - 2 prefixes in an update, max is 4KB - and that's fine, because the ratio of updates to packets goes up when you're congested.

Dave will also be giving a longer version of this talk at RIPE next week.

"The Problem" is moving from the size of the FIB to convergence. From Dave's perspective, the IETF is using energy we have to do what we've needed to do for 8 years (which may not have anything to do with FIBs).

Lars asked about moving away from TCP for BGP transport - this is too big a change to be considered "incremental", and the protocol is too widely deployed. Some people have asked about using multicast groups for distribution (but this gets very complex).

Ron Bonica on OPS response to ROAP

The OPS community can't contribute to the ROAP work because the problem keeps changing. Multihoming? Rehoming? Rehoming without resetting TCP connections? Size of FIB (something like /24 limits)?

Dave Ward's suggestion is that the operator community say what they need from routing protocols to analyze the dynamics of the routing system - so we can figure out what the problem is.
 
The GROW working group does talk about routing trends, who the bad actors are, where is the churn... we're rechartering GROW, and the chairs are working on charter text proposals.
 
Dave Ward says that academic researchers have asked for a formal statement of what to research, so they can request funding. We'd get more help if we did this - perhaps this list could be part of the GROW recharter discussion. Lars confirmed the TSV experience that researchers are very interested in this type of guidance - it helps them, too. Perhaps this could be mentioned at NANOG - using a public list with seeding of topics.

Russ asked what the OPS community is doing to produce BCPs to encourage bad actors? Not a thing: operators understand techniques and pollution, and don't care. Telecom Indonesia announces the same route 1.5M times. Writing an RFC won't change this.
 
Dave Ward suggested that default damping changes would be the thing operators can (and would) help us with. That's a BCP GROW could produce. List of things we're working on needs to be lengthened, shortened, prioritized. He wants priorities from operators on problems ROAP is solving (not security, not new service homing, not QoS... - why is the list so short?). ROAP is ignoring RPSEC and SIDR.
 
Dave also pointed out that we've spent a lot of work on security, service separation, etc. that's being ignored here. While the Amsterdam input was "don't reduce security", Dave isn't sure that's possible ... :-( but none of this stuff is in current proposed charters.
 
Ross pointed out that operators pay closest attention to exposures that already took their networks down...

Jari on ID/loc split

RRG is planning to meet all day Friday at Chicago IETF (some muttering about overlap with Friday morning slots, some muttering about HIPRG overlap). Mike O'Dell is updating 8+8 and coming to the meeting ...
 
UCLA is also doing simulation work. We need additional work on transport/apps, mobility, lookup delays for applications, scope of the architectural change.

Dave pointed out that we have to be careful about saying the routing system can't converge in milliseconds, since even interdomain telephone systems do this today.
 
It's really unclear where these discussions should take place. RAM list is chartered for them, but not much discussion is happening there. Not in charter for RRG, not for RAD, how do we get discussion?
 
Lars is concerned that he's not seeing transport considerations in people's minds today - it will be really bad to find out we've made end user experience worse. If routers need to change anyway, we can fix transport for mobility, for example.

Cullen is concerned because we've been blocking topology-aware proposals in SIP. P2P in overlay won't work if you suddenly send media via Finland.
 
Russ is concerned that ID/LOC path control is like telling apps to be secure when they can't tell if IPsec is running.

Dave Thaler on IAB efforts

The IAB has heard requests from the community for guidance, is still discussing the problem, and will have ROAP on the IAB retreat agenda next month - that's what can be said about the IAB as a whole.
 
A lot of individual stuff is going on, but as individuals. Some people think a statement is important, some think architectural principles are worth publishing (in an update to RFC 1958). Somewhere between 1/3 and 1/2 of the IAB is proposing contents - impact of one network on another, core routing benefits best realized by core routing protocols (align costs with benefits, make sure deployers benefit), core routers have different information about routing variables affecting scalability, power - complexity product, Rekhter's Law (hasn't been written down recently, with Dave adding "aggregatable"), some notion that native is preferable to encapsulation, but encapsulation is preferable to translation (less overhead is better - Dave Oran working on formulation here), no circular dependencies, no deterministic packet loss, deterministic reordering is currently bad (RFC 2991).

Lars is concerned about deterministic non-congestion loss, especially at the beginning of communication (Mark has dealt with this on Cisco routers, it's really bad); MSDP has this problem. The effects of loss are not only quantitative but also qualitative.