ECRIT Interim Meeting
====================================

February 1-2, 2006
Washington, DC

Chairs:
Hannes Tschofenig / Marc Linsner

Participants:
Ted Hardie
Brian Rosen
Andrew Newton
Roger Marshall
Marc Linsner
Hannes Tschofenig
James Polk
Jon Peterson
Spencer Dawkins
Kamran Aquil
Dan Mongrain
Tom Taylor *

On Bridge:
Barbara Stark
Nadine Abbott
Tom Taylor *
Randy Gellens
Patty McCalmont
Ron Watro

* Tom joined the meeting later.

Note Taker: Andrew Newton (also Jabber scribe), Spencer Dawkins

Slides can be found at:
http://www.ietf-ecrit.org/Interim2006/

Jabber Logs can be found at:
http://www.ietf-ecrit.org/Interim2006/Jabber-Log_2006-02-01.html
http://www.ietf-ecrit.org/Interim2006/Jabber-Log_2006-02-02.html
http://www.xmpp.org/ietf-logs/ecrit@ietf.xmpp.org/2006-02-01.html
http://www.xmpp.org/ietf-logs/ecrit@ietf.xmpp.org/2006-02-02.html

Agenda bashing
--------------------------------

SIP Location Conveyance draft has moved from SIPPING to SIP. We could ask that it moves here, but we aren't chartered for this work now (and SIP is). SIP is really busy... We'll just discuss it for now.

Milestones need updating - Brian wants to discuss these.

Do Henning's architecture draft before the proposals? We have a fuzzy notion of the model that we are trying to use - Henning has slides on this. We need a high-level understanding - we understand the view from the weeds pretty well. Henning thinks choice of identifiers is the critical subject.

Requirements for ECRIT
-------------------------------------------------

http://www.ietf.org/internet-drafts/draft-ietf-ecrit-requirements-02.txt

Issue tracker: http://www.ietf-ecrit.org:8080/ecrit-req/

Issue 15

Validation of civic location has been hanging around for a while. Should it stay in the 02 draft? Andrew thought the issue was how we would do validation, not whether.
Brian wants to leave the language in, but would like to know if people think it's adequate (we're using motivations text to describe what validation means). "A successful mapping will work". We actually do have other requirements that talk about this (doing queries at any time), but we don't describe it anywhere else.

Are we actually validating? We don't mean NENA validation, only asking whether it's enough to get back a PSAP mapping. Passing back "here's the PSAP" would be sufficient. Can we use another word? "Validation" has lots of baggage. Brian wants to make sure that we can get to NENA validation - "a location that a responder knows how to get to". "A location that can be successfully mapped to one or more PSAPs". We're trying to encompass both civic location and geolocation in one sentence - this is one reason why it's cumbersome.

If all of the city of Washington goes to one PSAP but the location isn't valid, we still have a problem. MSAG values have the same problem, because they carry street number ranges and a "valid" street number may not exactly exist - and the best MSAG validators today can't guarantee that the address exists. An address can be "valid enough" to get to the right PSAP, even if it's not valid to respond to. Texas has one PSAP entry - so anything would work as long as it's in Texas? No, the current MSAG works on address ranges - but the street exists. Brian would like for this to work on individual addresses, not ranges.

Brian thinks that local PSAPs should be determining what happens for various values of "invalid", since calls are going to go somewhere! Ted thinks that there's a lot of variation in what administrations do now and will do in the future. Will the protocol allow us to tell that we got to the right PSAP, or just got a PSAP? Henning - just returning a SIP URI to let someone ask "where the heck are you?" isn't sufficient when we're actually routing an emergency call.
We always get a SIP URI, we're trying to figure out what getting a SIP URI actually means for an operational system. "I got a match on all the XML tags except this one" would be an acceptable response, and this would be desirable. This could happen at validation time or at emergency call routing time. Validation may mean "I like the only tags that I care about, and didn't look at the ones I don't care about". Do we agree that we need a result code, in addition to a URI? Partial validation may be the common case (public verifiers won't know suite numbers/room numbers, for example). Is this MUST/SHOULD/MAY? The protocol MUST have the mechanism (whether it gets used or not is up to the administration). And we need to be consistent in our RFC 2119 usage (which is always problematic for requirements documents anyway, since the meaning is a protocol-mechanism meaning, not a protocol-usage meaning). We're making a lot of these requirements MUSTs, and that may make the resulting protocols clumsy. If we use 2119, we should be using MUST, MAY and SHOULD, not just MUST. Tom pointed out that some mechanisms may be provided by extensions, but are not required in the base protocol. Henning says we're not writing a BCP, so we should consider NOT using 2119 language. Randy suggested that we actually spell out "must support", etc. and not rely on 2119. Henning - then we should have consistent terminology, whether it's upper case or lower case. Randy will audit the existing document, but needs input from others. What about returning an indicator of the resolution of the mapping that is returned? Brian suggested text for a mechanism to indicate that a location or part of location does not exist, even if it can be mapped. Should this point out the invalid pieces? Aren't these administration-specific? Tags are roughly self-describing, but we can't pick one hierarchy - what gets returned needs to be free-form. 
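The idea of returning a result code alongside the URI, with free-form indication of which location elements did not match, can be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed wire format: the names `MatchStatus`, `MappingResult`, and `map_location`, the element names, and the `"ok"`/`"partial"` codes are all hypothetical and do not come from any draft discussed in the meeting.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Set


class MatchStatus(Enum):
    MATCHED = "matched"      # element was checked and found
    UNMATCHED = "unmatched"  # element was checked and not found
    UNCHECKED = "unchecked"  # element was not looked at by this server


@dataclass
class MappingResult:
    psap_uri: str  # a URI is always returned, even for partial matches
    # Element names are free-form strings (hypothetical), not a fixed hierarchy.
    elements: Dict[str, MatchStatus] = field(default_factory=dict)

    @property
    def result_code(self) -> str:
        # "partial" if any checked element failed to match, otherwise "ok".
        if MatchStatus.UNMATCHED in self.elements.values():
            return "partial"
        return "ok"

    def unmatched(self) -> List[str]:
        return [n for n, s in self.elements.items() if s is MatchStatus.UNMATCHED]


def map_location(location: Dict[str, str],
                 known: Dict[str, Set[str]],
                 psap_uri: str) -> MappingResult:
    """Classify each supplied element against the values this server knows."""
    elements = {}
    for name, value in location.items():
        if name not in known:
            elements[name] = MatchStatus.UNCHECKED  # e.g. suite or room numbers
        elif value in known[name]:
            elements[name] = MatchStatus.MATCHED
        else:
            elements[name] = MatchStatus.UNMATCHED
    return MappingResult(psap_uri, elements)
```

In this sketch a public verifier that does not know suite numbers would report the suite as unchecked rather than invalid, which is one way to represent the "partial validation may be the common case" observation.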
James - doesn't the text say "location should be sent even if it's considered unreliable?" Is this OK?

We will leave L01 language as is, change "location validation" in terminology, and we are adding two new requirements to this section.

Issue 17

Does the ECRIT protocol need to provide a way to fix data that's wrong in the database? No, but is it helpful to provide a URL for a mechanism that would help in fixing the data? NENA did this - it's actually helpful. We can't tell whether the user is wrong or the database is wrong. The URL may even be a web page that says, "fax your correction to our office".

Validation will happen at phone boot time, at the latest. The access network will be the one that actually notices there's an error - what does the access network OSS do then? The actual way something gets fixed is very usage-specific. Is this requirement in the location section and in the mapping section?

Ted - IESG would probably want to know what kinds of URIs get returned (in the protocol specifications, not in the requirements document). Brian thinks limiting to HTTP would be fine, because a human is going to have to resolve the problem anyway. Brian also thinks the protocol needs to control the information in the database. Andrew thinks this is OK but out of scope for the working group - this isn't the provisioning protocol that we're working on now. Brian thinks we'll get stuck doing the provisioning protocol, too.

We will add a new requirement with Brian's wording.

Issue 24

"Ideally no identity information is provided to the mapping function." Jon has some concerns about this issue, because SIP is actually working on Identity now. Brian - well, assume that we would like for locations to be signed, so they are harder to forge. This isn't about the identity of the presentity, it's about the identity of the location. Jon - if someone queries presence information, they need to know what they are asking for, and they may be asking for a specific person.
Brian - we're getting wrapped around the axle here - do we need to strip identity before we submit a location for mapping? This would mean that we couldn't send the signed PIDF-LO (actually, we just couldn't verify the signature). Marc - why does the mapping server care about the source of the data? Ted - is this issue resolved if we have an unlinked pseudonym, or even if we don't require a signed PIDF-LO? "Must not require an identity to perform mapping"? Brian thinks this is OK. Jon - we're doing requirements for the protocol, so need some rewording here. Will be MUST instead of MAY. Henning - if we require PIDF-LO, we've already ruled out two or three protocols. Jon - this came from location-by-reference - this isn't the only way to do location mapping. Henning - If we rule location-by-reference out, that has some impacts on protocol selections. Having the mapping server do anything except mapping is completely out of scope. Marc - Can we require location-by-value? Andrew - also say "mapping function doesn't dereference" - that would help, because the mapping function may be part of the trust relationship. Ted - is the protocol required to cut out the ? It has to cut out the PIDF-LO anyway, so this is just discarding the integrity protection that we don't need anyway. Henning - does anyone think that sending signed PIDF-LO is valuable? If it's not, we shouldn't penalize protocols that can't do this. Jon - if we're doing this by value, people will just be cutting out the PIDF-LO. Henning - can we make this decision? If mapping entities can dereference PIDF-LOs, I need to redesign my protocol! "The mapping protocol is not required to support the ability to dereference locations"? "Should not"? Marc - point is that we don't want to say that any mapping function can dereference location (since I don't want to tell people wherever I go, just because I might call 911 someday). Jon - there's not just one rule maker - might be enterprises, telecom service providers, etc.
Ted - you can always pass your location to the mapping function - this is when someone else passes the location to the mapping function. Henning - "MUST support location-by-value, MAY support location-by-reference". Jon - why is a proxy different from a mapping function? Marc - presumably there is a contract relationship with the proxy. Nadine - there may be legal obligations for mapping service to dereference. Andrew - if you can't assume a relationship with the proxy, relationships with a national carrier are even less likely to happen. Nadine - isn't location-by-reference a wireless concept? Fewer bytes, don't have to keep it up to date before someone tries to use it... Henning - either we have to figure out how location-by-reference works for all the protocols, or we go with the "don't need to support location-by-reference" language we discussed. Ted - but saying "we don't do location-by-reference" excludes some things that James cares about. "Is not one of the evaluation criteria"?

Two new requirements are added to resolve this issue.

Issue 26

Must be able to discover the local emergency dial string - the thing that's missing from the issue is "home versus visited". Could have different mechanism for home and for visited (provision home dial string, for example). ID6 says "configured automatically without user intervention". Is this the same requirement? It's about dial strings, not about identifiers.

Australia has some really strange types of emergencies - how to support them? By IANA, like you'd expect. This provides extensibility, but the problem is that a new area may have new types that the device has never seen before. "Unofficially imported" - the VSP isn't aware that the device has moved. A specific jurisdiction can say "this service isn't supported", or provide some other service that seems appropriate.

Brian - we need to split dial strings from service identifiers - the room agrees.
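The split between dial strings and service identifiers can be illustrated with a small translation step in the UA. This is a sketch under assumptions, not text from the requirements draft: the function name `to_service_identifier` and the dial-string tables are hypothetical, the visited values are purely illustrative, and the `urn:service:sos` form is the one later published in RFC 5031, which may differ from the identifier syntax in the drafts under discussion at this meeting.

```python
from typing import Dict, Optional

# Assumption: the home dial string is provisioned on the device.
HOME_DIAL_STRINGS: Dict[str, str] = {
    "911": "urn:service:sos",
}


def to_service_identifier(dialed: str,
                          visited_dial_strings: Dict[str, str]) -> Optional[str]:
    """Translate a dialed string into a service identifier, if it is one.

    visited_dial_strings is whatever the device discovered for the network
    it is currently attached to (the discovery mechanism is out of scope here).
    Returns None for ordinary, non-emergency dial strings.
    """
    if dialed in visited_dial_strings:
        return visited_dial_strings[dialed]
    return HOME_DIAL_STRINGS.get(dialed)


# Illustrative values only, standing in for a discovered visited dial plan.
visited = {
    "112": "urn:service:sos",
    "117": "urn:service:sos.police",
}
print(to_service_identifier("117", visited))      # urn:service:sos.police
print(to_service_identifier("911", visited))      # urn:service:sos
print(to_service_identifier("5551234", visited))  # None
```

Note that the translation happens entirely in the UA and never touches the mapping protocol, which only ever sees the service identifier.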
"The SIP UA should translate visited emergency dial strings to the universal emergency identifier"? We like this one. Need to be clear that "SIP UA translations" have nothing to do with the mapping protocol - this isn't clear when we read the document today.

We are punting the "visited emergency dialstrings colliding with dialplans" discussion to a BCP, and out of the requirements document.

Is it sufficient to say that service requirements are extensible? Yes, the question of what to do when someone requests a mapping for a service type that's not available isn't a protocol requirement, it's an operational decision. Brian would prefer that IANA registration be used, but this isn't for the protocol requirements document.

What do we do when someone dials 911 in Switzerland? Maybe Switzerland needs to make a decision to map the call someplace, but it would be OK for them to provide reorder as an illegal dialstring - Brian points out that we can't require anyone to provide a single destination for all calls. The document doesn't say there is no universal dialstring (should it?). The use of a service type is locally determined.

Issue 30

We've been through a lot of text on this one, and have resolved it on the list.

Issue 25

"Launching an LCMS query prior to making an emergency call MUST be allowed". Henning - need to make sure that the mapping returned doesn't have to be the PSAP URI. We will remove Ma12, but we still need to talk about handling test calls - we care about this. There may be a SIP mechanism (for example, with Priority header) that we can reference in the BCP document (that we are supposed to be writing). We need to be able to do end-to-end tests, but this is architecture-dependent, and isn't in the current charter of the group (along with a lot of other things that are needed but not intrinsic to mapping, like emergency identifiers). Jon believes we should make progress on what's currently on the charter before we try to take more work on.
Andrew thinks that adding requirements that aren't accommodated in the current charter is very dangerous. Jon asked about identifying "future requirements", maybe in a separate document - our current "future requirements" seem to be floating in the ether... What's involved here? Mostly having a .test URI, according to Brian. Jon says we need a document that describes the big picture before we know how to run a test against the big picture - we will get there, but we don't have a big picture yet. The thought process was that we were starting with small achievable steps - this isn't a small step. Brian - designing pieces is harder because we don't know what the puzzle looks like that the pieces fit into. Ted - but designing a few pieces removes lots of degrees of freedom, and makes the big picture easier to design. Henning - we don't seem to want to put this in the current document - let's move on. Randy - Henning pointed out that a test passing now doesn't mean that it will pass in ten minutes - but a test failing now does mean it's likely to fail in ten minutes. What does the test URL look like? Henning - either a SIP header or a URI parameter, most likely. Hannes - people who are building solutions have architectures in mind now. It's hard to ensure that people are happy with what we produce, and we're already behind schedule - don't want to add work that makes what we're supposed to be working on take longer.

Consensus of room - we're going to leave testability out for now (was I12 in first draft), will remove Ma12.

Issue 9

"Dynamic versus static routing" - do we want to drop this one? There wasn't any discussion around the issue after it was raised. Is this related to pre-call routing? It's directly related to freshness. James' PSAP hasn't changed since he moved to Colleyville several years ago - enterprises and other businesses might have this. Make 911 calls without doing mapping? May not be required for every call.
This issue was raised before pre-loading a pre-PSAP URI was mentioned - maybe it can be dropped? Henning - if requirement doesn't say that mapping is required, we're fine (and the issue can be marked "resolved").

Issue 10

"Design of non-emergency components" - wording was suggested on the mailing list. Trying to avoid forcing changes to existing data to match a mapping system - for example, non-official but well-known names for streets and community names. Brian says we have two different elements to meet this desire in existing systems - but this means we need translations into something the mapping service needs. Issue is, "does database specifically support aliases"? This could be a protocol issue if the response says "you should be using this other name", for instance. This is actually a new requirement - for returning more than one street name, community name, etc. when aliases are present. Andrew will send text to Roger on both multiple community names and aliases. Do we know a constrained set of elements? Probably not - we might be able to exclude some, but not in the general case.

Issue 8

"Multiple community identifiers" - resolved.

Issue 14

"Additional locations" - DHCP information, etc. - resolved.

Issue Summary

We have worked on all the open issues - are we ready for WGLC? Need a new revision, but please don't spend another five months fixing MUST/MAY/SHOULDs! Consensus of the room is that the document is very close to WGLC.

Emergency Identifier
-----------------------------------------

A Uniform Resource Name (URN) for Services
http://www.ietf.org/internet-drafts/draft-schulzrinne-sipping-service-01.txt

Emergency Services URI for the Session Initiation Protocol
http://www.ietf.org/internet-drafts/draft-ietf-sipping-sos-01.txt

Henning is working with a distinction between UA recognition and UA non-recognition. Does Brian agree? He's following, keep going. Brian - UA recognition with proxy resolution?
Henning - In proxy location determination case, how does proxy insert location redirect? Proxy recognition and proxy resolution case - UA doesn't know that this call is an emergency call at all. INVITE and To: say sip:911. If you send a 425 response back in any call, you can find the caller's location (because the UA doesn't know it's an emergency call). Brian - but the problem is that none of the UAs know about SOS today... Concern is that this gets us back into end-to-middle security, and we don't want to go there in an emergency call. Barbara - assume that any device that knows its location for emergency calls will know about the emergency URN. What if phones downloaded dial plans every time they boot? Barbara - concern is that legacy endpoints don't know location or dialstring - emergency calls still need to work. James - if you 425 and things don't improve, route to the default PSAP. But this is getting into cross-transaction statefulness, and that's really ugly if we can avoid it. Henning - do we actually need to support Proxy Recognition and Proxy Resolution? Brian would be OK with "no", James still needs some thought on this. Henning - do we need to worry about old phones? Or worry about new phones that are broken and don't know about emergency services? If the mapping mechanism becomes a B2BUA, we could handle this. Jon - we had the same discussions about SIP Identity, and the mechanism was dumb there, too. One problem is that it works much better for requests than responses - will we ever care about responder location? James - Location conveyance assumes a new header. Can we assume that we don't need the 425 case at all ("not a citer")? Henning - does not add value, does not accommodate legacy devices and is dangerous. Can we use a 400-series response that says "this is an emergency call, do you really mean this"? Jon - if all the smarts are in the proxy, the device is too stupid to do anything at all. 
Spencer - use a 1xx code to say "this is proceeding as an emergency call"? Henning - at least this is less dangerous. Brian - what do we think should happen on an upgraded device that doesn't know its own location and doesn't know this is an emergency call? Got to route this to the default PSAP - it has to go somewhere. Henning - if you do know your location, and you don't know you're making an emergency call, is sending this to the default PSAP good enough? Ted - designing around corner cases gives you really complex protocols. Brian - these discussions should be captured somewhere - probably in the architecture document we can't start yet :-)

Is the SOS draft ready for WGLC? Brian has a concern that we don't do anything with the SIP To field (that's the SIP model), but isn't seeing that we are doing this - so he's OK. Jon said that people do assume that To is the original dialed number, but this isn't consistent in the community, and 911 would work for the (poorly-behaved) applications where this is the case. Henning - if you route beyond the PSAP to the first responder, you might use the To service URI to tell fires from animal control calls. Brian - if you dialed sos:police, the PSAP will still try to vet this, so everything is still OK. If the UA recognizes the call, it uses URNs, if it doesn't, To: will contain 911.

Options for Emergency Identifiers
------------------------------------------------------

We've eliminated a bunch of options - SIP (sip:sos) user and service URN are the only alternatives still being seriously considered. Both remaining alternatives have drawbacks; the service URN seems the cleanest but requires more UAC changes. SIP user is backward-compatible but violates proxy behavior rules. Should we do both? Brian - if anyone changes a line of code, they can add support for URNs too (so, "no"). Hannes - consensus to do two mechanisms at the last IETF meeting, but it wasn't that clear.
Ted (as participant, after we tried to punt to the ADs) - need to be clear between URI service: scheme and URN, but agree with URN approach. Should service URN draft become a working group item? We could be forwarding to the IESG in a month, so "yes". We do need a BCP on how phones treat emergency calls, but it may not be in ECRIT. Need to describe what phones do and what first-proxies do, but this can be in the same BCP or in two separate ones.

Security Threats and Requirements for Emergency Calling
----------------------------------------------------------------------------------------------------------------

http://www.ietf.org/internet-drafts/draft-taylor-ecrit-security-threats-01.txt

Version 01 is a rewrite from scratch after Stephen Kent's comments. Is "Emergency Caller's Device" definition OK? Also added "Configuration Time" concept. Brian - obviously we have to rationalize terminology with the requirements draft (although Tom is trying to use Roger's terms, they are still moving around a bit).

Have added H.323 - was anyone thinking about this? Brian - why do we care? Because H.323 is IP-based? But so is lots of stuff... Is this solving the mapping problem for other protocols? Ted - we return URIs by scheme, so if you have a scheme (H.323 does), you can get a URI if the provider cares to support it. All you need is a URI scheme.

The architecture is taken from the requirements document, with some additional stuff relevant to security. There's a list of tasks required to successfully complete a call. Confidentiality is required when we say "there's an emergency at this location", or "there's an emergency involving this person". We also have to watch out for personal information in location-by-references. The working assumption is that it's interesting when there's a location query for a location ("must be something going on"). Testing dilutes this somewhat, but seeing six lookups in 30 seconds is probably still interesting.
Jon - if the security threats document is the only place that defines the big picture, that's not good ... but Stephen said we needed the big picture to do the threats analysis, so ... We do agree that removing the countermeasures is appropriate (this was also a Stephen Kent comment). Tom - we do need to describe the system, in all its variants, to do the threats analysis. Andrew - but the end-to-end call flow is bigger than the entire scope of the working group - what we have is a lot bigger than just the mapping function. Change the document title to "security threats for the mapping protocol"? That would give us different comments. Hannes - but getting the mapping protocol secured doesn't help if someone can attack the input data. Jon - but this goes back to the early decisions about how much work we can do at one time... Hannes - we've had multiple versions of this document with big changes between versions. Specific requirements may sound good for threats but may not be there for an emergency call. Tom - introductory text may have overreached. Can we look at the document and decide what's in and what's out? But the motivations should still stand.

Henning - need to think about what a carrier does differently because a call is an emergency call (threat is marking a call as emergency to bypass charging). Brian - how much do we need to say about this? Just "will treat a call differently"? Henning - need to think about what the threat actually is. What are we trying to prevent? A proxy may make sure that the call is still going to a PSAP URI, for example - carrier needs to stop the call if they care (if the endpoints are colluding, no one else will stop the call). Hannes - this is like taking the SIM out of your mobile phone - you go to a PSAP, but you can only go to the PSAP.
The document contains various classes of threats - to the device, to the proxy, to location acquisition, to mapping, for information about an emergency, for fraud, to the emergency response system...

Hannes - is this document going the right direction with the current approach? Jon - does the working group think we need to include all of these threats in this document? Tom - make this an AD-sponsored individual draft for the overall security issue? Jon - already chartered to do this work - narrow the focus? We're talking about the security of a DNS lookup and trying to put it in the context of an entire web transaction (this is an analogy). Andrew - we keep coming to the question about architecture. Brian - we're going to have a hard time getting good reviews if we don't say anything about the overall architecture. Nadine - there are some things in this document that don't fit nicely into a bucket. It's nice to have an overall picture, but not sure what to do now. Henning - problem is that we are reluctant to develop a big picture for some reason. Jon - there are some things we've defined, and some that we haven't defined. Having an overall architectural picture would be nice, but I'm worried about derailing the work. Brian - we actually do have a common understanding of the architecture, we just can't write it down for anyone else yet... Tom - call signaling has been worked over repeatedly, what I say is trivial, I should be incorporating this stuff by reference anyway. Henning - but that argues that we should be writing the architecture down - would be one page plus a diagram. Jon - but we live on Internet Drafts, and it takes us a long time for the text we've got now. No telling how long it will take to develop the text you're talking about. Henning - maybe you're too pessimistic and I'm too optimistic. Ted - can we assume there is a mapping protocol, do that work, and then do the experiment of getting the architecture out.
We're way behind - this was supposed to be the fastest working group on record... Tom - strip down to the mapping protocol only? Andrew - can we start on 6.3 and 6.5? Tom - how do we find a mapping service? Andrew - that's out of scope (laughter). Brian - we want the protocol to support these mechanisms, and we want call routing to work if the mechanisms don't work. Tom - am I being too specific by asking for authentication? Jon - you have an attack to prevent an individual from receiving aid by setting up a fake mapping service. Some confusion about "replay attacks" - this is about replaying requests, but probably not replaying responses (at least in this case).

Tom - should I issue a document that strips out everything but mapping? Jon - restrict the scope to the chartered work, at least for now. The other work will almost certainly be chartered when we deliver anything. Henning - but we're trying to reverse-engineer the architecture piece by piece, this is silly! Spencer - can we limit the scope of solicited reviews to the corresponding requirements, or will reviewers filter the documents through some perceived architecture in the absence of a stated architecture? Jon - this is like ENUM - there are a lot of peering architectures, but ENUM itself is pretty easy to specify. There are some very controversial parts to this. I could be argued into adding some stuff, but don't want to take on the controversial parts.

What about being a source of flooding attacks? This is more than just amplification. We need to make sure that we're not worse than existing opportunities for flooding attacks. Henning - we need to be very clear. We've said "no reflection attacks", we've said the protocol must defend itself, we've said "resilient against DOS attacks" (which I don't understand). Tom - there is some material that's imprecise - I understand what SCTP does to protect itself, but there's probably more going on. Jon - we have DOS and eavesdropping listed.
Is there more that we need to look at? Brian - what about an impersonation attack? But this isn't a mapping protocol attack. Jon - getting preferential treatment for myself in a larger emergency? Andrew - we have some really great motherhood MUSTs in this text that will turn into its own DOS attack if a provider is using all of them. Brian - the problem is that usually the motherhood stuff results in rejection, but we never reject emergency calls. Andrew - "protocols MUST allow providers to turn off this stuff!" Tom - we're doing this security analysis knowing that these attacks will be rare. Ted - but there's no difference between slash-dot and a DOS attack to a web server - if an emergency affects a large number of people, the same thing will happen. Spencer - if 40,000 people go off hook on 9/11, the system needs to still work. Brian - and they crater today - that's the goal for the next generation systems.

Hannes - we've chopped the document to shreds, but we do know what to do next. Jon - this document will start to grow again, when we get more details nailed down. Hannes - can we do a strawman proposal in the next few days? Please give feedback before the next IETF meeting! The problem is that we need to support running with virtually no protection, support more protection, and prevent biddown attacks. This is actually normal (defense communications calls this "dynamic security policy"!).

3) Mapping Protocols
----------------------------------

LUMP
--------

http://www.ietf.org/internet-drafts/draft-schulzrinne-ecrit-lump-01.txt

There are multiple independent entities that provide mapping information, and they don't have any hierarchy. They may have completely different policies. They contain authoritative data. The entity can be a single server, or a group of "trees" that don't even trust each other. Henning is proposing "tree guides" that propagate using a broadcast/gossip mechanism. The resolver can contact any tree guide and be directed to the right tree.
This table is likely small (hundreds of entities) and stable (root changes every few months), so replication is possible without undue burden. Protocol requirement is to support redirection to the right tree. Brian - what if you have two trees that say (for instance) they are .uk? Andrew - this isn't the same problem as DNS - relationships are required. Henning - ICANN decides who within country X gets to represent country X. Worst we can get is fragmentation, and if we get this, we probably can't even decide who is running a root. Brian - you've turned one problem into 300-squared problems - you need bilateral agreements with everyone else. BGP isn't a good example, because someone else is handing out all the addresses. What if the US says "I'm the US" and New Jersey says "except for New Jersey". Andrew - we've figured out how to roam into Taiwan, but it happens because carriers hardcode special handling. Visualize Israel and Palestine, or Kashmir - location will get ugly fast! Brian - every disputed area has two polygons that cover it. Henning - this is different from what happens in DNS. Brian - if we allow multiple representers, we can't store this information in DNS at all - we need to be very clear on this decision. Ted - it's not the DNS, it's the delegation mechanism - people either bludgeon each other in policy bodies or do n-squared. Is there a mechanism that allows clean delegation for two polygons for the same location? Every time we try to reuse an existing mechanism, we get baggage we don't need. Brian - we can do anarchy, and everyone can figure out who to trust. It's not right to adopt a model that excludes specific protocol proposals that are on the table now. Henning - I can only extrapolate from experience, and I'm extrapolating from ENUM experience, which was not happy. I think the approach will work - may not be pretty, but is manageable. Brian - the only advantage I have in my protocol is that it works with 5-9s reliability on deployed code. 
If we have to support a distributed root, we are now inventing. For emergency calls, we need to be biased in the direction of stuff that works. This is big infrastructure, it's distributed all over the place, we need lots of confidence that it will work. Seems like we could make LUMP work, but don't know how long or how many crashes it will take to get things shaken out. Visualize someone who floods bad data and no emergency call in the United States works until it ages out. Andrew - but DNS isn't perfect - one Windows resolver bug did a DOS attack that took out 11 of 13 roots. Ted - we do have to know two things for any mechanism we spec - that we know the right people can put the right data out, and how people discover who to talk to. We may have transitive mechanisms (a caching resolver that talks to authoritative servers), and that's OK. We're going to have to look at specific protocols to know this, we can't just extrapolate from other protocols. We can't start with a bias in either direction. Henning - LUMP could all be run by ICANN, too - we just don't know who's going to end up running this stuff yet, and need to develop mechanisms that accommodate being run by different people. ENUM hasn't gone anywhere in half a decade and we're going to private peering. Andrew - this isn't a LUMP discussion - we could stick LDAP in here and have the same discussion. Requiring a single root just doesn't work out that well in the real world. Ted - we can start out with multiple roots and accommodate a single hierarchy, but it's harder to go the other way! Mapping Protocol Design Aspects ---------------------------------------- Front-end and back-end, caching, and polygon size. Henning talked yesterday about how we replicate polygon information between servers. Is it necessary to use the same protocol for mapping server front-ends and back-ends? Henning - are we OK with having a first-hop protocol that can't work anywhere else, or are we biased against this? 
We probably can't answer this question without looking at specific candidates. Brian - there's an architectural decision that we haven't made yet - whether the endpoints will be doing mapping or not. Usually, they aren't doing the mapping today. Can we cache PSAP boundaries to short-cut the query process? Hannes showed a picture of PSAP boundaries for Austria - how large is a typical polygon? 2400 communities (with own fire brigades), 42 police service areas. Henning - in US, there's a census database of county boundaries - about 5000 - so we're order-of-magnitude the same as Austria. Municipalities also provide PSAP services. Brian - this is back to an architectural discussion, not a protocol discussion. If first-hop proxy does mapping at call time (and at endpoint boot time, that the endpoint can cache in case of call-time mapping failure) ... Henning - we can speculate merrily about what different jurisdictions will do, but we have to support several alternatives. Endpoint mapping is not a big deal, thirty lines of code, one SQL statement, not rocket science, using a 30-year-old algorithm. If we're going to make progress in the working group, we need to support endpoint mapping. We also need to recognize that different people will make different tradeoff decisions about caching. We have no scientific way to know what will be more common now, much less in 30 years. Brian - shouldn't all this discussion be out of scope? Henning - at least some things are known - number of service boundaries as an order of magnitude, for example. We don't have anything operational yet. We can extrapolate from HTTP/LDAP/DNS that people make different caching decisions. We should recognize that caching lifetimes are probably important to include. Brian - Henning and I have fundamental disagreements on caching, but I agree with the other stuff he is saying about polygon sizes, etc. We can't fail to support real-world scenarios as out of scope. 
It would make a difference in knowing the sweet spots, when we evaluate protocols. Henning - we can speculate about initial deployments, but this protocol will last for at least a generation. We're in the middle of the NGN-versus-Internet discussion, and we can't pick a winner. If we end up with a sealed-stupid-endpoint model, we go one way, if we end up with Skype, we go another way. I have a desired outcome, we all do, but no one cares what we think. Andrew - if we assume that proxies will always do the mapping, things get simpler, but I expect endpoints will do mappings. Ted - we have agreement in the working group that any entity can request the mapping, whether as a testing thing, as a failsafe thing ... Nadine - answer may be different between wireless and wireline devices. The answer may be different between geo location and civic location. Hannes - we're just saying that the query doesn't have to be repeated if the answer isn't going to change. Ted - Nadine's point is that civic and geo location will have different caching characteristics. Henning - of course we need to keep civic location in mind when we discuss caching. We may be able to cache a subset that we know is in the same service area. We need to think about likely caching lifetimes (how fast can appropriate PSAPs change?). We all agree that we don't do mapping from geo locations to civic locations (at least, Henning and Brian agree). Ted - "this PSAP URI is good for anywhere in Texas. If you're still in Texas, the answer hasn't changed, so you can use a cached value". We don't want to figure out the matching, we want to provide the information that someone can use to do the mapping. If you're willing to make the mapping server more complex, so it returns the largest circle within a polygon, that is a requirement for the mapping server - we need to make a decision about mapping servers doing synthesis vs working off stored data where no synthesis is required. 
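Ted's "still in Texas" example amounts to a point-in-polygon test on the endpoint: if the current geo location is still inside the boundary returned with the mapping, the cached PSAP URI can be reused. The following is a minimal sketch only, assuming the boundary is a simple polygon of (x, y) vertices and using the classic even-odd ray-casting test. The names `CachedMapping` and `still_valid`, the URI, and the coordinates are hypothetical and do not come from any of the drafts.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (longitude, latitude)


def point_in_polygon(p: Point, polygon: List[Point]) -> bool:
    """Even-odd ray casting: count edges crossed by a ray going right from p."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


@dataclass
class CachedMapping:
    """Hypothetical cached answer: a PSAP URI plus the area it is good for."""
    psap_uri: str
    boundary: List[Point]

    def still_valid(self, location: Point) -> bool:
        # The cached URI can be reused while the caller stays inside the boundary.
        return point_in_polygon(location, self.boundary)


# Made-up service boundary and URI, for illustration only.
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
cached = CachedMapping("sip:psap@example.org", square)
```

A caller that has moved would check `cached.still_valid(new_location)` and only re-query the mapping server when the check fails. This covers geo locations only; civic locations would need a different matching rule.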
Henning - a single web page these days is larger than a polygon. Brian - the only reason for using cached data is because you can't get live data - if you always use live data, the sophistication of caching isn't important (if it's only a backup). Ted - this is "use fresh data in preference to cache data" - there are ways to know that data is "fresh". Brian - I've never seen a disaster that works the way we expected. Saying, "no matter what, this data will be valid until X". Don't design the system to make this assumption. Ted - if you try the URI and it doesn't work, remapping is logical and reasonable, but you need to look at deployments where someone can provide a caching lifetime. Andrew - but having a lot of lookups happen at once is a bad thing. Ted - if I can start a call before I get a response back, the setup time goes down. I've got TCP, I've got TLS to do before I can route a call, and being so slow that people abandon calls is the wrong answer. We start with the best data, and then try to recover quickly. Henning - this is the wrong answer - being optimistic gives a better experience. If you do validation (when you move), then you have a PSAP already. If a PSAP goes away (due to Katrina), you do the remapping at that time. Marc - statistically, PSAPs don't change because they are technically strapped - we shouldn't make this assumption going forward. Henning - but PSAPs don't change randomly. Marc - but they are most likely to change during a disaster. Henning - reactive large-scale call routing and loadbalancing will happen at proxies, not during mapping, because proxies know more. Marc - always use the freshest data, adjust the time scales. Spencer - Ted had two alternative outcomes - that a cached call fails, or that it goes to the wrong PSAP. Is going to the wrong PSAP OK? Ted - I think it is OK. You would still call the old URI if the lookup fails, even if you know it's not fresh. 
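One way to read the exchange above is as a cache policy: use a mapping while it is within its caching lifetime, look for fresh data once it has expired, and still fall back to the old URI if that lookup fails. The sketch below illustrates that single policy only, not a position the group agreed on. `MappingCache`, `MappingUnavailable`, and the `lookup` callable returning `(uri, lifetime_seconds)` are hypothetical names introduced for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class MappingUnavailable(Exception):
    """Hypothetical error raised when the mapping server cannot be reached."""


class MappingCache:
    def __init__(self,
                 lookup: Callable[[str], Tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup  # returns (psap_uri, lifetime_seconds) or raises
        self._clock = clock    # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[str, float]] = {}  # location -> (uri, expiry)

    def resolve(self, location: str) -> Optional[str]:
        now = self._clock()
        entry = self._entries.get(location)
        if entry is not None and now < entry[1]:
            return entry[0]  # still within its caching lifetime
        try:
            uri, lifetime = self._lookup(location)
        except MappingUnavailable:
            # Lookup failed: call the old URI anyway, even though it is stale.
            return entry[0] if entry is not None else None
        self._entries[location] = (uri, now + lifetime)
        return uri
```

A deployment that follows Brian's preference (always try for live data on an emergency call) would skip the first early return and treat the cache purely as the fallback branch.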
Dan - 911 network knows to go to the alternative PSAP when a PSAP fails today. When I get a URI for a PSAP, and it's not responding, what's my next move? Henning - plan for the common case. We can return a different mapping, return a different DNS entry, rely on DNS roundrobin, or proxy-based routing. Brian - we believe that state-level routing proxies will exist. Dan - but PSAPs don't like to cooperate. Hannes - high availability issues do exist but aren't tied to the mapping protocol. Henning - we need to design protocols that support a variety of operational deployment scenarios. We don't know enough about what will happen for this knowledge to guide our design decisions. Brian - design for 99-percent. There's usually six calls per second in the entire United States. All of the protocols that we are looking at will work 99 percent of the time. It's only during disasters that the proposals diverge. Ted - wireless roundtrips are going into a budget that has a lot of other stuff that's already in the budget. Dropped packets will cause real problems really quickly. You don't get a budget of a second, because so many other things have to happen. You really do need to optimize for this - your budget for the mapping could be zero most of the time. If a trip is even 200 ms, it will likely trip most of the time. We don't agree whether the timeout will happen in the normal case or not, so we're not able to make this tradeoff decision in this group. Marc - we took 40 seconds to set up calls in 9-11 - that's too long. If the normal case is 10 seconds, people will wait 11 seconds. Ted - we want to shave long setups every place that we can. Henning - we aren't making progress. We've identified the difference between two proposals - wait 200 ms and then use old data, or use old data and wait 200 ms. Either way is a reasonable tradeoff. This won't affect our choice of mapping protocols. We don't need to solve this problem now. 
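The first of the two orderings Henning describes ("wait 200 ms and then use old data") can be sketched as a lookup with a time budget. This is an illustration under assumptions, not a proposal from the meeting: `resolve_with_budget` and its parameters are hypothetical names, the 200 ms default is simply the figure mentioned in the discussion, and the other ordering (start the call on old data, then wait for the fresh answer) is not shown.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def resolve_with_budget(lookup: Callable[[], str],
                        cached_uri: str,
                        budget_s: float = 0.2) -> str:
    """Wait up to budget_s for a fresh mapping, then fall back to the cached URI."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(lookup)
    try:
        # Raises a timeout error if the lookup has not finished within the budget.
        return future.result(timeout=budget_s)
    except Exception:
        # Timeout or lookup failure: use the old data rather than delay the call.
        return cached_uri
    finally:
        # Do not block the call on a lookup that is still running.
        executor.shutdown(wait=False)
```

Either ordering can sit on top of the same mapping protocol, which is the point made in the discussion: the choice is a local policy and does not constrain the protocol.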
Ted - we presume whatever device is doing the mapping is keeping a copy of the mapping, whether it is used optimistically or not. Brian - "the protocol must support this", but we can't mandate this operation. Caching means either the mapping or the boundary for the mapping - I'm OK with this. Ted - and we can include TTLs for information. Brian - I agree. Ted - if you always look first for fresh data, you'll wait a long time. If you start the call and look for fresh data if it fails, we can still use the same protocols no matter where the trip point is. Brian - as soon as you have a 911 call, all implementations will immediately try to get fresh data. What they are doing while they are waiting can vary. Dan - TTLs can vary a lot. Brian - but you should refresh mappings when they expire, and TTLs are under administration control. PSAP mappings can be revalidated based on TTLs. They may also be revalidated when somebody thinks the answer has changed. Henning - there's an opportunity for good network management, but we don't need two different mechanisms to accommodate refreshing because of someone moving and refreshing when TTLs expire. Nadine - are you caching a referral to a URI, or caching additional data? Brian - additional data, you might get polygons, you might get "your mapping is also good anywhere in Texas", etc. Marc - maybe a validation attribute and mapping attribute (but we're talking about solutions). Henning - also may provide information filled in by mapping function ("you didn't provide a county, but you're in this county, if you care"). Brian - we have various proposed protocols that work in different ways here, but we don't have requirements for any of this yet. Ted - think we can combine two proposals pretty quickly, and then compare DNS to XML. Barbara - merging two proposals to get one proposal with lots of options is not good for implementors. Henning - agree that this adds complexity for minor gains and more difficult transitions. 
If we have more than two options in the combined proposal, throw tomatoes. Ted - Brian's argument (if I got it correctly) is that we need to make design choices that we know will scale and will accommodate the distribution mechanisms we need - I agree - and that DNS is the way to accomplish this - and I disagree, because I think how this proposal uses DNS is sufficiently different from normal uses that we can't extrapolate from normal uses. Henning - we have implemented and are running a DNS version that works fairly well for simple civic addresses in the US. We didn't have to modify code. When we moved to geo locations, things went wrong quickly. We're bouncing back and forth between DNS and a mechanism that gives you polygons, between normal queries and NAPTR queries. We had to add enough code to do this that we could have added code for IRIS. We're not actually discussing the DNS proposal, we're trying to figure out if we can reject it quickly :-) Brian - with DNS, you get the DNS advantages for civic and no disadvantages for geo locations. Henning - not what I thought, based on my students' implementation experiences. Brian - don't focus on the client complexity, focus on the infrastructure fragility. Ted - if you need to alter the structure of the data, extension mechanisms will fail - I'm concerned about this. Andrew - don't adopt a mechanism just because stupid people can implement it - I deal with interns and grad students, too, and some of them aren't the people you want writing a 911 system. Henning - this isn't a black and white decision. Barbara - Like DNS, but don't think that SOAP/XML is an excessive burden for developers. Don't see the advantage that IRIS brings. Roger - we don't actually know much about how people address streets everywhere in the world. Jon - this is a lot like ENUM, and ENUM proposals run into problems with query structure inflexibility all the time. I see the same problem with the DNS proposal. 
We end up constraining the hierarchy and entire operational model - this is the death knell for extensibility. Brian - the problem with all the other mechanisms is that they don't have a delegation mechanism at all, we don't know what they might look like. Jon - but we've done delegation models before. IRIS lets you add all kinds of additional information - don't understand how we can integrate this with DNS. Henning - once we had sufficient but incomplete information, we had to perform multiple queries (ex: missing county information). Also concerned about validation. We've been talking about validation depth and caching depth, and couldn't get past binary results in DNS (maybe providing a help URI, but that's it). Brian - but a help URI would actually work pretty well, and could be supported at every level. Geo locations don't fit very well - I agree with the concerns here. Jon - I think even civic locations are going to be synthesized. Traditional DNS relies on static shared zone files. Synthesis makes servers do more work, and this makes DNS servers less scalable and less reliable. These are the kinds of concerns Ted has. Brian - but the parts of the tree that are covered by a PSAP could work exactly the way DNS normally works. Why not? Hannes - we have lots of parts of the world that aren't participating in this work now. We encountered street sections in Taiwan that we'd never thought about in GeoPriv before. How do resolvers in clients find out about new attributes? Jon - How do we know how to handle new fields? Hannes - what happens when I get off the plane in Taiwan and turn on my phone? Brian - but Taiwan provides the location as A1 to A9, and if street sections are A6, you'll handle street sections correctly. Henning - hierarchies really aren't the same across countries - states are different, countries are different. Stuff like alleys may not fit nicely into the A hierarchy. We can't assume that every new way of addressing will fit nicely into the A hierarchy. 
Brian - we could have an entire state's civil information stored - that would be big, but not huge. I think that even for geo locations, we would have something that probably works, and I don't think that any other proposal would have this characteristic. Ted - does this use the normal DNS caching infrastructure? How many round trips does it take to get a usable response? And where are the round trips between? Andrew - one problem I'm having is that you're not using normal DNS terminology - endpoints don't usually have caches. Ted - resolver algorithms may not base caching decisions based on TTL - can be based on current load, etc. Ted is also concerned that 3GPP/3GPP2 using Mobile IP would be using home DNS resolvers, not visited DNS servers. Andrew - TTL caching not nearly as predictive as you think. Caches get used for other uses, caches don't respect all TTLs (not 0, not over one week, at a minimum). Ted - some people cache for an hour, at a minimum. Cache misses are also not determinate - probably miss on street numbers, probably don't miss on state names - it's somewhere in the middle. Ted is also concerned about multiple unreliable round trips, with long timeouts - we may lose flexibility and not get reliability. Henning showing (five-year-old) results from measurements with 20 percent of DNS queries taking more than one second... Brian - I'm willing to face reality and would prefer to face reality sooner if it's ugly. Am I headed the wrong direction? We have two proposals - merged ECON/LUMP and DNS. Should we call the question for sense of the room? Brian - should be about DNS, not about a comparison. OK, we hummed, and got one hum interested in DNS. Spencer - we need to make sure we ask the same questions about any other mechanism (not trying to reopen this topic). TCP based mechanism would have TCP characteristics when faced with loss, retry isn't instantaneous here, either, for example. Need to use the same discussion as a filter. 
Advantages - 24x7x365 uptime (but this is for DNS, not for an individual server). Need to avoid relying on one server, need good results when there's a failure. Works anyway with massive chunks of infrastructure offline. Not saying every access network has great DNS setup, although things have improved (we give clients two different servers for this reason). Should we have the access networks hold a cache? That was a nice thing about DNS. Need to think about what happens when various things die, and what our defense is. Multiple servers? Broadcast? Discovery was nice, and well-understood. Uniform and known delegation mechanism, including relationship to authoritative servers for referrals, also needed. Disadvantages were timeout predictability and caching predictability. Also were sharing infrastructure with other applications and losing some predictability there - as little fate-sharing as possible (but no littler). Need to pick mandatory-to-implement transport bindings, and other valuable transport bindings. Need to look at bigger architecture picture - do we roughly think Henning's picture yesterday was OK? DNS-SOS: ----------------- http://www.ietf.org/internet-drafts/draft-rosen-dns-sos-03.txt (This was mostly discussed in general, not on specifics) ECON ---------- http://www.ietf.org/internet-drafts/draft-hardie-ecrit-iris-03.txt Andrew showed some slides on the current ECON draft. Henning - problem is that we don't want to reinvent TCP, any timer values will be wrong, and it's likely that people will make them configurable, and that's worse. Other problem is that TCP is a stream protocol - socket interface is subtly different. Ted - also need to think about security. DTLS would work, but you wouldn't want to use it here. Henning - specifying a simple and lame transport in SIP was a mistake - no one implemented TCP, Cisco still hasn't, and we've got a really bad security story on SIP - much worse than just having TCP unencrypted. 
TCP adds three-way handshakes at setup and teardown, and TLS gives us a lot more round-trips. TLS will be in any spec we come up with, but will be MUST-implement and not required for use. Jon - would love lighter-weight security mechanisms. Digest? Henning - issue is privacy, not authorization, etc. Jon - Digest AUTH? Middle ground is better. Henning - XML DSIG? no additional roundtrip times. In light testing, UDP 46x faster than TLS, 14x faster than TCP. Henning - SIP problem is request/response size divergence - polygon doesn't fit into one packet, or even two. What do we do then? Ted - we should return the mapping and an indication of the polygon (fire up TCP is ok if you care). We don't want to run into the same problem as SIP (require SOAP, not UDP, or everyone implements UDP and stops). Henning - need to be able to transport reasonably-sized polygon. UDP has 64 KB limit, and many implementations fail at 8KB. How to back off in the face of loss? If UDP fails, try with TCP - don't try to do exponential backoff yourself, etc. Henning - there are two reasons UDP fails, one is packet loss and one is NAT/Firewalls. Cycling through all the UDP entries that all fail isn't reasonable. The only time people care about timing is during an emergency anyway. The only thing we think we know is that HTTP gets through firewalls (and even this isn't guaranteed). We can probably figure out that a NAT is blocking UDP at boot time, but no clue whether UDP or TCP is more likely to pass through any reasonable Internet path. Can't even count on UDP because we're using SIP, because SIP/TCP may be more common in the future. Spencer ranted for a while about UDP - we can say that having a single woman on a dark road at night should prefer TLS, but during a disaster, getting one response packet from one request packet would be lovely. Marc - is a PIDF-LO with no identity a privacy problem? 
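The transport fallback suggested in the UDP discussion above ("If UDP fails, try with TCP - don't try to do exponential backoff yourself") can be sketched as follows. This is a rough illustration under assumptions: `send_udp` and `send_tcp` are hypothetical callables supplied by the caller, `ResponseTooLarge` is a made-up signal for an answer that did not fit in a datagram, and the 8 KB threshold is taken only from the remark that many implementations fail at that size.

```python
from typing import Callable

# Assumed safe datagram size, based on the "many implementations fail at 8KB" remark.
UDP_SAFE_BYTES = 8 * 1024

Transport = Callable[[bytes], bytes]  # sends a request, returns the response


class ResponseTooLarge(Exception):
    """Hypothetical signal that the answer (e.g. a polygon) did not fit in UDP."""


def query(request: bytes, send_udp: Transport, send_tcp: Transport) -> bytes:
    """Try UDP once; on any failure go straight to TCP with no backoff loop."""
    if len(request) <= UDP_SAFE_BYTES:
        try:
            return send_udp(request)  # single attempt, no retransmission timers
        except (OSError, ResponseTooLarge):
            # Packet loss, NAT/firewall blocking (timeouts are OSError subclasses),
            # or an oversized answer: let TCP handle it.
            pass
    return send_tcp(request)
```

The sketch deliberately leaves out security (TLS or otherwise), which the discussion treats as a separate tradeoff.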
Brian thinks privacy goes away WRT the PSAP, but not WRT ambulance-chasers - but punting to unencrypted UDP as a last-ditch is still the right thing to do. Do we use the same mechanism during an emergency as during testing? Marc - the definition of privacy shifts constantly and this isn't helping. Henning - maybe we agree that event privacy is the issue during an emergency? People run non-encrypted protocols all the time, at least revealing presence information. Privacy isn't an absolute. Another conflict - does service provider control what they support (to control the load on their services), or do people control what gets used (to control privacy)? We have too many tradeoffs! Henning - we should be doing a complete solution - do it like DNS, not the way we did LDAP. Ted - knowing what a DNS zone file format looks like helped, even before we had protocols to exchange them. Shipping the data structures around may make sense (civil more than geo, but we should still consider this for geo). Not a must-have, but a nice thing to have. Architectural Aspects ---------------------------- Location-to-URL Mapping Architecture and Framework http://www.ietf.org/internet-drafts/draft-schulzrinne-ecrit-mapping-arch-00.txt Emergency Context Routing of Internet Technologies Architecture Considerations http://www.ietf.org/internet-drafts/draft-polk-newton-ecrit-arch-considerations-01.txt These drafts were not directly discussed. Architectural discussions took place as part of the other agenda items. Discussion about ECRIT Working Group milestone adaptation ---------------------------------------------------------------------------------- We think we can finish the requirements document before the IETF meeting. We talked about what "one milestone" means on our charter page - "submitted to the IESG" is the traditional understanding. Roger can get out the next version in an hour or so (he's done everything from yesterday except the MUST/SHOULD/MAY review we talked about). 
There's no reason we couldn't have a working group last call before the IETF. We had some discussion of whether we submit one draft (with all the changes) or two drafts (one immediately, one after Roger does the MUST/SHOULD/MAY pass). We also noted that ADs get lots of post-WGLC drafts for AD review at the same time, immediately before IETF meetings when ADs are busy. Jon mentioned that the IESG is trying to get a lot more aggressive on document processing commitments, but this won't be in place in a month. We're planning for WGLC in February on requirements - Henning wants to do WGLC for the URN draft in the same timeframe, he's folded in all the comments except the motivations section already. Andrew was concerned that we might be rushing things through (URN isn't even a working group draft yet) - not that Andrew has a problem with the draft, he's just afraid someone else will, in a very tight timeframe - we probably can't spin the document again after WGLC, but before the ID cutoff for IETF 65. Ted pointed out that most URN drafts are being processed as Individual drafts anyway - so we can increase the IETF last call timeout and overcome any possible objections that we are "rushing something through". Tom has also gotten a good start on the revisions we discussed yesterday for the threats draft. Marc noted that the people who have commented most on the threats draft aren't present, so we don't know what their response will be when we post the next revision. We'll submit the draft this week. Can some people review it next week? Andrew and Spencer will look it over - also, Brian. What's the IESG submission date? Surely not July - probably May 2006. Henning - we need a monthly reminder of current deadlines, or we'll be at the next IETF trying to figure out why nothing happened. Andrew - and that may not be enough :-) Henning - what about working group write-up? These documents aren't too complex - shouldn't be a major obstacle. 
We have BCPs on the current charter that probably need to be PSes. There needs to be a phone BCP and a proxy BCP - that's it. This isn't physical location (that's IEEE land, etc), it's about how we use DHCP and similar mechanisms. Should we have milestones that we don't have drafts for yet? Brian - we should have two more milestones - one on architecture, one on phones and proxy BCPs. "After the mapping protocol" is fine. Henning - too many hours in EU meetings trying to figure out what a charter item means without a draft. Brian - can we use the existing architecture draft for a milestone? If we have more than five or six pages, we've got too much in the wrong draft. Will this be an update to the phone requirements BCP? Maybe, but we don't need to decide this now. Ted will work on the draft about discovering media stream types - recognizing that we may not be using SDP if we're supporting more than SIP. Jon - ADs won't stop anyone from working on architecture, but don't want to list it on the charter page until we deliver SOMEthing. Spencer - if the two drafts going to WGLC count as SOMEthing, we're only talking about a one-month delay. Jon - I'm not holding this up until the mapping protocol is finished - "push two off and add one" is fine. Does the mapping protocol have a prayer of going to IESG in August? We have three protocols, and all would need work. Say "November", come in early, throw a party. We're down to four milestones on the current charter (not including architecture, which we expect to add). What about the security threats beyond the mapping protocol? When do we look at this? The current draft will also have URN threats. We'll review the charter in Dallas, to see if we can add architecture (and anything else we need). Session Initiation Protocol Location Conveyance ---------------------------------------------------------------- http://www.ietf.org/internet-drafts/draft-ietf-sip-location-conveyance-01.txt James presented the current draft. 
He is removing lots of requirements, restructuring, and moving the examples to a separate document. There was a competing draft that was very strange (wanted location without using presence) - want to make sure that we don't get caught in the same trap again. Hannes - missing motivational text for some of the requirements - didn't understand why requirements were stated the way they were stated. James has deleted a lot of requirements (maybe this would help!). James and Brian had motivations for emergency section, but not for all. Do we want to know why INVITE is a requirement? Tom - SHOULD requirements should explain exceptions - although we only have one SHOULD. Jon - we usually have "Table 2" in SIP drafts (messages where a header can appear) and don't usually have motivation text for these requirements. James doesn't expect Location in ACK/PRACK/BYE but thinks there's no reason to rule them out. Jon thinks UAC-5 is a SHOULD that needs guidance. Brian - there's usually stack code that checks whether a header is allowed in a message - that's what "O" in Table 2 means. Hannes thinks UAC-7 should be related to Section 10 in the same document. Jon is concerned about PIDF-LO as a "SHOULD to use" - UAC-8 should be a MUST, because the PIDF-LO could be conveyed by either value or by reference - but a PIDF-LO will always be understood, either way. This is especially important since you are throwing the PIDF-LO over the wall to an entity that you have no relationship with and no opportunity to exchange capabilities with. UAC-10 - would we ever want to use location in a message body? It would allow integrity protection... Do we need a requirement to NOT include Location in a PUBLISH, etc.? We just need to NOT reject the request. Jon - would we ever use a non-presence-related package with location information? Apparently not - or at least we would need to update the document to describe this. UAC-11 will be deleted. UAC-12 - will take out auth, say confidentiality. 
James has a problem with removing emergency requirements from this document. Jon points out that "support for emergency services" would give the location conveyance a reference dependency on Ecrit drafts. UAC-13 and UAC-14 are emergency-related - flag them as such. Jon - what are the high-bandwidth slides we need to review? James - I don't know - things that seem simple lead to discussion. UAS-7 - Jon says we need to think about what happens when a UAC is totally clueless, and not just reject calls with 4xx codes because they don't have location information. UAS-9 will be deleted. Brian - we should be silent on what a B2BUA can do. Jon - we need to be silent in order to survive the SIP working group. Most of Proxy requirements will be removed. In the Location header, is there a purpose for the cid? Jon says, "yes" - should be identifying specific multi-part body part that contains the Location information, so we don't have to spin through MIME to figure this out.