Minutes - XMPP Interim Meeting - Monday 7 February 2011 - Diegem, Belgium

Discussion Summary: 

Status: The XMPP trilogy drafts are all approved and in the RFC editor's queue. 3920bis has a dependency on the tls-server-id draft, which is also approved and in the editor's queue.

3920bis issues: Two minor issues were brought up concerning 3920bis after the draft was approved. The room had a general consensus on how to fix them. If the AD approves, these will be fixed in AUTH48. Otherwise, they will be entered as errata. See raw notes for details.

Internationalization of JIDs: Peter presented a tutorial on i18n issues in JIDs. There was a general understanding that any XMPP work will depend on the output of PRECIS, and that XMPP participants are exhorted to help out in PRECIS. See raw notes for details.

Server Delegation: Richard presented on the use of DNSSEC for server domain delegation. The concepts were generally well received. Richard will work with the authors of the original DNA draft to put together a new draft as a charter work item candidate.

[Raw Notes Follow]

------------------

XMPP Interim meeting: 

7 February 2011

Notetakers: Richard Barnes and Jonathan Schleifer
Jabber scribe: Dave Cridland

Chairs introduction
-- Note well
-- Agenda review
    -- Hildebrand: Want to discuss whether to have an XMPP meeting at IETF 80
-- Document status
    -- Trilogy in the RFC editor queue (-3920bis, -3921bis, -address)
    -- Dependency on tls-server-id, but it's basically approved
-- Ben: Need to move energy from the trilogy to new milestones
-- Agenda bash: 3920bis moving forward in agenda

Peter St-André: 3920bis Fixes
-- People have found two minor changes for AUTH48
1. SASL mechanism order (3920bis, Section 6.3.3 see slide 4 for detail)
  - slide 5 outlines two problems
2. Namespace Prefix Enforcement (3920bis, Section 4.8.5 see slide 8 for detail)
  - slide 9 outlines that some servers are unknowingly violating it, and that it may also violate Postel's law (referencing mail http://www.ietf.org/mail-archive/web/xmpp/current/msg02367.html )
  - Proposal to require servers not to send on broken data
Q: Ralph: Some server implementations don't do namespace checking, so they may not check namespaces correctly
A: Trying to do the least intrusive thing: either do that (i.e., change MUST NOT to SHOULD NOT) or submit an erratum for the change
-- General agreement that these changes are OK

Peter St-André: Internationalized JIDs
-- Peter is doing a recap of basic Unicode facts from his previous talk
-- Q: Ralph: Does the decomposition change the order of the combining characters?
   -- A: Yes.  It arranges them in a canonical order.
   -- Alexei: It always returns the same results
-- Overview of the goals of the PRECIS working group
-- Q: Ralph: Is PRECIS also looking at confusable characters?
    -- Stpeter: Don't know.  Maybe
    -- Hildjj: Think a lot of people expected stringprep to deal with confusables, but it doesn't
    -- Hildjj: Not enough information in Unicode tables to deal with confusability, need to do something else
    -- Ben: Remember that these WGs are open, so you can participate
    -- Stpeter: There's also the font question which the XMPP level ignores
    -- Richard Barnes: Suggestion that you talk to some of the TLD registries (like .ee, .de, .ae, .中国, .рф) for tips on confusables
-- Localpart vs. Resource part - do we continue to use/treat localparts and resourceparts differently?
    -- Ralph: Concerned about MUC handles
    -- Matthew: At least in MUC nicknames, it's nice to have spaces
    -- Ralph: MUC could also change to handle nicks differently
    -- Remko: Maybe the problem is that MUC shows the resource to the user; handle it properly
    -- Hildjj (as floor participant): If we disallow interesting things in nicks, people will escape them -> gets ugly
    -- stpeter: PRECIS will probably have something like a free-form string class, should probably use that for some cases
    -- Cridland: Would be disastrous to try to re-write large parts of MUC
    -- hildjj: We're at least going to have to change away from stringprep
        -> Not saying that we might as well change MUC while we're at it
    -- stpeter: IDN has the idea of "domain slots"; xmpp has "jid slots" and "localpart slots", etc.
    -- Cridland: Don't want to revisit things like spaces in localparts
    -- Nathan: Why do we have to decompose/do anything with resource parts?  Very session-based, not manually entered, not really being matched
    -- Stpeter: You get that from the "free form text" PRECIS profile so comparisons work
    -- Cridland: Our implementation doesn't implement resourceprep, never had a problem because of it
    -- Ralph: We store nicks, you can register them; if you don't normalize, then you could have visually identical, canonically identical nicks in the same room
    -- Cridland: Probably make a distinction between nicks in chat rooms and resource paths in general
    -- stpeter: Might define different normalizations depending on usage
-- Normalization forms: Is NFKC too smart?  Can we get by with NFD/NFKD?
    -- hildjj: One of the reasons for the *C forms in stringprep was that font renderers back then weren't smart enough to handle combining characters; now they are.  So using *D forms might make sense now, with CPU/complexity savings
    -- Miller: For length limits, *C could be better than *D; # of code points closer to # of glyphs
    -- Per Gustafson: If you have a really large set of rosters, and you're going to rewrite a lot of things, could be a lot of difference between these; could be good to get some data
    -- Ralph: If there's a possibility that we're going to use NFD, do we want to look at mapping characters as well?
        -- stpeter: What do you mean by mapping?
        -- Ralph: e.g., case folding
        -- hildjj: Would oppose requiring people to store strings as entered (not canonical)
    -- Melnikov: There are some pathological test cases
    -- Ralph: What does IDNA2008 use?
        -- Stpeter: They didn't choose, but they disallowed compatibly-decomposable characters
-- Character mapping: Case mapping, width of East Asian characters
    -- hildjj (as chair): Does anyone have an opinion about things like ROMAN NUMERAL 4 (IV)
    -- Ralph: Uncomfortable with losing information
    -- Zeilenga: We shouldn't go out of our way to accommodate weird things, focus on important stuff like people's names
    -- Ralph notes that Roman numerals are actually parts of names
    -- Joe (floor): From the Unicode perspective, only the semantics are different
    -- Ralph notes that characters like quotes are actually handled content-sensitive in applications
    -- Joe: Do we want to just forbid half-width characters?
    -- Joe: Need to also give due consideration to non-English speakers
    -- Ali Sabil: NFKC seems like the safest option
        -- NFKC doesn't change scripts to where it becomes unreadable
    -- Nobuo Ogashiwa: In Japan they don't care about issues like 1 vs l
    -- stpeter: 2 ways to do things: Either map or just disallow (like IDNA)
    -- Kurt: Don't necessarily think that special things should be disallowed, more an issue of focus
-- Locale-specific issues; maybe limit things by geographic scope?
-- Registrar-like policies, especially around mixed scripts?
    -- Joe: Concerned about restrictions, since we run a global service
    -- Richard: More of an operational/best-practices issue than a protocol/processing issue
        -- So software should handle everything, but your service can put constraints
        -- Joe notes that the rules that apply for domains might not work for users if they are just copied
    -- Ralph: Any informational stuff on this out of IDNA? Do we want to define such documents?
    -- Joe (chair): We need to at least try to show some work in PRECIS
    -- stpeter: Prohibit mixed scripts? Encourage clients to warn about them?
    -- Jonathan Schleifer: We can't fix it, as we even have that problem with domains and we can't just forbid domains
-- Need to identify all the "JID slots", maybe node IDs as well
-- Enforcement & Error Handling: Where should enforcement happen?
-- Migration: How do we transition to the new handling techniques?
    -- How many people are using non-ASCII characters now?
    -- Overlap with the general question of how mismatches are handled
-- Procedural: How do we interact with PRECIS, SASL, etc.?
    -- Ben (floor): Don't focus on which WGs are doing the work, best to reuse stuff
    -- Alexei: Second Ben's comment; doesn't make sense for each application to do this stuff independently
    -- Cridland: How much global coordination do we need?  Maybe just have some very light coordination on normalization?
    -- Ben (chair): Clear indication from IESG that they want this to be a priority
    -- Ralph: With IDNA2008, won't we already have problems with the domain part of JIDs?
    -- Peter, Alexei, Ralph, Joe: Do we need an official statement that domains not conforming to stringprep won't work with XMPP?
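
The points above about canonical ordering and about code-point counts under the *C and *D forms can be illustrated with a small sketch. It uses Python's standard unicodedata module; the helper names (nfc, nfd, same_identifier) and the sample strings are illustrative assumptions and are not taken from stringprep, PRECIS, or any XMPP implementation.

```python
import unicodedata

# "a" plus two combining marks, typed in two different orders.
# U+0301 COMBINING ACUTE ACCENT has canonical combining class 230,
# U+0323 COMBINING DOT BELOW has canonical combining class 220.
acute_first = "a\u0301\u0323"
dot_first = "a\u0323\u0301"


def nfd(s: str) -> str:
    """Canonical decomposition; also puts combining marks in canonical order."""
    return unicodedata.normalize("NFD", s)


def nfc(s: str) -> str:
    """Canonical decomposition followed by canonical composition."""
    return unicodedata.normalize("NFC", s)


def same_identifier(a: str, b: str) -> bool:
    """Hypothetical comparison helper: compare strings after NFD."""
    return nfd(a) == nfd(b)


# A raw code-point comparison sees two different strings...
print(acute_first == dot_first)                 # False
# ...but after normalization the marks are in the same order.
print(same_identifier(acute_first, dot_first))  # True

# Code-point counts differ between the composed and decomposed forms,
# which matters if length limits are expressed in code points.
name = "Zo\u00eb"
print(len(nfc(name)), len(nfd(name)))           # 3 4
```

Whether a length limit should count code points in the composed or the decomposed form is exactly the open question raised above; the sketch only shows that the two counts can differ.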


Richard Barnes: XMPP DNA: What's the problem, does DNSSEC help, if not: Alternatives?
-- Hosting providers can't hold customer certs for security reasons
-- Two different channels for each src-dst pair → too many sockets
-- Requirements: Need a way to verify that the server actually sent the SRV and not some attacker. A way to sign a redirection
-- DNSSEC signs responses, signed SRVs
-- if (dnssec && dnsName == dstName) → success
-- Joe (floor): We might still want some of the structure from the original DNA
-- Dave Cridland: Fully compatible with Dialback
-- DNSSEC usage still low
-- Jonathan Schleifer: DNSSEC might be easier to break than TLS, this approach then makes it possible to completely circumvent TLS
-- Joe (floor): DNSSEC will be used a lot, so we will have other problems if it's broken
-- Richard references draft-barnes-xmpp-dna (non-DNSSEC solutions)
  -- how do you encode the delegation?
  -- how do you *find* the delegation?
  -- how do you trust the signing key?
-- Dave Cridland: SRV, or CNAME? (Specifically, using a CNAME in response to a SRV request to indicate a delegation)
-- Joe (Floor): Not clear at which name to look when using CNAME
-- Ralph: If a CNAME is used, no SRV is checked for the new destination
-- Florian Jensen: SRV can point to a domain that is a CNAME - shouldn't make a difference?
-- Jonathan Schleifer: Just use HTTPS to do redirection and handle it like the user entered the domain from the redirect?
-- Ralph: Nobody besides XMPP is using something similar
-- Joe (chair): DNSSEC+DNA might be a way to go
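
The check noted above as "if (dnssec && dnsName == dstName) → success" might be sketched roughly as follows. This is one reading of that note, not the content of any draft: it assumes "dnsName" means a dNSName presented in the peer's certificate and "dstName" means the SRV target. The SrvAnswer type, the peer_is_acceptable function, and the example host names are all hypothetical, and wildcard certificate matching is deliberately left out.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SrvAnswer:
    """Hypothetical result of an SRV lookup for the service domain."""
    target: str             # host name the SRV record points to
    dnssec_validated: bool  # True only if the answer was DNSSEC-validated


def _norm(name: str) -> str:
    """Compare DNS names case-insensitively and without a trailing dot."""
    return name.rstrip(".").lower()


def peer_is_acceptable(service_domain: str,
                       answer: SrvAnswer,
                       cert_dns_names: Iterable[str]) -> bool:
    """Decide whether the certificate names are acceptable for the peer.

    Assumed policy: a certificate naming the service domain itself is
    always fine; a certificate naming only the SRV target is accepted
    only when the SRV answer was DNSSEC-validated, since the signed
    record is then what vouches for the delegation.
    """
    presented = {_norm(n) for n in cert_dns_names}
    if _norm(service_domain) in presented:
        return True
    return answer.dnssec_validated and _norm(answer.target) in presented


signed = SrvAnswer(target="xmpp.hosting.example.", dnssec_validated=True)
unsigned = SrvAnswer(target="xmpp.hosting.example.", dnssec_validated=False)
hoster_cert = ["xmpp.hosting.example"]

print(peer_is_acceptable("customer.example", signed, hoster_cert))    # True
print(peer_is_acceptable("customer.example", unsigned, hoster_cert))  # False
```

Under this reading the hosting provider never needs to hold the customer's certificate, which is the requirement stated at the start of this item.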

End of meeting.

--------------------------

Here's the stuff I wrote before:
MINUTES

Issue #1: SASL Mechanism Order

Peter Saint-Andre explains that no other protocol specifies the order of SASL mechanisms, and that clients ignore the server's preference order. He proposes to change "A server MUST offer and a client MUST try SASL" to "A client MUST try SASL".

Joe Hildebrand asks whether someone has objections against this change. No objections.

Issue #2: Namespace Prefix Enforcement

Peter Saint-Andre explains that the requirement that a server MUST NOT route a stanza with a prefix for the default namespace might cause problems, because some servers do not really know about the prefix: their parser hides it. He proposes to change it so that a server SHOULD NOT route such a stanza without first correcting the error, but instead SHOULD either ignore it or close the stream with a stream error.

Joe Hildebrand asks whether someone has objections against this change. No objections, most in agreement.

Issue #3: Internationalization

Peter Saint-Andre gives a short overview of how Unicode works and explains why the order of combining characters is important. Ralph Meijer asks whether normalization puts the combining characters in the right order. Joe Hildebrand, Alexey and Peter Saint-Andre say yes.
Peter Saint-Andre goes on to explain that IDNA2008 moved away from stringprep for domain names, but PRECIS provides similar "services", mapping rules probably included.
Ralph asks if the PRECIS group will deal with confusable characters. Peter Saint-Andre says he's not quite sure whether they're going to do it, and suggests we may need to do a lot ourselves.
Joe notes that back when stringprep was introduced it did not handle confusable characters and it has caused a lot of issues, but there's not enough information in the Unicode tables to deal with confusable characters.
Ben encourages everybody to join the mailing list and help work on PRECIS.
Richard Barnes suggests that the WG contact the TLD registries about confusable characters.
Peter asks whether we want to continue to treat the localpart and the resourcepart differently. Ralph points out that we would be losing functionality if not. Matthew Wild notes we would not want spaces in the localpart, but would in the resourcepart. Remko Troncon notes that we only actually show the resource to the user in MUC and that we should maybe just have another way to handle nicks in MUCs. Joe notes, as an individual, that people would probably just start escaping things. Peter notes that people like the expressiveness and that he doesn't see a reason to get rid of that. Dave Cridland notes that it's questionable whether MUC will change how nicknames work. Joe (as individual) notes that all the code handling JID matching etc. needs to be changed anyway once we get rid of stringprep. Dave thinks we need to try to maintain as much compatibility with stringprep as possible and that we no longer have the option of drastic changes. Ralph notes that we are going to face issues because of new Unicode characters being added. Peter agrees that big changes to localpart should not be done. Nathan Fritz asks why we decompose resource parts. Peter replies that you want decomposition for things like kicking the right person from the MUC. Dave says they never did resourceprep, but it never caused them any trouble. Ralph notes that MUC nicks can be registered and stored and thus normalization is required. Dave suggests making a distinction between nicknames in chatrooms and resources, as we have problems in chatrooms which we don't have in resources.
Peter summarizes that we should do some additional checking for resource parts in MUC.
Peter asks if we really need recomposition, as font renderers are smart. Joe notes that font renderers did not produce the same characters back then when they composed, but do nowadays. Matt Miller asks about limitations when we have to know so many characters. Jonathan Schleifer notes that many characters are not affected by this and a sparse array can be used, so this is not a technical limitation. Peter sees changing from NFKC to NFD mostly as a migration problem. Per Gustafsson proposes to look at existing data for that. Ralph asks whether we want to do character mapping as well. Joe (individual) rephrases it as "store everything in the unmodified format" and opposes it. Joe mentions some weird corner cases; Alexey suggests documenting those. Ralph asks which choice IDNA2008 made. Peter says they declined to make a choice. Alexey says they kind of did NFKC. Peter explains they filter by Unicode attributes.
Peter asks if we want case folding for usernames, but not for free-form strings. Peter does not really see a reason to change this. Joe explains full-width, half-width and narrow-width forms of East Asian characters. Ralph is uncomfortable with losing information.
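
The information loss Ralph refers to can be made concrete with a short sketch of compatibility mapping and case folding, again using Python's standard unicodedata module. The fold_localpart helper is a hypothetical illustration only; it is not nodeprep, not a PRECIS profile, and not what any XMPP server is known to do.

```python
import unicodedata


def nfc(s: str) -> str:
    """Canonical normalization: keeps compatibility characters as they are."""
    return unicodedata.normalize("NFC", s)


def nfkc(s: str) -> str:
    """Compatibility normalization: also maps compatibility characters."""
    return unicodedata.normalize("NFKC", s)


ROMAN_FOUR = "\u2163"    # ROMAN NUMERAL FOUR, the example from the raw notes
FULLWIDTH_A = "\uff21"   # FULLWIDTH LATIN CAPITAL LETTER A
HALFWIDTH_KA = "\uff76"  # HALFWIDTH KATAKANA LETTER KA

# NFC leaves the single Roman numeral character alone...
print(nfc(ROMAN_FOUR) == ROMAN_FOUR)   # True
# ...while NFKC replaces it with the two Latin letters "I" and "V".
print(nfkc(ROMAN_FOUR))                # IV

# Width variants are mapped to their ordinary counterparts by NFKC.
print(nfkc(FULLWIDTH_A))               # A
print(nfkc(HALFWIDTH_KA) == "\u30ab")  # True (KATAKANA LETTER KA)


def fold_localpart(s: str) -> str:
    """Hypothetical username comparison key: NFKC, then case folding."""
    return nfkc(s).casefold()


# A fullwidth "J" followed by ASCII letters folds to plain lowercase.
print(fold_localpart("\uff2aULIET"))   # juliet
```

After NFKC the original distinction (a Roman numeral character versus the letters "IV", or a width variant versus its ordinary form) cannot be recovered from the stored string.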
