PRECIS Working Group - IETF 84 - Vancouver
Meeting Notes
==========================================

WG Doc Status Update/Discussion
-------------------------------
- Problem Statement - WGLC done, on to IESG
  * Pete Resnick (PR) - Am I still waiting for the writeup?
  * Yoshiro Yoneya - Peter Saint-Andre is the shepherd. He will send you the writeup.
  * Alexey Melnikov (AM) - Procedural question regarding the I-Ds: are they going to become WG items, or remain individual submissions?
  * Marc Blanchet (MB) - That's still for the WG to decide.
  * AM - I expected SASLprep to be adopted, but have not heard yes/no.
  * MB - We will discuss each document as we get to it.

Mapping Characters for PRECIS Classes
-------------------------------------
1 - Overall
2 - Types of mapping
3 - Applying order of mapping
4 - Next step
  * Joe Hildebrand (JH) - Do you intend for your section on special mapping to include all the special mappings we know about? The draft has 5 or so mentioned; is the list exhaustive (is that all of them)?
  * Takahiro NEMOTO (TN) - It is.
  * Yoshiro Yoneya (YY) - Those are the codes as far as we know.
  * JH - If that is the intent, then I object. There are far more, and we'll need to mention them.
  * John Klensin (JK) - This starts with non-ASCII mappings, which results in interesting issues.
  * JH - Is there a place we can go to learn more about these interesting code points we need to know about? I would like one canonical list, if you could point us in the right direction.
  * JK - The two places you can start: the mapping document for IDNA (2008?), and the Unicode document UTR-46. That has all the good news/bad news of the DNS context. There are a few other WGs, so where do you want to go for this wild goose chase?
  * JH - Is there a place we can ask for suggestions with worthwhile input?
  * JK - Every time someone looks intensively at a script, something comes up and bites you. You need to ask this on a per-script basis, and the IETF doesn't know who it should talk to.
  * JH - I suggest we start a registry, which might start with 4 or 5 entries, so that we have something to point people to and build on.
  * JK - Every time you do this in a registry, you have the problem of how you adapt this to implementations.
  * PR - What is the purpose of this document? For an IANA registry, you'll have one column with a codepoint, and then you will have a column for "what do I do with this?" When we started with 2008, we started with the most straightforward mappings, and left the rest as an exercise for the reader. We decided to do this because it was a localization problem; depending on where you are, and where the service is, you do very different things. So do you want to have the ones that are very clear, and
  * JK - For some, these are the ones you want to do unless you're in Turkey. And these are the things you want to do if you are. Then you need to have a mapping for the things that are not valid for a given region.
  * JH - One approach is a registry with: codepoint from, codepoint to, apply in these contexts, never apply in these contexts.
  * Andrew Sullivan (AS) - So where is the context registry?
  * JH - For example, this is a language code for our Hangul. It's imprecise, but it's something we can implement against.
  * PR - I'm not sure a registry will be that much better than an informational draft. This set of characters and that set might get you about 5 paragraphs of why. TR-46 is probably not a good place to start, because it's about how to map from 2003 to 2008. This is more general, and that was very specific.
  * JK - I only mentioned UTR-46 as an illustration of how these things come about. If you are looking for another example, there is a document about 100 pages long that only covers 5 characters (ICANN issues report).
  * JH - What if it was an IANA registry with the char from, and a document on what/why/when to transform?
  * JK - That could be a registry with 10k+ characters in it.
  * JH - I'm ok with that, because it might lead to a process.
  * PR - My concern is that this is what the UNICODE consortium should be doing, and we're moving into their jurisdiction here.
  * Peter Saint-Andre (PSA) - What this document is doing is representing the special casing from the UNICODE document. Is this enough, or not?
  * JK - If the question is whether this rescues us: this is not it. UNICODE is not designed around the 7 issues here (or in IDNA). This issue is hard, and the only general solution is to say "a codepoint is a codepoint, and it either matches or it doesn't". And then we get into the position that everything is ok except for these few little special cases. Or we go the reverse, and we get into serious trouble by region (e.g. Arabic).
  * JH - I understand, but I have software that has to run in a bunch of these regions without modification. If we could ask the UNICODE people to help us a little bit (locales, context, etc.) with a set of mappings for each context, then we can work back from "are we in that context?", and I can figure out the right context and how to map.
  * JK - The expectation that you'll have a single implementation that will work everywhere is not realistic. So really this comes down to localization, or one punts. What you can't do is take a character and figure out how it maps. It depends on the language, region, etc.
  * AS - Part of the problem is that we really have two slightly different problems being mashed together: 1) figure out what mappings you need to do, and 2) how to map. One way to do this is to add a mechanism for selecting which mapping to use; then we can do that.
  * JH - That's exactly what I'm asking for. I've got xml:lang today, but if it's not enough then we need to know what else. We've got some context now, and I am happy to add that context to the mapping.
  * PR - So JH wants something in between the super-obvious mappings and the bad mapping (NFKC).
    So now you have the flag (xml:lang and/or locale tag), and then the number of mappings becomes interesting. Turkish is the easiest: don't do that mapping, but do this one. In the US, do the opposite. My guess is this will be locale-by-locale, and do you want the IETF to be tracking that?
  * JH - If the UNICODE folks are not willing to do it, then somebody needs to, and it can be us. We can start the registry with a small list of the things we clearly know about, and this table stays small.
  * PR - This is ok to start. However, depending on how we set up the registry we might get pushback from the UNICODE people, who may want us to shut down once they do it. Until then, we run the registry conservatively until told to do otherwise.
  * JH - I'm willing to add text that says "Hey UNICODE, if you want to do this, please!".
  * JK - OK, with two caveats: 1) There is history and possibility with a fit but no registry, and we need to be prepared for that to happen. 2) JH was talking about someone taking over, and that usually has problems with backwards compatibility; expectations and requirements can change between two groups. As long as we're comfortable with some compatibility issues, then we can do this.
  * MB - It sounds like we have an update to the document.
  * JH - 1) We're not going to adopt this until you make it clear what we're doing. 2) Define a registry, with the caveats mentioned earlier.
  * JK - For those interested in this problem: no one agrees on when a number is a number.
  * PSA - I really think they're talking about UNICODE special-casing, and I don't think we need a registry. I think we need to discuss more on the list.
  * JH - I think we have a registry of what the mappings possibly are, and we hope that the UNICODE people take over this registry, and we accept that.
  * PR - We can say this registry is a stopgap, and expect someone (UNICODE) to come up with a more complete explanation of mappings.
    We're not declaring ourselves experts, only that we need to try and get something done.
  * JK - And it's good that we don't point specifically at the UNICODE consortium, but that we explain our problem for someone to adopt.
  * YY - So do we take this version, or do we want a revision?
  * MB - We probably want to wait for another revision before we try to adopt it as a WG item.
  * PR - Procedurally, it sounds like we know what direction we're going, so you might want to wait for the next version, once it explains the registry items more.

PRECIS Framework
----------------
1 - Activity since '83
2 - Processing Order
  * JK - IDNA2008 only talks about validation on normalized strings. I'm not sure what you're getting at, but IDNA2008 talks about normalization then validation.
  * PSA - I'm sorry; this slide is exactly wrong. We currently specify validate-then-normalize; IDNA specifies normalize-then-validate.
3 - Space: Problem
4 - Space: Concerns
5 - Space: Solutions?
  * M&M - I talked with Alexey on this. There are still a lot of people using spaces, since that's the way their LDAP was set up. DN, for example. We might not be able to switch people away from this.
  * PSA - Agree. How do we handle it? I think I prefer doing this in the application layer rather than coming up with another base class.
  * Resnick - Are there any protocols that allow other-than-ASCII-space?
  * M&M - I've only ever seen space.
  * Resnick - For SASL, etc., they should do an application-layer parsing bit where they allow compound = name [1*(SP name)]
  * PSA - Need to take this to security; not enough input from LDAP. RFC ? maps all space characters to U+0020, so we should be ok.
  * Alan DeKok: RADIUS says usernames are *anything* UTF8, but in practice people avoid spaces because they break things. Things may not be as bad as we think.
  * Marc: This is just a framework with base classes.
  * Alan: RADIUS doesn't want to define its own.
  * Marc: That's why we're here.
  * JCK: Do you require those strings to be normalized?
  * Alan: No.
    It's less of a problem in practice than you expect. The names and passwords tend to be opaque blobs. All that's important is that it's the same every time.
  * JCK: RFC ? specifies normalized and that unassigned codepoints MUST be rejected by intermediaries, but at least everyone ignores the doc today.
  * Pete: Alan has confirmed what we thought about NameClass. Stuff that didn't work will be fixed. Alexey, does the compound name approach work?
  * Alexey (anonymously): That's a great hack. It kind of works, but I'm not sure why we make it so complicated.
  * Pete: We're trying to come up with a generally useful class for what a username is. We know that spaces get screwed up in various places when the string is supposed to be used as an identifier. We've also heard they're ignored in RADIUS for some reason. This is a solid class WITHOUT spaces, but it is really painful if we need them. If SASL was mapping them, and SASL needs space, then just throwing in the space is ok, but the guidance for future protocols is to just throw out spaces.
  * JH - Having this would make it useful to disallow leading and trailing spaces. There is also precedent in DNS, where it's label.label, not just label.
  * Anon - It's ok, but it's hackish.
  * JH - I'm happy to be unhappy for you.
  * PSA - I agree with PR, and this disallows us from using spaces in future protocols.
  * MK - Just saying "SHOULD NOT use spaces" would be just as good.
  * PSA - The conclusion is that Alexey and I work on a -01 that includes the CompoundNameClass and has some other fixes.
6 - Remaining Tasks
  * Once the space issue is done [and this list], then we might ask the WG if it should be adopted.

Replacing SASLPrep
------------------
1 - Recap
2 - Open Issues (1)
3 - Open Issues (2)
  * AM - Ideally, I'd like to try the new algorithm, then try the old algorithm, and compare the output.
  * JH - Even before that data comes in, I think I'll be telling my users that if there's an incompatibility issue, then "too bad", because they're probably attacking us in some way.
  * PR - Are you storing the pre-mapped or the post-mapped names?
  * JH - We only store the mapped names, so if they stop typing the stupid thing, they'll start working.
  * PR - That's ok.
  * JH - I don't think anyone is actually doing this, but if they are, it's because they are trying to attack us.
  * MB - How many have read the document? (*crickets*)
  * MB - I think we need more reviews.
  * PR - I think you still want to ask for adoption, unless you want to completely rule it out. I have absolutely no issue with adopting this draft, but the chairs need to be aggressive about chasing reviews down.
  * MB - I see it as a two-step process: start gentle, then pursue more aggressively.
  * MB - I'm going to look for reviews, then call for adoption once we have some.
  * AM - I'm sure we'll post -01 very shortly.
  * MM - I was about to read it, then was told that -01 would come out soon. I'll read it once -01 is out.

PRECIS Nickname
---------------
  * MB - How many have read the draft? (a few hands)
  * MB - I'm not sure if this one, which depends on the framework, will create a queuing problem.
  * PSA - I think the nickname document is pretty straightforward. I'm not deeply concerned about technical issues, but rather about time delays.
  * PR - Because of the dependency, some people will want to get this out the door ASAP. I don't like individual submissions, because I want the shepherd to do the same things that WG chairs do, so I'm happier if this were a WG item with the WG shepherding it through. It looks closely enough related to the WG that it would be fine with me, and this WG knows the right people to go to.
  * MB - I think this document is within the charter (this is part of replacing stringprep).
  * PSA - Those who really want to get their documents published, and whose documents depend on this one, might be coerced into reviewing the framework document.
  * MB - If this goes to WGLC, then we would forward it to other WGs that are using this.
  * Ben Campbell - I'm the chair of SIMPLE, and we've reviewed it for us.
    All of the feedback was positive, the only concern being the timing. The general thought was "I'm really glad Peter did this because we didn't have the expertise to do this". We wish we weren't sitting in the queue for it, but are resigned to doing that.
  * MB - Readers: do you want to adopt this? (some hands) Opposed? (none)
  * PSA - This is also applicable to XMPP, and we should get feedback from there, but some of them found this acceptable.
  * JH - I'm fine with this document, but in general are we going to adopt the documents from others that operate as stringprep replacements?
  * MB - I think so, yes.
  * JH - If another WG decided to do this work themselves, would we pitch a fit?
  * MB - We might (-:
  * PR - As AD, we would push this work into this WG, and we're not worried about WGs trying to keep it.

PRECIS Framework Implementation
-------------------------------
1 - Purposes of the implementation
2 - Implementations (1/2)
3 - (2/2)
4 - Findings (1/5)
5 - (2/5)
6 - (3/5)
7 - (4/5)
  * PR - Question: is this the thing that PSA was referring to about normalization before validation? And if you are getting different results, then we want to do the normalization before the validation.
  * JH - To be clear, this is the decomposition. Not just Hangul; there are going to be others.
  * Joseph - Do these have different widths? Then never mind (-:
  * JK - In general, normalizing before validating solves a lot of problems.
  * JH - Now is the time to change the framework, before anyone else implements it.
8 - (5/5)
  * MB - Thank you for the implementation report; this helps formulate our work. No promises, but we have an implementation that also generates the tables, so a second source would be good.
  * YY - We want more feedback on this report. We also need other implementations to provide profiles.
  * PR - This WG session has made me much calmer and happier. It's still like pulling teeth to get reviews, but it looks like we're now moving in the right direction.
    Now you'll need to chase down people to get these reviews, and get multiple angles on each review. But once you have that, and the writeups about each, that's exactly what I need. Thank you, and thank you from the authors; this is exactly the information we need.
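
Illustrative note on processing order: the validate-then-normalize versus normalize-then-validate question raised under "Processing Order" and "Findings (4/5)" can be shown with a minimal Python sketch. Nothing here is taken from any draft. The validity rule `is_valid` and both function names are hypothetical, and NFC is an assumption chosen only to reproduce the Hangul decomposition case JH mentioned.

```python
import unicodedata


def is_valid(s: str) -> bool:
    """Toy validity rule (hypothetical, NOT the PRECIS rule set):
    allow ASCII letters/digits and precomposed Hangul syllables only."""
    for ch in s:
        ascii_alnum = ch.isascii() and ch.isalnum()
        hangul_syllable = 0xAC00 <= ord(ch) <= 0xD7A3
        if not (ascii_alnum or hangul_syllable):
            return False
    return True


def validate_then_normalize(s: str, form: str = "NFC"):
    """Check the raw input first; normalize only if it passed."""
    if not is_valid(s):
        return None
    return unicodedata.normalize(form, s)


def normalize_then_validate(s: str, form: str = "NFC"):
    """Normalize first; check the normalized result."""
    normalized = unicodedata.normalize(form, s)
    return normalized if is_valid(normalized) else None


# Conjoining jamo U+1100 U+1161: the decomposed form of U+AC00.
decomposed = "\u1100\u1161"

# Rejected: the raw jamo are outside the toy rule's allowed set.
first_order = validate_then_normalize(decomposed)
# Accepted: NFC composes the jamo into the syllable before the check.
second_order = normalize_then_validate(decomposed)
```

With this toy rule the two orders disagree on the decomposed input and agree on plain ASCII input, which is the kind of difference the implementation report pointed at.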