
Re: [sasl] Draft StringPrep/SASLPrep slides



On Tue, Nov 10, 2009 at 10:40:13PM +0900, Alexey Melnikov wrote:
> Not really slides, just a collection of issues/wishes for now...

A few comments:

 - SASLprep is SASLprep -- we can only make changes where those changes
   wouldn't break anything:

      Q: Can non-hashed query strings be sent unnormalized?
      A: Not if servers expect them to be normalized
      Q: Can we switch to a different NF?
      A: No.

   SASLprep is pretty much written in stone for the mechanisms that use
   it _now_.  We could hold up SCRAM for a new stringprep profile, but
   I doubt we'll want to (we can live with SASLprep, no?).

 - Clearly RFC3454 will require some updates if we're to have a new
   profile that we want to call a "stringprep" profile but which does
   things outside RFC3454, such as NF-insensitive string comparison or
   use of NFs other than KC.

   I don't think this is an obstacle.  But it will slow things down if
   we choose to update/obsolete RFC3454.

 - IMO query strings should be sent unnormalized -- normalization should
   always be delayed as long as possible, or avoided altogether where
   NF-insensitive string comparison will do (see more below).

   Normalization should only happen at certain very well defined points,
   such as:

    - as an internal detail of normalization-insensitive string cmp
    - prior to hashing a string (of any kind, query or storage)
    - prior to storing a string (which is thus a storage string)
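
   The three points above can be sketched in a few lines of Python.
   This is only an illustration, not anything SASLprep specifies: the
   helper names (ni_equal, hash_string, to_storage), the use of
   SHA-256, and the default of NFKC are all assumptions made for the
   example.

```python
import hashlib
import unicodedata

# Assumption: NFKC is used only because SASLprep uses it; any single
# agreed-upon form would serve for this illustration.
FORM = "NFKC"


def ni_equal(a: str, b: str, form: str = FORM) -> bool:
    """Normalization-insensitive compare: normalize only inside the compare."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


def hash_string(s: str, form: str = FORM) -> str:
    """Normalize immediately before hashing (hypothetical choice of SHA-256)."""
    normalized = unicodedata.normalize(form, s)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def to_storage(s: str, form: str = FORM) -> str:
    """Normalize immediately before storing; the result is a storage string."""
    return unicodedata.normalize(form, s)


composed = "caf\u00e9"      # precomposed e-acute
decomposed = "cafe\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# The raw strings differ, but every normalization point treats them alike.
print(composed == decomposed)                            # raw compare
print(ni_equal(composed, decomposed))                    # n-i compare
print(hash_string(composed) == hash_string(decomposed))  # hashes agree
```

   Note that the query strings themselves are never rewritten; only
   the compare, the hash input, and the stored copy are normalized.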

 - I've not made up my mind re: compatibility mappings.  Evidently new
   compatibility mappings can be added at any time (or so I'm told).
   That complicates Unicode version agility, which argues against
   using them.

   But use of K mappings removes some confusables, thus seems likely to
   be worthwhile (though, to be sure, we'll never have zero
   confusables).
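
   To make the confusables point concrete, here is a small Python
   sketch using the standard unicodedata module.  The sample
   characters are my own picks for illustration; the Cyrillic letter
   is there to show a confusable that K mappings do not remove.

```python
import unicodedata


def nfc(s: str) -> str:
    return unicodedata.normalize("NFC", s)


def nfkc(s: str) -> str:
    return unicodedata.normalize("NFKC", s)


# Compatibility characters that look like plain ASCII text.
fullwidth_a = "\uff21"   # FULLWIDTH LATIN CAPITAL LETTER A
fi_ligature = "\ufb01"   # LATIN SMALL LIGATURE FI

# A lookalike from another script; it has no compatibility mapping.
cyrillic_a = "\u0430"    # CYRILLIC SMALL LETTER A

for label, ch in [("fullwidth A", fullwidth_a),
                  ("fi ligature", fi_ligature),
                  ("Cyrillic a", cyrillic_a)]:
    # Show the codepoints each form produces for the sample.
    print(label,
          [f"U+{ord(c):04X}" for c in nfc(ch)],
          [f"U+{ord(c):04X}" for c in nfkc(ch)])
```

   NFC leaves all three samples alone; NFKC folds the first two to
   ASCII but leaves the Cyrillic letter distinct from Latin "a".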

 - NF negotiation may be appropriate in some protocols, but I can't
   think of which.  Leaving the NF completely unspecified can certainly
   be done, as it was, e.g., for NFSv4, but only if there is a single
   entity that will be doing hashing, and preferably only if a single
   entity will be doing normalization-insensitive string comparisons.

   (In the NFSv4 case there's no hashing, save as an implementation
   detail of the server, and all normalization-insensitive object name
   string comparisons happen in the server.  Therefore leaving the NF
   completely unspecified was reasonable, and IMO fortuitous.)

 - I don't know enough about Unicode version evolution, but if all we
   had to worry about were unassigned codepoints, then we'd be done,
   wouldn't we?  OTOH, if K mappings can be added at any time, then we
   have a problem.  (NFC is closed to new compositions, for example.)
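
   For the unassigned-codepoint half of the problem, an implementation
   can at least tell which codepoints its own Unicode tables don't
   know about.  A minimal Python sketch follows; the helper name is
   hypothetical, and general category Cn also covers noncharacters,
   so this is only an approximation of "unassigned".

```python
import unicodedata


def unassigned_codepoints(s: str) -> list:
    """List codepoints in s that the local Unicode database reports as Cn."""
    return [f"U+{ord(ch):04X}" for ch in s
            if unicodedata.category(ch) == "Cn"]


# The answer depends on the Unicode version the local library was
# built against, which is exactly the version-agility concern.
print("local Unicode version:", unicodedata.unidata_version)

# U+0378 is used here as an example of a codepoint with no assignment
# in the Unicode versions I'm aware of.
print(unassigned_codepoints("a\u0378b"))
print(unassigned_codepoints("abc"))
```

   Two peers built against different Unicode versions can disagree on
   this check for the same string.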

 - NFD makes the most sense to me, and should to anyone who has a
   generic Unicode normalization library: NFC is computed by first
   decomposing (NFD) and then recomposing, so NFD is cheaper than NFC.
   Plus, NFC == NFD, asymptotically :)

   However, some implementors may not have such a generic Unicode
   normalization library, and today may only be capable of NFC, or even
   NFKC.  That wouldn't sway me, but then, the choice of C or D is not
   all that consequential -- not as consequential as the to-K-or-not-K
   choice, for example.
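
   The relationship between NFD and NFC can be checked with any
   generic normalization library.  A sketch with Python's unicodedata
   (helper names are mine, chosen for the example):

```python
import unicodedata


def to_nfd(s: str) -> str:
    return unicodedata.normalize("NFD", s)


def to_nfc(s: str) -> str:
    return unicodedata.normalize("NFC", s)


def nfc_via_nfd(s: str) -> str:
    """Decompose first, then let NFC do the recomposition."""
    return to_nfc(to_nfd(s))


precomposed = "\u00e9"        # e-acute as one codepoint
# Combining marks given out of canonical order: acute (above) before
# dot below.
unordered = "a\u0301\u0323"

for s in (precomposed, unordered):
    # NFD output: decomposed and canonically reordered.
    print([f"U+{ord(c):04X}" for c in to_nfd(s)])
    # Recomposing the NFD output gives the same result as NFC directly.
    print(nfc_via_nfd(s) == to_nfc(s))
```

   Decomposition and canonical reordering happen either way; NFC adds
   a composition pass on top.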

Nico
-- 

Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.