I don't want to be facing 8-bit bugs in 2013

"D. J. Bernstein" <djb@cr.yp.to> Wed, 20 March 2002 04:40 UTC

Date: Wed, 20 Mar 2002 04:32:50 -0000
Message-ID: <20020320043250.20715.qmail@cr.yp.to>
Mail-Followup-To: ietf@ietf.org, iesg@ietf.org, iab@isi.edu, idn@ops.ietf.org
Automatic-Legal-Notices: See http://cr.yp.to/mailcopyright.html.
From: "D. J. Bernstein" <djb@cr.yp.to>
To: idn@ops.ietf.org
Cc: ietf@ietf.org, iesg@ietf.org, iab@isi.edu
Subject: I don't want to be facing 8-bit bugs in 2013
References: <E16nW11-000JhG-00@psg.com> <20020319204407.M32897@iconoplex.co.uk>

Paul Robinson writes:
> You tell him that although it's gobbledygook to people without greek
> alphabet support, it will still work. It's not convenient, but it WILL
> work. Guaranteed.

False. IDNA does _not_ work. IDNA causes interoperability failures. Mail
will bounce, for example, in situations where ASCII domain names would
have worked fine. IDNA coauthor Adam Costello has admitted this.

> And that maybe replacing the DNS resolver on all the machines out
> there to be able to do lookups with PunyCode might be a TAD more 
> realistic than trying to get EVERYTHING, EVERYWHERE to be good with
> 8-bit?

Here you are assuming that the only problem is the DNS resolver---that
the conversion between the local character encoding and the IDNA
character encoding can be handled entirely by the DNS resolver.

That assumption is false. Consider, for example, an MTA configured to
accept mail for pi.cr.yp.to, with a Greek pi. The MTA compares the
incoming domain name to pi.cr.yp.to. That doesn't involve the resolver.
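
A minimal sketch of that comparison, assuming the configuration file stores
the name in UTF-8 and the incoming name arrives in ASCII-compatible form.
The function names are hypothetical, and Python's built-in "idna" codec
follows RFC 3490 as later published (with the xn-- prefix), which may differ
from the drafts under discussion here.

```python
# Hypothetical MTA check: is the recipient domain one we accept mail for?
# The name is stored exactly as a sysadmin might type it in a UTF-8 config file.
LOCAL_DOMAINS = {"π.cr.yp.to"}


def accepts_naive(domain: str) -> bool:
    """Plain string comparison, as an IDNA-unaware MTA would do."""
    return domain.lower() in LOCAL_DOMAINS


def to_ace(domain: str) -> str:
    """Convert a domain to its ASCII-compatible form (RFC 3490 ToASCII)."""
    return domain.encode("idna").decode("ascii")


def accepts_idna_aware(domain: str) -> bool:
    """Compare in ASCII-compatible form; the MTA itself must do this work."""
    local_ace = {to_ace(d).lower() for d in LOCAL_DOMAINS}
    return to_ace(domain).lower() in local_ace


incoming = "xn--1xa.cr.yp.to"        # ASCII-compatible form of π.cr.yp.to
print(accepts_naive(incoming))       # False: no resolver involved, no match
print(accepts_idna_aware(incoming))  # True, but only because the MTA was changed
```

The resolver is never called in either function; the conversion has to live
in the MTA's own comparison code.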

People who say that IDN is purely a DNS issue are confused.

> Making every piece of software and display device that might ever have
> to deal with IDNs capable of handling UTF-8?

Here you're being simultaneously inconsistent and shortsighted. Fixing
bad displays is part of the cost of IDNs. In the context of UTF-8, you
agree with me that this is a cost; in the context of IDNA, you ignore
the cost completely.

In fact, the cost of fixing UTF-8 displays is much _smaller_ than the
cost of fixing IDNA displays. UTF-8 has been around for many years, has
built up incredible momentum (as illustrated by RFC 2277), and already
works in a huge number of programs.

The extra programs hurt by IDNA aren't just UTF-8-aware clients. Fixing
the IDNA display failures also means changing web servers, mail servers,
DNS servers, etc., so that the sysadmin can put a properly displayed IDN
into his server configuration files. Think about the above pi.cr.yp.to
example again.
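
As a rough illustration of the display side, assuming a program that already
decodes UTF-8 but has no IDNA support (function names are hypothetical; the
"idna" codec is Python's RFC 3490 implementation, not the 2002 draft):

```python
def display_utf8_only(wire_name: bytes) -> str:
    """A UTF-8-aware program with no IDNA knowledge."""
    return wire_name.decode("utf-8")


def display_idna_aware(wire_name: bytes) -> str:
    """A program that was additionally changed to decode the ASCII form."""
    return wire_name.decode("idna")


name = "π.cr.yp.to"
as_utf8 = name.encode("utf-8")   # name carried as UTF-8
as_ace = name.encode("idna")     # name carried in ASCII-compatible form

print(display_utf8_only(as_utf8))   # π.cr.yp.to
print(display_utf8_only(as_ace))    # xn--1xa.cr.yp.to
print(display_idna_aware(as_ace))   # π.cr.yp.to
```

The existing UTF-8 program shows the intended name only when the name is
carried as UTF-8; the ASCII-compatible form needs an extra decoding step in
every program that displays it.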

> The solutions you do offer will take at least 4 years IMHO to be
> effective

Let's suppose 4 years is right, and let's compare the results to IDNA
after 4 years.

IDNC3 requires 8-bit fixes to some widely deployed programs, certainly.
But IDNA needs _much larger_ changes in _many more_ programs. So, after
the same 4 years, only a fraction of the IDNA work will be done. IDNA
will still have an incredible number of display failures, plus the
interoperability failures and all the other IDNA problems.

Even worse, IDNA doesn't do _anything_ to fix the other half of the
email problem. Do you seriously believe that Chinese users will be
satisfied with email addresses where the domain part can contain Chinese
characters but the box part is still required to be ASCII? It's obvious
how to fix this with UTF-8; how, pray tell, do we fix it with IDNA?
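
A small sketch of the asymmetry, using a made-up address and Python's
RFC 3490 "idna" codec as a stand-in for IDNA. It shows only the encoding
step and says nothing about whether any particular MTA accepts the result.

```python
# Hypothetical address: 用户@例子.cn
local_part, domain = "用户", "例子.cn"

# IDNA defines an ASCII-compatible form for the domain part only.
ace_domain = domain.encode("idna")


def is_ascii(text: str) -> bool:
    """True if the text can be sent where only ASCII is permitted."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# The box part gets no help from IDNA: it is still non-ASCII.
print(is_ascii(ace_domain.decode("ascii")))  # True
print(is_ascii(local_part))                  # False

# UTF-8 treats both halves of the address the same way.
utf8_address = f"{local_part}@{domain}".encode("utf-8")
print(utf8_address.decode("utf-8"))          # 用户@例子.cn
```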

I presume that you're not one of the 7-bit-forever crackpots. How do you
propose migrating from IDNA to UTF-8? This is much more costly than
moving directly to UTF-8, because it needs a compatibility period during
which everyone supports two different encodings of the same character.
Doesn't it bother you that the IDNA documents don't discuss this at all?

What makes your position particularly shameful is the fact that people
proposed requiring 8-bit transparency _eleven years ago_. If it hadn't
been for Paul Vixie et al. making your ``it'll take years!'' argument
back then, we would have had 8-bit transparency today. Do you want to be
facing the same stupid bugs in another eleven years?

> Until then, he has to get a 'normal' domain to see himself over.

Correct. Your example Greek user has an ASCII domain name that's always
displayed with an ASCII d instead of the truly desirable Greek delta. 

Now, please explain why the same user should prefer a domain name that's
_occasionally_ displayed with the desired delta but _usually_ displayed
as incomprehensible gobbledygook.

Your answer, of course, will be something like this: ``The gobbledygook
is a temporary problem. In twenty years, after the massive IDNA upgrade
is complete, everyone will see a delta!''

In short, you're looking at the long-term IDNA benefits (never mind the
interoperability failures and all the other problems) but refusing to
look at the long-term UTF-8 benefits. Inconsistent once again.

> Something should be done, but your document makes you look like a
> typical whiner - you point out all the problems, but offer no
> solutions to some of the problems you raise.

False. http://cr.yp.to/proto/idnc3.html explains how IDNC3 offers
solutions to every one of the IDNA problems that it points out:

   * interoperability failures;
   * inconsistent displays of the same name;
   * unnecessary implementation and deployment costs;
   * multiple semantically similar names;
   * identical displays of different names; and
   * typing failures.

Each solution is listed right next to the problem, so I can't imagine
how you missed this.

> What you are proposing IS introducing an interoperability failure,

False. Every step in http://cr.yp.to/proto/idnc3.html preserves
interoperability.

> How are you proposing to display alpha-ol.com on a VT100?

There are several options. One option is to work around the hardware
limitations in software, displaying something like

          |
   /\/ /\ |   /^ /\ /\ /\
   \/\ \/ | * \_ \/ | | |

Another, much more popular, option is to move your email reading, web
browsing, etc. from your 1970s-vintage VT100 to a graphics terminal.
Have you considered the VT340, for example? Or an IBM PC, model 5150?

> a good 20% of sites out there will just have to shut down ops permanently

Get a grip, Paul.

---D. J. Bernstein, Associate Professor, Department of Mathematics,
Statistics, and Computer Science, University of Illinois at Chicago