Structural comment below. For Members' corporate people. At 08:02 09/08/2005, Frank Ellermann wrote:
Doug Ewell wrote:

>> Fair enough, 3066bis doesn't allow a digit as singleton
> Yes it does:
> singleton = %x41-57 / %x59-5A / %x61-77 / %x79-7A / DIGIT

Sigh, and I thought that I know the ABNF by heart. So far for that idea. Jefsey has to pick another "escape" character for his "not-3066bis-scheme". Maybe "!" or "$".

> left over from the days before the ABNF was expanded to allow
> digits (I forget when this was).

Added in -01, in -00 it was still without DIGIT. No idea what the old pre-LTRU drafts did, I don't find them anymore on Addison's pages.

> Judging from all the attention this ABNF has gotten, I'd say
> the correct move is to trust the ABNF over the prose.

Yes, although I must admit that I don't recall _why_ DIGIT was added. John proposed to sort extensions alphabetically, and IIRC we discussed to exclude "y" and "z" - both ideas were rejected. I wanted "x" instead of "x" / "X" because in ABNF "x" is already case insensitive, also rejected, and that's all I can say without digging through the archive.
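For anyone who wants to check the quoted rule by hand, here is a minimal Python sketch of the singleton production exactly as Doug quotes it above (letters A-W and Y-Z in either case, plus DIGIT). The name `is_singleton` is only illustrative and does not come from the draft.

```python
import re

# singleton = %x41-57 / %x59-5A / %x61-77 / %x79-7A / DIGIT
# i.e. A-W, Y-Z, a-w, y-z and 0-9; "x" and "X" are left out of the rule.
_SINGLETON = re.compile(r"[A-WY-Za-wy-z0-9]")


def is_singleton(subtag: str) -> bool:
    """Return True if subtag is exactly one character allowed by the quoted rule."""
    return _SINGLETON.fullmatch(subtag) is not None
```

Under this rule a digit such as "0" is accepted as a singleton, while "x", "!" and "$" are not.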
Dear Frank,
RFC 3066 permitted digits in subtags. What Doug says is "I cannot support it in the document I write since I did not write that I would". This kind of position will obviously not hold forever in front of the IESG, IAB, GAC and WTO, but it can confuse people enough to hold for years. This is why I am spending so much time explaining the obvious again and again: to give "lawyers" enough quotes to discuss the bad or good faith of the authors.
ABNF is not the point. You may note that only you and Lee, who are motivated to find a technical solution in an IETF fashion, are seriously following this technical thread. The point is political and commercial.
Why? And is it bad or not?
Just try to figure out who the customers are here. Who has money in the language-related industries? Who prints, sells and buys books? Typographers, Publishers and Libraries. What is their common target? To sell or rent books to readers. Who are the readers? More and more, Internet users. The customers of the book industries are moving away from books. There are two things to do: make books cheaper, and move the book industry onto the Internet (and in so doing try to protect the existing market shares and to invade the share of the slow movers) with the minimum of compromise with the natives (Yahoo!, Google, etc.).
Study http://www.unicode.org/history/boardmembers.html and http://www.unicode.org/consortium/directors.html. Unicode is certainly the most active and talented (*) place where these concerns can be discussed and acted upon. It certainly gathers representatives of the market and customer leaders. It is likely that its propositions carry enormous weight, but they also call for serious political attention. This attention comes from the competitors and partners of its Directors' and Executives' corporations, and from cultural authorities. It comes from Governments, because of this weight and the possible direct or indirect impact on the most sensitive human areas: culture, the memory of civilisations, political influence, religion, human relations, etc. You may have noted the force and the speed of the European reactions to the Google digital library announcement.
(*) Some individual members are: Harald Alvestrand, John Cowan, Martin Dürst, Doug Ewell, Jean-François Morfin; some corporate Members (whose employees also form the broad majority of the consortium's directors and executives) are: RLG (whose BoD Member is the only one to have stayed since the very beginning), Microsoft, IBM, Apple, Cisco. Verisign has just joined as a Member.
What is the target of the people gathered in Unicode? "The Unicode Consortium is a non-profit organisation founded to develop, extend and promote use of the Unicode Standard, which specifies the representation of text in modern software products and standards." Unicode is therefore an active participant in the W3C (web standards) and in ISO (international standards).
Now, read http://ietf.org/overview.html. It starts by saying "The Internet Engineering Task Force (IETF) is a large open international community of network designers, operators, vendors, and researchers concerned with the evolution of the Internet architecture and the smooth operation of the Internet."
The first problem is here. Unicode (Mark Davis' culture) is about characters in texts, and the W3C (Addison Phillips' culture) is about the Web, i.e. modern software to print, display or speak texts. But the IETF is about Internet architecture, and the Draft does not bridge the huge gap between one of the network's applications, however big and however historically rooted it is, and the network architecture. The only way they have to keep the two together is to "constrain" the network into their typographer/printer culture. This obviously cannot work. Typographers, printers and bookstore chains are different trades.
The second problem is that the customers' needs are outdated and the book industry does not know how to meet the challenge of our time. People like Peter Constable and me certainly try to help in our own ways - he as an employee, I as an applied researcher. This is the very same problem as for Telcos and music publishers. Book publishers and Libraries have been impacted more slowly and over a longer time than Music publishers and Media, but the problem is similar. Music publishers introduced RPM, iPod, etc. Book Publishers and Libraries introduce RFIDs and langtags.
Is that a bad idea? No.
It even looks very promising, both for Libraries and Publishers and for software and service providers: a single basis on which to build good organisation and market protection control. Every text can be tagged with a langtag; we will find them everywhere (RFID on the books in bookstores, Libraries, Colleges, etc.; Operating System locales, Windows and Linux alike), with an easy-to-add ISSN when printed. The savings in cataloguing, management and organisation can be huge for publishers, libraries, etc. This is very urgent, as the printer increasingly challenges the book. The need is to level the practical cost of satisfaction of a book and of a printout, and to get the copyrights paid in both cases. It is the same problem as Sony and Napster. The solution is like the iPod: the Library of Congress on a USB iLib, which can plug into any book printer to come and stay compatible whatever the version of the text over the centuries.
At the same time, langtags make it easier to sweep through the web, reading every web page, sorting pages by langtag, scanning them for incorrect words, copyright infringement, etc., and printing the daily police/lawyer report for copyright actions.
I obviously only give the rough outline and do not go into the details of how it helps to reduce costs, improve quality, increase security, etc. through system convergence, simplification, etc.
So, you understand why this Jefsey is a very bad/poor stupid troll, making "industry" waste huge amounts of money with his "delaying practices".
You also understand that if his "0-" were to be accepted the whole scheme would be useless. The whole point is format exclusiveness, so that _every_ book and page in the world can fit in a single industry-controlled format.
Every other attempt MUST be excluded. And here is the huge "but". Because it scales and stays.
The supposed obligation of backward compatibility with a never-used format is only a bluff. This is probably one of the reasons why the WG-ltru never wanted to consider its Charter. The Charter says something quite different: "It is expected to specify a mechanism for easily identifying the role of each subtag in the language tag, so that, for example, whenever a script code or country code is present in the tag it can be extracted, even without access to a current version of the registry. Such a mechanism would clearly distinguish between well-formed and valid language tags, to allow for maximal compatibility between implementations released at different times, and thus using different versions of the registry."
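To make the Charter's wording concrete, here is a rough Python sketch of what "extracting a script code or country code without the registry" can look like. It is only an illustration under stated assumptions: the shape rules (a 4-letter subtag is a script, a 2-letter or 3-digit subtag is a region, a single-character subtag opens an extension or private-use part) are those later published in RFC 4646, not something stated in this message, and the function names are hypothetical. It assumes a well-formed tag and does not validate anything.

```python
def _is_ascii_alpha(s: str) -> bool:
    return s.isascii() and s.isalpha()


def _is_ascii_digit(s: str) -> bool:
    return s.isascii() and s.isdigit()


def extract_script_region(tag: str):
    """Guess (script, region) from subtag shape alone, with no registry lookup.

    Assumption: the tag is well-formed and follows RFC 4646 shape conventions.
    """
    script = None
    region = None
    # The first subtag is the primary language; only later subtags are inspected.
    for sub in tag.split("-")[1:]:
        if len(sub) == 1:
            # A singleton (or "x") starts an extension or private-use part: stop.
            break
        if script is None and region is None and len(sub) == 4 and _is_ascii_alpha(sub):
            script = sub
        elif region is None and (
            (len(sub) == 2 and _is_ascii_alpha(sub))
            or (len(sub) == 3 and _is_ascii_digit(sub))
        ):
            region = sub
    return script, region
```

For instance, `extract_script_region("zh-Hant-TW")` gives `("Hant", "TW")` and `extract_script_region("es-419")` gives `(None, "419")`. Whether those codes are actually registered is a separate question, which is the well-formed versus valid distinction the Charter draws.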
Jefsey is also quite worrying, because he has worked on the matter for 30 years (among other things, I came to networking in 1976 through discussions with Xerox, Compugraphic, etc. about an international network to print books. In 1979 I studied, with all the large newspapers that did not print photos - Le Monde, la Stampa, Die Welt, etc. - simultaneous local editions over the international public network. I ran my own daily economic newsfaxer for five years, with editions on Minitel, professional papers, etc.). He knows networks (I am first a Navy communication officer, dealing with secure world networks since 1971). He possibly has the technical user-centric architecture to change all that (I do). He knows the future; and the future he knows is probably quite different from the past the industry wants to protect (it extends it, it does not kill it). There is no way to let his ideas and demands through. He must be excluded, whatever the way (including anonymous telephone threats; what next?).
Another problem with Jefsey: he is not a native English speaker. So he knows more, and he is engaged in structures, solutions and propositions that do not want to use English as the language of reference. This is quite worrying for several reasons.
- One reason is that the industry is mainly US-driven and has no collective knowledge, experience or practice of languages at this level. Multilingualism is a difficult and complex area. Stakeholders prefer to address it as internationalisation+localisation (http://www.w3.org/TR/2005/WD-itsreq-20050805), which permits them (they think, but this is one of the main technical errors, due to a lack of complete analysis and modelling) to simplify the issue and to control it from the computer root (via CLDR: the control of locale files).
- Another reason is that the industry is also led by market leaders which happen to be American and therefore benefit from the American commercial environment, and which are under European scrutiny for a possible dominant market position.
- Multilingualism is also a threat of balkanisation of the network, which would kill the acquired vision of the network and would remove the partial protection provided by the Hollywood story of "an American-built Internet" (Jefsey knows the truth: danger). You can sue a Chinese user in the US for having violated a copyright on the Internet. This is more difficult if the violation was on the "Chinese Internet", with Chinese domain names. The law will be the same and the eventual decision will be the same, but the cost balance will switch: it will be cheaper for the Chinese user, and more expensive for the multinational stakeholder.
Another problem with Jefsey is that he considers the "Web" as only one application on the network and not as the main system. The langtags MUST be the internal core of the Internet system in order to be on every text. Otherwise the scheme will never be powerful enough to work.
There are obviously many other problems with Jefsey. I just wanted to mention a few. A last worrying one: he is impressed by technical pertinence and vision, not by corporate positions.
Now that I have explained the why, it may be interesting to discuss the bad/good issue.
As often in periods of dramatic industrial change, industries fight for survival by making incorrect, short-sighted choices based upon immediate self-protection (innovative quick consensus is quite unusual). The best way to help them is to accompany their change, first by trying to understand its reasons. Often, as in this case, the change proceeds in several phases due to the hysteresis of the interaction with technology and users: timing is important. In fighting my graduated propositions to move carefully from one step to another, the Draft affinity group actually represents a huge danger for the industry it wants to protect. They waste time and modern solutions in fighting the last war. One can easily see that from the lack of definition of the terms they use: defining them would show that they confuse the contexts. As a consequence, their resulting proposition will not be acceptable to their competitors, to their market partners, to governments, etc., and is of no real interest to their "end-users" (an old notion they have not yet departed from).
It is not far. But it is not acceptable. A simple explanation is that they start from where they come from (typographers) instead of considering where they want to arrive (distributed networks). Also, they have the wrong economic model. Their model is to reduce costs through simplification, which means rigidity. This could work if the market were dwindling. But the market is exploding, not in the way they see it (publishing/renting stocked books) but in the way the users see it (multimedia access to knowledge). Scalability and simplicity are confused with limitation: they do not mean fewer options; they mean an infinite number of easy-to-get options and only a few built-in defaults.
I must say they are not helped by the current state of the world's thinking. The WSIS is a major misconception for them. "Information" is currently understood as a mix between technical data and the French 1960s "informatique". The WSIS was initially a first attempt by librarians to address their problem. It went out of their control, for the same reason that langtags will escape them. The problem is new and complex, and an approach that protects the past can only be increasingly inadequate. What they need is a solid, open, stable, innovative and secure generalised network framework in which they can lodge their own solutions. Unfortunately the Internet does not fully deliver it yet. This is why, before addressing the book industry's needs, the International Network system must first become a Multilingual Global NGN. The advantage of the Majors over Unicode is that Music is a single universal language. But pictures are multilingual, with a larger possible diversity of languages than texts. MPEG should be included in the analysis.
The solution IMHO is to proceed step by step: considering immediate interests, ending with exclusion, but carefully discussing timing, experimentation and deployment areas, until one can agree on a future economic model. We can discuss that too.
It will necessarily include the convergence of various forms of data providers (search engines, domain names, libraries, mail, etc.), of content structures (architexts, multimedia, services), of storage (memorisation technologies are a key issue), of payment means (probably a secure micropayment support), etc. Then the common standard MUST address the needs of EVERYONE: because it is legal, because it is the IETF objective, but most of all because it is business-efficient in a network.
And what if Jefsey finds a trick to proceed otherwise?
Take care.
jfc
_______________________________________________
Ltru mailing list
Ltru at lists.ietf.org
https://www1.ietf.org/mailman/listinfo/ltru
Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.