[as a technical contributor]
Trying to understand how difficult it is to implement the
'no two identical singleton extensions' restriction for
well-formedness in RFC 4646, I implemented this in Ruby
last evening, and in C this morning.
The C code is below; it took me about an hour. Compared to
a single call to a regular expression engine, the code is
somewhat lengthy, but nothing terribly complicated. And in
C, regular expressions don't come for free. Although there
are some good libraries, using one adds quite a bit to
configuration/setup. So overall, I don't see the 'no two identical
singleton extensions' restriction as too much of a burden. And you
are absolutely free to use the code below; it's a Christmas present
from me to everybody :-). But I haven't tested it, so it may contain
some bugs.
As for Ruby, this is part of a larger, still unfinished
project. My observations there are: 1) the code for
checking for identical singleton extensions is shorter
than the code of the regular expression needed to check
for the rest of well-formedness. 2) in a slightly larger
context (allowing access to various parts of the language
tag via object-oriented methods), the code required for
checking for identical singleton extensions is negligible.
My two datapoints are in some way extremes (a low-level
and a high-level programming language). But based on these,
I don't see the need for removing the current 'no identical
singleton extensions' restriction from well-formedness.
Regards, Martin.
#include <ctype.h>
#include <stdio.h>

/* function to check that there are no two
   identical singleton extensions in a BCP 47 language tag
   returns number of duplicated singletons */
int checkExtensions (char *tag)
{
    int count[256];
    int i;
    char *p;
    int state=1;   /* 1 while p is at the first character of a subtag */
    int error=0;

    for (i=0; i<256; i++)
        count[i] = 0;
    for (p=tag; *p; p++) {
        if (state) { /* could be start of singleton */
            if (*(p+1)=='-' || *(p+1)==0x0) {
                /* private use ends everything; tags are case-insensitive */
                if (toupper((unsigned char)*p)=='X')
                    break;
                /* singleton found, count it (cast avoids a negative index) */
                count[toupper((unsigned char)*p)]++;
            }
            state = 0;
        }
        else if (*p=='-') /* separator: next character starts a subtag */
            state = 1;
    }
    for (i=0; i<256; i++)
        if (count[i]>1) {
            error += count[i]-1;
            /* change to your favorite error message or other behavior */
            fprintf (stderr, "Repeated singleton detected: %c.\n", i);
        }
    return error;
}
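
For anyone who wants to experiment with the check without the stderr
output, here is a minimal sketch of the same idea written as a loop over
whole subtags rather than a character-level state flag. It is a variant,
not the code above: the name count_repeated_singletons and the const
parameter are assumptions made for this illustration. Like the function
above, it treats any one-character subtag as a singleton and stops at a
private-use 'x' (in either case).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (name chosen for this sketch): returns how many
   one-character subtags in `tag` repeat an earlier one, comparing
   case-insensitively. Nothing is printed. */
int count_repeated_singletons(const char *tag)
{
    int seen[256] = {0};   /* seen[c] != 0 once singleton c has occurred */
    int repeats = 0;
    const char *p = tag;

    assert(tag != NULL);

    while (*p) {
        /* Locate the end of the current subtag. */
        const char *end = p;
        while (*end && *end != '-')
            end++;

        if (end - p == 1) {            /* one character: a singleton */
            int c = toupper((unsigned char)*p);
            if (c == 'X')              /* private use ends everything */
                break;
            if (seen[c])
                repeats++;
            seen[c] = 1;
        }

        /* Step over the separator, if there is one. */
        p = (*end == '-') ? end + 1 : end;
    }
    return repeats;
}
```

With this sketch, "en-a-bbb-a-ccc" gives 1 and "en-x-a-a" gives 0, because
everything after the private-use 'x' is ignored. The return value has the
same meaning as in checkExtensions: the number of singleton occurrences
beyond the first for each letter.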
#-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-# http://www.sw.it.aoyama.ac.jp mailto:duerst at it.aoyama.ac.jp