Doug Ewell wrote:
Just a few minor nits here. This is slightly more complex than meets the eye, unfortunately.
Yep.
- singletons in the first position (except for 'x' and the grandfathered list)

Sadly, the existence of the grandfathered list means that all well-formed processors must also do a limited amount of validity checking. I don't question the importance of maintaining support for the grandfathered tags, but this is a side effect.
Agreed: all RFC 3066bis processors have to have the list of grandfathered tags baked in.
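As a rough sketch of what "baked in" means for the first-position singleton check, assuming Python and a hypothetical helper named first_subtag_ok: the set below is a deliberately incomplete sample of grandfathered tags, used only for illustration, and a real processor would carry the complete list.

```python
# Incomplete sample for illustration only (an assumption of this sketch);
# a real processor needs the complete grandfathered list.
GRANDFATHERED_SAMPLE = {"i-default", "i-klingon", "i-navajo"}


def first_subtag_ok(tag):
    """Return False when the tag starts with a disallowed singleton."""
    first = tag.split("-")[0]
    if len(first) != 1:
        # Not a singleton; other well-formedness rules decide the rest.
        return True
    if first.lower() == "x":
        # Private use is allowed to lead the tag.
        return True
    # Any other leading singleton is acceptable only for a grandfathered tag,
    # compared case-insensitively against the whole tag.
    return tag.lower() in GRANDFATHERED_SAMPLE
```

The point of the sketch is that the check cannot be done from syntax alone: "i-klingon" and "i-notreal" look the same to a parser, and only the lookup tells them apart.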
- missing subtag ("--")
- a dangling hyphen ("foo-bar-baz-") or initial hyphen ("-foo-bar-baz")

The second is really just a special case of the first: a missing subtag at the end or beginning, respectively. One thing I found useful, when building my validator, was to parse out the subtags first and check them for validity afterward, so the hyphens never become part of the validity checking per se.
I did the same thing. However, one must check the hyphens. Tokenizers sometimes do not return "empty" tokens and can miss these cases.
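To illustrate the tokenizer pitfall, here is a minimal sketch in Python. The function names are hypothetical. It relies on the fact that str.split with an explicit separator keeps empty strings, whereas a tokenizer that only returns runs of alphanumerics silently drops them.

```python
import re


def split_subtags(tag):
    """Split on hyphens, keeping empty strings so missing subtags stay visible."""
    return tag.split("-")


def has_missing_subtag(tag):
    # An empty token means "--", a leading hyphen, or a trailing hyphen.
    return any(sub == "" for sub in split_subtags(tag))


def lossy_tokens(tag):
    # A tokenizer that yields only non-empty alphanumeric runs hides the problem.
    return re.findall(r"[A-Za-z0-9]+", tag)
```

With this sketch, has_missing_subtag flags "foo--bar", "foo-bar-baz-" and "-foo-bar-baz", while lossy_tokens returns the same three tokens for "foo-bar-baz-" as for "foo-bar-baz", which is exactly how the malformed cases get missed.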
"ab-x-abc-x-abc" // anything goes after xNot quite anything, of course: 1*("-" (1*8alphanum))
Yes. At least one alphanumeric subtag must follow 'x', and no subtag after it can exceed eight characters in length.
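The rule quoted above, 'x' followed by 1*("-" (1*8alphanum)), can be sketched as a regular expression. This is a minimal Python illustration with a hypothetical function name; it checks only the private-use sequence starting at 'x', not a complete language tag.

```python
import re

# "x" followed by one or more "-subtag" groups, each subtag being
# 1 to 8 ASCII alphanumerics; matched case-insensitively.
PRIVATEUSE = re.compile(r"x(?:-[A-Za-z0-9]{1,8})+", re.IGNORECASE)


def is_privateuse_sequence(s):
    """Return True when s is a well-formed private-use sequence."""
    return PRIVATEUSE.fullmatch(s) is not None
```

Under this sketch "x-abc-x-abc" is accepted, because a later "x" is just another one-character alphanumeric subtag, while a bare "x", "x-", and "x-abcdefghi" (nine characters) are rejected.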
Addison

--
Addison Phillips
Globalization Architect -- Yahoo! Inc.

Internationalization is an architecture.
It is not a feature.

_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www1.ietf.org/mailman/listinfo/ltru