
[Ltru] Re: Test suite for language tags?



On Sat, Aug 05, 2006 at 04:33:38PM +0200,
 Frank Ellermann <nobody at xyzzy.claranet.de> wrote 
 a message of 18 lines which said:

> > The best thing would probably be to create four text files
> > containing tags, one per line (maybe with comments allowed),
> > with:
>  
> > * two files for well-formedness (OK/broken),
> > * two files for validity (OK/invalid).
>  
> > So anyone could use it as a test suite, by converting the
> > text files to Junit, PyUnit, HUnit, whatever.
> 
> Maybe it's possible to publish this in a 3066ter appendix,
> or better a separate document (informational RFC).  I often
> miss examples and test cases in other documents.

I agree, and here are the two files I currently use (my parser succeeds
for all these tags). I suggest that they be published in a public
place (is there a Web site for LTRU?) and added to a future RFC.

Corrections and additions are of course very welcome, as well as test
reports from other implementations, so we can be sure we all use the
same language :-)
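
To illustrate the file format discussed above (one tag per line, with an
optional "#" comment), here is a minimal Python sketch of a loader and a
PyUnit wrapper. The names read_tags and make_suite are hypothetical and
exist only for this example; the checker passed in is whatever
well-formedness or validity function an implementation provides.

```python
import unittest
from typing import Callable, Iterable, Iterator


def read_tags(lines: Iterable[str]) -> Iterator[str]:
    """Yield one tag per non-empty line, dropping '#' comments and whitespace."""
    for line in lines:
        tag = line.split("#", 1)[0].strip()
        if tag:
            yield tag


def make_suite(tags: Iterable[str], checker: Callable[[str], bool],
               expected: bool) -> unittest.TestSuite:
    """Build a unittest suite asserting checker(tag) == expected for each tag."""
    tag_list = list(tags)

    class TagFileTest(unittest.TestCase):
        def test_tags(self) -> None:
            for tag in tag_list:
                # subTest reports each failing tag separately.
                with self.subTest(tag=tag):
                    self.assertEqual(checker(tag), expected)

    return unittest.defaultTestLoader.loadTestsFromTestCase(TagFileTest)
```

With a real file, one would pass an open file object to read_tags and run
the resulting suite with unittest.TextTestRunner; the "OK" file would use
expected=True and the "broken" file expected=False.
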
fr 
fr-Latn
fr-fra # Extended tag
fr-Latn-FR
fr-Latn-419
fr-FR
ax-TZ # Not in the registry, but well-formed
fr-shadok # Variant
fr-y-myext-myext2
fra-Latn # ISO 639 codes can be three letters
fra
fra-FX
i-klingon # grandfathered with singleton
no-bok # grandfathered without singleton
fr-Lat # Extended
mn-Cyrl-MN
mN-cYrL-Mn
fr-Latn-CA
en-US
fr-Latn-CA
i-enochian # Grandfathered
x-fr-CH 
sr-Latn-CS
es-419
sl-nedis
de-CH-1996
de-Latg-1996
sl-IT-nedis
en-a-bbb-x-a-ccc
de-a-value
en-Latn-GB-boont-r-extended-sequence-x-private
en-x-US
az-Arab-x-AZE-derbend
es-Latn-CO-x-private
en-US-boont
ab-x-abc-x-abc # anything goes after x
ab-x-abc-a-a # ditto
i-default # grandfathered
i-klingon # grandfathered
f
f-Latn
fr-Latn-F
a-value
en-a-bbb-a-ccc     # 'a' appears twice 
tlh-a-b-foo
i-notexist          # grandfathered but not registered: invalid, even if we only test well-formedness
abcdefghi-012345678
ab-abc-abc-abc-abc
ab-abcd-abc
ab-ab-abc
ab-123-abc
a-Hant-ZH
a1-Hant-ZH
ab-abcde-abc
ab-1abc-abc
ab-ab-abcd
ab-123-abcd
ab-abcde-abcd
ab-1abc-abcd
ab-a-b
ab-a-x
ab--ab
ab-abc-
-ab-abc
ab-c-abc-r-toto-c-abc  # 'c' appears twice 
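
For anyone who wants to try the two lists above against a small reference,
here is a rough Python sketch of a well-formedness check. It is not the
parser mentioned in this message. It assumes the grammar as I read it in
RFC 4646 (up to three extlang subtags, optional script and region, then
variants, extensions, and private use), plus the rule that an extension
singleton must not repeat. The grandfathered set below is an assumption
too: it holds only the grandfathered tags that appear in this message,
whereas a real checker would load the full list from the registry.

```python
import re

# Assumed reading of the RFC 4646 "langtag" production.
_LANGTAG = re.compile(r"""
    (?: [a-z]{2,3} (?:-[a-z]{3}){0,3}    # language + up to three extlangs
      | [a-z]{4,8} )                     # reserved / registered language
    (?: -[a-z]{4} )?                     # script
    (?: -(?: [a-z]{2} | [0-9]{3} ) )?    # region
    (?: -(?: [a-z0-9]{5,8} | [0-9][a-z0-9]{3} ) )*   # variants
    (?: -[0-9a-wy-z] (?:-[a-z0-9]{2,8})+ )*          # extensions
    (?: -x (?:-[a-z0-9]{1,8})+ )?        # private use
""", re.VERBOSE | re.IGNORECASE | re.ASCII)

_PRIVATEUSE = re.compile(r"x(?:-[a-z0-9]{1,8})+", re.IGNORECASE | re.ASCII)

# Assumption: only the grandfathered tags quoted in this message.
_GRANDFATHERED = {"i-klingon", "i-enochian", "i-default", "no-bok"}


def is_well_formed(tag: str) -> bool:
    """Return True if tag passes this sketch's well-formedness rules."""
    lowered = tag.lower()
    if lowered in _GRANDFATHERED:
        return True
    if _PRIVATEUSE.fullmatch(tag):
        return True
    if not _LANGTAG.fullmatch(tag):
        return False
    # The grammar cannot express "no repeated singleton", so check it here.
    seen = set()
    for subtag in lowered.split("-")[1:]:
        if subtag == "x":
            break  # anything goes after the private-use singleton
        if len(subtag) == 1:
            if subtag in seen:
                return False
            seen.add(subtag)
    return True
```

Tags should be stripped of comments and surrounding whitespace before
being passed in. Note that i-notexist is rejected only because it is
absent from the grandfathered set, which matches the comment on that
line.
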
_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www1.ietf.org/mailman/listinfo/ltru

Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.