Re: [Ltru] Re: Inversions and other problems
I generated a file at http://macchiato.com/ltru/ltru-inversions.txt. It contains a mapping from all of the descriptions (sans parentheticals) to their uninverted forms. It also *adds* cases that are "possibly missing", marked with @MISSING. The file, as usual, is tab delimited, so it is best pulled into a program that lines the fields up into columns, like Excel or equivalent.
The "possibly missing" inversions are produced by collecting all of the prefixes and suffixes of existing inversions, and seeing if any of the uninverted strings matches such an existing prefix or suffix. For example, if there is some preexisting inversion such as "Armenian, X" in the 01 data file, and the program sees "Y Armenian", then it includes "Armenian, Y @MISSING" for review in this file.
There are 251 of these items. So that we have consistent inversion, these should be reviewed before we finalize the descriptions. For example, in the list we see:
103  Armenian, Classical   Classical Armenian
104  Armenian, Eastern     Eastern Armenian     @MISSING
105  Armenian, Middle      Middle Armenian
106  Armenian, Western     Western Armenian     @MISSING
#104 and #106 are missing, meaning that we don't invert those (although, IMO, clearly we should if we are doing inversion).
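To make the "possibly missing" check concrete, here is a minimal Python sketch of the idea. It is not the program that generated the file: the function name find_missing_inversions and the sample list are hypothetical, and it only handles the case from the example above, where the qualifier comes before a head that is already inverted elsewhere.

```python
def find_missing_inversions(descriptions):
    """Suggest inversions for names whose head is already inverted elsewhere.

    Assumption: an inverted description has the form "Head, Qualifier".
    """
    # Heads of existing inversions, e.g. "Armenian, Classical" -> "Armenian".
    heads = {d.split(", ", 1)[0] for d in descriptions if ", " in d}

    suggestions = []
    for d in descriptions:
        if ", " in d:
            continue  # already inverted, nothing to suggest
        for head in heads:
            suffix = " " + head
            if d.endswith(suffix):
                # "Eastern Armenian" -> qualifier "Eastern"
                qualifier = d[: -len(suffix)]
                suggestions.append(f"{head}, {qualifier} @MISSING")
    return sorted(suggestions)


# Hypothetical sample data, modeled on the Armenian entries above.
sample = [
    "Armenian, Classical",
    "Armenian, Middle",
    "Eastern Armenian",
    "Western Armenian",
    "French",
]

for line in find_missing_inversions(sample):
    print(line)
```

On this sample it flags "Eastern Armenian" and "Western Armenian" and leaves "French" alone, since no existing inversion has "French" as its head.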
The whole list could probably stand review -- it exposes some clearly spurious differences, e.g. "Alaska" vs "Alaskan" in the following:
31   Alaska Inupiatun, Northwest   Northwest Alaska Inupiatun   @MISSING
32   Alaskan Inupiatun, North      North Alaskan Inupiatun      @MISSING
Mark
_______________________________________________
Ltru mailing list
Ltru at ietf.org
https://www1.ietf.org/mailman/listinfo/ltru
Note Well: Messages sent to this mailing list are the opinions
of the senders and do not imply endorsement by the IETF.