[Ltru] Language tags and (localization) processes (Re: draft-davis-t-langtag-ext)

Felix Sasaki <felix.sasaki@fh-potsdam.de> Tue, 12 July 2011 07:23 UTC

Date: Tue, 12 Jul 2011 09:23:36 +0200
From: Felix Sasaki <felix.sasaki@fh-potsdam.de>
To: Mark Davis ☕ <mark@macchiato.com>
Cc: ietf-languages@iana.org, ltru@ietf.org
Subject: [Ltru] Language tags and (localization) processes (Re: draft-davis-t-langtag-ext)

The current draft states:

"Language tags, as defined by [BCP47], are useful for identifying the
language of content.  There are mechanisms for specifying variant
subtags for special purposes.  However, these variants are
insufficient for specifying text transformations, including content
that has been transliterated, transcribed, or translated."

I am requesting a clarification from the editors, which should include a
liaison with the Unicode ULI TC (http://uli.unicode.org/) and a clarification
in the draft.

Language tags so far have described *states*: an object is in a language, a
script, etc. The proposed extension extends language tags to describe the
outcome of a *process*: objects have been transformed, with a source object as
the basis for this process. According to the paragraph above, this
transformation also includes translation.
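
To make the state/process distinction concrete, here is a minimal Python
sketch that splits a tag into the part describing the state of the content and
the source named after the "t" singleton. The function name
`split_transform_tag` and the sample tags are illustrative assumptions in the
shape the draft proposes. This is not a full BCP 47 parser and it ignores
everything after the source language.

```python
from typing import Optional, Tuple


def split_transform_tag(tag: str) -> Tuple[str, Optional[str]]:
    """Return (state_part, source_part) for a language tag.

    Hypothetical helper for illustration only: it looks for the 't'
    singleton and treats the subtags that follow it as the source language.
    """
    subtags = tag.split("-")
    for i, sub in enumerate(subtags):
        if i == 0:
            continue  # the first subtag is always the primary language
        low = sub.lower()
        if low == "x":
            break  # everything after 'x' is private use, not an extension
        if low == "t":
            source = []
            for s in subtags[i + 1:]:
                if len(s) == 1:
                    break  # another singleton starts a different extension
                if len(s) == 2 and s[0].isalpha() and s[1].isdigit():
                    break  # letter+digit subtag: treated as a field key
                source.append(s)
            return "-".join(subtags[:i]), ("-".join(source) or None)
    return tag, None


if __name__ == "__main__":
    for example in ("ja-t-it", "und-Cyrl-t-und-latn", "en-US"):
        print(example, "->", split_transform_tag(example))
```

The first element of the result is what xml:lang has carried so far (the
state); the second element is the new, process-related information.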

So far, formats like TBX, XLIFF or others have been used for aligning source
and target content. These formats also use language tags, via xml:lang.
However, the transformation, i.e. the process information, is not expressed
via the language tag, but via XML structures (pairs of source and target
elements). The language tags are purely for identifying the state of an
object.
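
As a rough illustration of that division of labour, the following Python
sketch builds a simplified, XLIFF-style unit with the standard library. The
element names follow XLIFF conventions, but the namespace and most required
attributes are omitted, and the function name `build_unit` and the sample
strings are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_unit(unit_id: str, src_lang: str, src_text: str,
               tgt_lang: str, tgt_text: str) -> ET.Element:
    """Build a simplified source/target pair (hypothetical helper)."""
    unit = ET.Element("trans-unit", {"id": unit_id})
    # Each element carries a plain language tag describing its own state.
    source = ET.SubElement(unit, "source", {XML_LANG: src_lang})
    source.text = src_text
    target = ET.SubElement(unit, "target", {XML_LANG: tgt_lang})
    target.text = tgt_text
    return unit


if __name__ == "__main__":
    unit = build_unit("1", "it", "Buongiorno", "ja", "おはようございます")
    print(ET.tostring(unit, encoding="unicode"))
```

Here the fact that the Japanese text was derived from the Italian text is
carried by the pairing of the two elements, not by either language tag.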

To avoid confusion for users of the above and other process-related formats
about where to put language identification information and where to put
process-related information, I am asking you to:
1) Liaise with the ULI TC about the issue described above and find out what
issues they see here.
2) Document the outcome of this liaison on this list and in the draft.

There is no need for long explanations in the draft, but guidance on
the topic will be very helpful to avoid confusion.

As a side note, formats like TBX, XLIFF and others limit the role of the
language tag for good reasons: information related to processes like
translation can be very complex, e.g. expressing translation state, cycle,
or quality. So I have the general concern that language tags might be
overloaded with key-value pairs in areas that would require more complex
information and that potentially overlap with formats that provide that
information. Nevertheless, I won't object to moving this extension
forward if the concerns are explained properly in the draft.

Felix

2011/7/12 Mark Davis ☕ <mark@macchiato.com>

> We've posted a new version of
> http://tools.ietf.org/html/draft-davis-t-langtag-ext
>
> Diffs are here:
> http://tools.ietf.org/rfcdiff?url2=draft-davis-t-langtag-ext-02.txt
>
> The changes are:
>
> * Made it clear that application to the case of speech was included, added
> Peter C's example.
> * Fixed references, adding authors, removing unneeded reference.
> * Changed ABNF. Mostly just the table form, but also defined alphanum.
> * Made it clear that the CLDR committee must post proposals publicly.
> * Added more information on the XML structure, including the description
> attribute. (Note that the CLDR committee had decided to add the description
> attribute before this process began.)
> * Added fixes for typos noted by CEW.
>
> Please let us know of further feedback.
>
> Note to Doug: The CLDR committee had agreed to move the descriptions into
> the bcp47 files, such as
> http://unicode.org/repos/cldr/trunk/common/bcp47/calendar.xml. Yoshito has
> the action to do that, and was able to accelerate it. So please take a look
> if you have the time.
>
> Mark