
Re: [Ltru] xml:lang syntax



Jukka K. Korpela 2008-05-09 07.51:
> Doug Ewell wrote:
>
> > A language tag denotes one language.
>
> For suitable values for "one" and "language", yes.
>
> > A protocol may define ways to
> > combine multiple language tags into one field, as HTTP has done with
> > Accept-Language by separating the tags with semicolons.  But that is
> > defined by and specific to the protocol.
>
> In the XML context, which is what the discussion is about, there is 
> normally no need to consider the possibility of referring to several 
> languages in one xml:lang attribute.

But I may still use 'mul'. So it is already considered, in a way.

>  The reason is that for 
> mixed-language content, XML lets (and even encourages, so to say) you to 
> use nested markup so that you can say exactly, for example, which parts 
> of the text are in English and which are in French. This may require 
> additional markup elements for the sole purpose of attaching xml:lang to 
> some piece of text, but that's nothing odd. For a truly bilingual 
> element (with almost equal amounts of text in two languages), you just 
> need to select one language for it and override the language information 
> in nested elements.
>   

I would prefer to write

    <p xml:lang="mul">
        <span xml:lang="fi">22 Pistepirko.</span> 
        <span xml:lang="nn">22  Marihønor.</span>
    </p>

so that I can refer to *the paragraph* via CSS as p:lang(mul){}, rather 
than have to wrap the paragraph in a div element, with the div and the 
paragraph each carrying a different language tag,

    <div xml:lang="fi">
    <p xml:lang="nn">
        <span xml:lang="fi">22 Pistepirko.</span>
        <span xml:lang="nn">22  Marihønor.</span>
    </p>
    </div>

which I would then have to select via div:lang(fi) p:lang(nn){}.

Then, if in addition I could extend the 'mul' tag, in a well-formed way, 
to tell which languages it covers, that would be great.

> However, XML has no way to specify different languages for different 
> _attributes_ of an element. If an element's attribute is in a language 
> other than that of its content, you can use the workaround of an inner 
> element, just for specifying the language of content, but this won't 
> work for attributes in different languages. The problem is real because 
> nowadays people often put textual data in attributes, as opposed to 
> (what I see as) the original idea in generalized markup where attributes 
> seldom contain text in a human language.
>   

This point about attributes is a very good point.
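
A minimal sketch of the problem, assuming I have understood it correctly 
(the 'title' and 'subtitle' attributes are hypothetical names, used only 
for illustration):

    <!-- Workaround: xml:lang="fi" on p covers the title attribute,
         and the inner span re-tags the content as nn. -->
    <p xml:lang="fi" title="22 Pistepirko">
        <span xml:lang="nn">22 Marihønor.</span>
    </p>

    <!-- No workaround: xml:lang="fi" covers both attributes, so the
         nn subtitle cannot be tagged separately. -->
    <p xml:lang="fi" title="22 Pistepirko" subtitle="22 Marihønor">
        ...
    </p>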

The alternative to extending 'mul' is to use several tags, in a legal way. 
But as we have already seen, we must then specify what that list of 
language tags means: Are they there for the audience or for the content?
-- 
leif halvard silli