Date: Mon, 28 Oct 2002 08:24:08 -0500 (EST)
From: Aaron Straup Cope
To: Bill Kearney
Subject: Re: dc language in rss

On Fri, 25 Oct 2002, Bill Kearney wrote:

> That would indeed be a problem. You could actually mark up those sections, even
> down to the paragraphs or even words with span tags. I shudder at the thought
> of what most environments would DO with that data, but it's certainly possible.

If I were a better person, I(would(learn(lisp))) and write an Emacs minor-mode to do that. (Sadly(,(lisp(scares(me))))).

> Well, the problem is what does that element mean? What purpose is it being used
> for? I daresay outside of Syndic8's listing of feeds by language, not much is
> paying attention to it. So my question to you is what would you have a reader
> program DO with multiple languages?

The short answer is : I have no idea.

The longer answer is : Who cares?

There are two issues here :

The first falls into the Foofy Grand Unifying Principles category - the people who invented the Internet didn't know what it was going to be used for. Why should RSS, and its tool set, presume something as basic and often controversial as language?

The second falls into the Dueling Shakespeare category - RFC 1766 states that :

 "In some contexts, it is possible to have information in more than one language, or it might be possible to provide tools for assisting in the understanding of a language (like dictionaries).

 "A prerequisite for any such function is a means of labelling the information content with an identifier for the language in which it is written."

But in the absence of multiple language tags, the correct answer when prigs like me start pussing is :

<quote src = "rfc1766">
 The information in the subtag may for instance be:

 - Country identification, such as en-US (this usage is described in ISO 639)

 - Dialect or variant information, such as no-nynorsk or en-cockney

 - Languages not listed in ISO 639 that are not variants of any listed language, which can be registered with the i- prefix, such as i-cherokee

 - Script variations, such as az-arabic and az-cyrillic
</quote>

Which doesn't solve everyone's problem, but can be adapted to deal with the problem of Quebec. I chose en-quebecois, because I like the sound of it. Sovereigntists, on the other hand, will probably opt for 'en-qc' since it implies nationhood.

Then, of course, there is the question of how to deal with representing a weblog written by the province's allophone population (translation: persons whose mother tongue is neither English nor French and who, in my limited experience, often speak upward of 4-6 languages). What then? qc-allophone?
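
For anyone who wants to poke at the tags above, here is a minimal sketch in Python of the tag syntax that the quoted RFC 1766 passages rely on. It assumes the RFC's grammar, in which the primary tag and every subtag are one to eight ASCII letters. The names `split_language_tag` and `describe` are hypothetical, and nothing here checks whether a tag is actually registered or listed in ISO 639.

```python
import re

# Assumed grammar (RFC 1766): a primary tag followed by zero or more
# hyphen-separated subtags, each made of 1 to 8 ASCII letters.
_TAG_RE = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def split_language_tag(tag):
    """Return (primary, subtags) for a well-formed tag, or None otherwise."""
    if not _TAG_RE.fullmatch(tag):
        return None
    # Tags are compared case-insensitively, so normalise to lower case.
    primary, *subtags = tag.lower().split("-")
    return primary, subtags


def describe(tag):
    """Give a rough, purely syntactic description of a language tag."""
    parsed = split_language_tag(tag)
    if parsed is None:
        return "not well-formed"
    primary, subtags = parsed
    if primary == "i":
        return "i- prefixed (IANA registration) tag"
    if primary == "x":
        return "x- prefixed (private use) tag"
    if not subtags:
        return "bare language tag"
    if len(subtags[0]) == 2:
        return "language plus two-letter (country) subtag"
    return "language plus longer (dialect, script or variant) subtag"


if __name__ == "__main__":
    for candidate in ("en-US", "no-nynorsk", "i-cherokee", "az-cyrillic",
                      "en-quebecois", "en-qc", "qc-allophone"):
        print(candidate, "->", describe(candidate))
```

Under that grammar, en-US, no-nynorsk, i-cherokee and az-cyrillic are all well-formed. en-quebecois and qc-allophone are rejected purely on length, since both subtags run to nine letters, while en-qc passes the syntax check (which says nothing about whether anyone would accept it).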