
Monday, October 28 2002

Subject: Re: dc language in rss




Date: Mon, 28 Oct 2002 08:24:08 -0500 (EST)



From: Aaron Straup Cope



To: Bill Kearney










On Fri, 25 Oct 2002, Bill Kearney wrote:







> That would indeed be a problem.  You could actually mark up those sections, even



> down to the paragraphs or even words with span tags.  I shudder at the thought



> of what most environments would DO with that data, but it's certainly possible.







If I were a better person, I(would(learn(lisp))) and write an Emacs



minor-mode to do that. (Sadly(,(lisp(scares(me))))).



> Well, the problem is what does that element mean?  What purpose is it being used



> for?  I daresay outside of Syndic8's listing of feeds by language, not much is



> paying attention to it.  So my question to you is what would you have a reader



> program DO with multiple languages?







The short answer is : I have no idea.







The longer answer is : Who cares?







There are two issues here :







The first falls into the Foofy Grand Unifying Principles category - the



people who invented the Internet didn't know what it was going to be used



for. Why should RSS, and its tool set, presume something as basic and



often controversial as language?







The second falls into the Dueling Shakespeare category - RFC 1766 states



that :







"In some contexts, it is possible to have information in more than one



language, or it might be possible to provide tools for assisting in the



understanding of a language (like dictionaries).







"A prerequisite for any such function is a means of labelling the



information content with an identifier for the language in which is is



written."







But in the absence of multiple language tags, the correct answer when



prigs like me start fussing is :







<quote src = "rfc1766">



The information in the subtag may for instance be:







    -    Country identification, such as en-US (this usage is



         described in ISO 639)







    -    Dialect or variant information, such as no-nynorsk or en-



         cockney







    -    Languages not listed in ISO 639 that are not variants of



         any listed language, which can be registered with the i-



         prefix, such as i-cherokee







    -    Script variations, such as az-arabic and az-cyrillic



</quote>











Which doesn't solve everyone's problem, but can be adapted to deal with



the problem of Quebec. I chose en-quebecois, because I like the sound of



it. Sovereigntists, on the other hand, will probably opt for 'en-qc' since



it implies nationhood.







Then, of course, there is the question of how to deal with representing a



weblog written by the province's allophone population (translation:



persons whose mother tongue is neither English nor French and who, in my



limited experience, often speak upward of 4-6 languages). What then?



qc-allophone?
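
For anyone who wants to see what a reader program might do with the tag shapes quoted from RFC 1766 above, here is a minimal sketch in Python. It assumes the RFC 1766 grammar as I read it (a primary tag followed by zero or more subtags, each 1 to 8 ASCII letters, compared case-insensitively). The function name parse_language_tag is made up for illustration and is not part of any RSS or Dublin Core tool mentioned here.

```python
import re

# Assumed RFC 1766 grammar: Language-Tag = Primary-tag *( "-" Subtag ),
# where Primary-tag and Subtag are each 1 to 8 ASCII letters.
_TAG_RE = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def parse_language_tag(tag):
    """Hypothetical helper: split a tag into (primary, subtags).

    Returns None when the tag does not fit the assumed grammar.
    Tags are lowercased because they are treated as case-insensitive.
    """
    if not _TAG_RE.fullmatch(tag):
        return None
    primary, *subtags = tag.lower().split("-")
    return primary, subtags


if __name__ == "__main__":
    for candidate in ("en-US", "no-nynorsk", "i-cherokee", "az-cyrillic",
                      "en-qc", "en-quebecois", "qc-allophone"):
        print(candidate, "->", parse_language_tag(candidate))
```

Under that reading of the grammar, en-qc parses cleanly, while en-quebecois and qc-allophone are rejected because "quebecois" and "allophone" are nine letters each. The sketch only checks syntax; it does not look anything up in ISO 639 or the IANA registry.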




Me : WebService::weblogUpdates.pm 0.34

Added support for the rssUpdate method (XML-RPC only, so far) and a bunch of wonkish niggling in the black box. It will take a while for the CPAN listings to update so, until then, you can grab a copy over here. see also : docs.


Simon Willison : "I've put together an XML-RPC proxy for the [W3C Validator]."


Me : WebService::W3C::HTMLValidator.pm 0.1

see also : docs


Le Québec en images

via afroginthevally


Mina Naguib : Weather::Underground.pm

Since everyone seems to think I actually care what they're listening to when they post to their weblog, I think I might have to start telling them what the weather's like when I post to mine. Clear and one degree Celsius, in Montreal.


 
 