<burden> :of "proof" .
In XML, the burden is on the person with the query to figure out how the elements and attributes in one XML file relate to the elements and attributes in another. Glue-code has to be programmed to mesh the data. With RDF, the burden is on the people with the data to ensure that their identifiers for things overlap with other data sources. The difficulty in RDF is more of a design decision, and design decisions are tough too.
—Joshua Tauberer, GovTrack.us, Public Data, and the
Semantic Web
This blog post is full of links.
#burden
All the News That's Fit to Describe
Or : How to Read the New York Times in 5 Minutes or Less
The daily dumps plotting the relationships, and
geographies, of New York Times articles have returned.
Version 2.0 does not contain any scrumjax but consolidates a
number of different views that used to live at different
URLs into one place.
None of this is rocket science and, frankly, I'm stunned
that the Times hasn't already done something like this
themselves. I wouldn't suggest that they replace their
homepage but I find it incredibly useful to see what's going on
and get a feel for the pulse of The Man.
The dumps are rendered, primarily, as RDF for this simple
reason : (rdf * rdf) = rdf. That said the RDF is also
transformed, and presented on the web, as XHTML with
microformats inside. Translation : I think you're so very
wrong but that doesn't mean I don't love you anyway.
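For the record, here is a rough sketch of what (rdf * rdf) = rdf means in practice. It is not the code behind the dumps. It assumes triples can be modelled as plain tuples, it ignores blank nodes (which need renaming in a real merge), and every identifier in it is made up for illustration.

```python
from typing import Set, Tuple

# A triple is (subject, predicate, object); a graph is a set of triples.
Triple = Tuple[str, str, str]
Graph = Set[Triple]


def merge(a: Graph, b: Graph) -> Graph:
    """Merging two graphs is the union of their triples, and is itself a graph."""
    return a | b


def describe(graph: Graph, subject: str):
    """Everything the graph says about one subject, sorted for readability."""
    return sorted((p, o) for s, p, o in graph if s == subject)


# Hypothetical data: one source knows about subjects, another about places.
articles: Graph = {
    ("ex:article-1", "dc:title", "Some Headline"),
    ("ex:article-1", "dc:subject", "ex:topic-politics"),
}
places: Graph = {
    ("ex:article-1", "ex:mentions", "ex:place-albany"),
    ("ex:place-albany", "ex:label", "Albany"),
}

merged = merge(articles, places)

# Because both sources use the same identifier for the article,
# the merged graph answers questions neither source could alone.
for predicate, obj in describe(merged, "ex:article-1"):
    print(predicate, obj)
```

The only thing that makes the merge useful is that both sources agreed on "ex:article-1", which is the burden the quote at the top is talking about.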
Daily indexes are archived in a simple YYYY/MM/DD series of
subdirectories. Older versions, currently located at
nytimes/knows/related/, will stick around or be tickled with
mod_rewrite-fu. The other stuff, notably the Google knows
page, will probably be deleted because it is kind of stupid
and doesn't really serve any purpose other than to fill up
disk space. The best part about the Google pages, for
instance, was having them show up as the first or second
query result for the corresponding phrase.
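The YYYY/MM/DD layout is easy to compute from a date. A minimal sketch, assuming a hypothetical base directory (the real location of the new indexes is not given here):

```python
from datetime import date
from pathlib import PurePosixPath

# Hypothetical base directory, used only for illustration.
BASE = PurePosixPath("nytimes/indexes")


def index_dir(day: date, base: PurePosixPath = BASE) -> PurePosixPath:
    """Return the YYYY/MM/DD subdirectory for a given day's index."""
    return base / day.strftime("%Y/%m/%d")


print(index_dir(date(2006, 1, 27)))
```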
There are a couple other ideas floating around, time and
resources permitting. In the meantime, if someone wants to
deal with writing an XSLT stylesheet to generate JSON from
either the RDF or the XHTML, before I do, that would be
grand. Personally I think that the time would be better
spent writing and lobbying browser developers for a
SafeXMLHttpResponse JavaScript method but
that's a discussion best saved for another day and working
code always wins.
Finally, I hope that there is someone at the Times, and other news organizations (insert obligatory weblogging as journalism meme here), who understands what they've got and its value both to themselves and the intarweb in general.
#nyt
Upcoming : A long and twisty rant about XPath in Python
pyupcoming is a simple Python interface to the Upcoming.org REST API.
It does not auto-create methods for the API or try to render the data returned by Upcoming into any kind of Pythonic model. It does allow you to query the results of an API call using XPath. Sort of.
Sort of, in the sense that you can query stuff using XPath but you'll get back an elementtree object rather than a proper XML-ish object with its own DOM functionality.
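To show what "sort of" looks like, here is a small sketch using the standard library's ElementTree directly. It is not pyupcoming's actual interface, and the XML below is a made-up response rather than a verbatim Upcoming.org payload. The point is only that an XPath-ish query hands back Element objects, not DOM nodes.

```python
import xml.etree.ElementTree as ET

# Made-up response for illustration; element and attribute names are assumptions.
SAMPLE = """<rsp stat="ok">
  <event id="1" name="Launch party" metro_id="10"/>
  <event id="2" name="Reading group" metro_id="12"/>
  <event id="3" name="After party" metro_id="10"/>
</rsp>"""

root = ET.fromstring(SAMPLE)

# ElementTree understands a limited XPath subset, including attribute predicates.
in_metro_10 = root.findall("./event[@metro_id='10']")

# Each result is an ElementTree Element, so values come from .get() and .text
# rather than DOM methods like getAttribute().
names = [event.get("name") for event in in_metro_10]
print(names)
```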
I could have used libxml but that introduces a whole other world of dependencies. I wanted a library that could easily run on a variety of platforms with Python support. I started out with another more Pythonic interface to the Upcoming API and then spent most of a morning trying to shoehorn in features that I needed before getting lost in a twisty maze of __getattr__ functions and giving up.
So, this is the 80. Patches for the remaining 20 are welcome.
#pyupcoming
Subject: lightbox.js “pass through” URL patch
Date: 2006-01-27 08:31
To: Lokesh Dhakar
Hi,
I finally got around to playing with lightbox the other day. I have little thumbnails, on my weblog, of recently taken and recently favourited photos that are scrumjaxed from Flickr.
The one problem with lightbox is (was) that it only allows you to link to the large image. I wanted (partly so as not to violate the Flickr TOS :-) to be able to click on the large image and have the browser go to the actual photo page on Flickr.
So I wrote a patch. Instead of:
<a href="large.jpg" rel="lightbox">
  <img src="small.jpg" />
</a>
You write:
<a href="http://www.flickr.com" alt="large.jpg" rel="lightbox">
  <img src="small.jpg" />
</a>
This has the advantage of making href URLs work both when lightbox is enabled and when JavaScript is disabled or otherwise borked.
I am not wed to using the alt attribute. It just seemed the most appropriate
given the available options.
The patch should be backwards compatible and Just Work (tm) for people using the first style.
I neglected to make a .dist copy of the source before mucking with it and
it appears that you've made additional changes since then. Based on the diff
output it seems like you have just removed unrelated code.
Anyway, it works for me so I thought you might be interested in the patch.
#patchbox