<burden> :of "proof" .

In XML, the burden is on the person with the query to figure out how the elements and attributes in one XML file relate to the elements and attributes in another. Glue-code has to be programmed to mesh the data. With RDF, the burden is on the people with the data to ensure that their identifiers for things overlap with other data sources. The difficulty in RDF is more of a design decision, and design decisions are tough too.

—Joshua Tauberer, GovTrack.us, Public Data, and the Semantic Web

All the News That's Fit to Describe

Or : How to Read the New York Times in 5 Minutes or Less

The daily dumps plotting the relationships, and geographies, of New York Times articles have returned. Version 2.0 does not contain any scrumjax but consolidates a number of different views that used to live at different URLs into one place.

None of this is rocket science and, frankly, I'm stunned that the Times hasn't already done something like this themselves. I wouldn't suggest that they replace their homepage but I find it incredibly useful to see what's going on and get a feel for the pulse of The Man.

The dumps are rendered, primarily, as RDF for this simple reason : (rdf * rdf) = rdf. That said, the RDF is also transformed, and presented on the web, as XHTML with microformats inside. Translation : I think you're so very wrong but that doesn't mean I don't love you anyway.
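The (rdf * rdf) = rdf shorthand can be sketched in a few lines of Python, assuming a graph is modelled as a plain set of (subject, predicate, object) tuples. The URI, predicate names and values below are invented for illustration and are not taken from the actual dumps; a real RDF library would do a good deal more.

```python
# A graph is modelled as a set of (subject, predicate, object) triples.
# Every identifier and value here is a hypothetical example.
ARTICLE = "http://example.com/articles/1"

graph_a = {
    (ARTICLE, "dc:title", "An Example Headline"),
    (ARTICLE, "dc:subject", "Politics"),
}

graph_b = {
    (ARTICLE, "dc:subject", "Politics"),  # same statement as in graph_a
    (ARTICLE, "geo:location", "New York"),
}


def merge(*graphs):
    """Merge graphs by set union; the result is itself a graph."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return every (predicate, object) pair known about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


merged = merge(graph_a, graph_b)
for predicate, obj in describe(merged, ARTICLE):
    print(predicate, obj)
```

Because both graphs use the same identifier for the article, the merge needs no glue code: duplicate statements collapse and the new ones simply accumulate.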

Daily indexes are archived in a simple YYYY/MM/DD series of subdirectories. Older versions currently located at nytimes/knows/related/ will stick around, or be tickled with mod_rewrite-fu. The other stuff, notably the Google knows page, will probably be deleted because it is kind of stupid and doesn't really serve any purpose other than to fill up disk space. The best part about the Google pages, for instance, was having them show up as the first or second query result for the corresponding phrase.
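For anyone scripting against the archive, the YYYY/MM/DD layout is easy to compute. This is a minimal sketch; the `dumps` root directory is a placeholder assumption, not the site's actual path.

```python
from datetime import date
import posixpath


def index_path(day, root="dumps"):
    """Return root/YYYY/MM/DD for a date; 'dumps' is a placeholder root."""
    return posixpath.join(root, day.strftime("%Y/%m/%d"))


print(index_path(date(2006, 1, 27)))
```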

There are a couple other ideas floating around, time and resources permitting. In the meantime, if someone wants to deal with writing an XSLT stylesheet to generate JSON from either the RDF or the XHTML, before I do, that would be grand. Personally I think that the time would be better spent writing and lobbying browser developers for a SafeXMLHttpResponse JavaScript method but that's a discussion best saved for another day and working code always wins.

Finally, I hope that there is someone at the Times, and other news organizations (insert obligatory weblogging as journalism meme here), who understands what they've got and its value both to themselves and the intarweb in general.

Upcoming : A long and twisty rant about XPath in Python

pyupcoming is a simple Python interface to the Upcoming.org REST API.

It does not auto-create methods for the API or try to render the data returned by Upcoming into any kind of Pythonic model. It does allow you to query the results of an API call using XPath. Sort of.

Sort of, in the sense that you can query stuff using XPath but you'll get back an elementtree object rather than a proper XML-ish object with its own DOM functionality.
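To show what that looks like in practice, here is a rough sketch using the standard library's ElementTree directly. It is not pyupcoming's own interface, and the XML below is a made-up response rather than the real Upcoming.org format.

```python
import xml.etree.ElementTree as ET

# Hypothetical API response; element and attribute names are assumptions.
SAMPLE = """<rsp stat="ok">
  <event id="1" name="Book launch" metro_id="10" />
  <event id="2" name="Film night" metro_id="20" />
  <event id="3" name="Open studio" metro_id="10" />
</rsp>"""

root = ET.fromstring(SAMPLE)

# findall() understands a limited XPath subset, including
# attribute predicates like the one used here.
matches = root.findall("./event[@metro_id='10']")

for event in matches:
    # Each match is an ElementTree Element, not a DOM node.
    print(event.get("id"), event.get("name"))
```

The results are plain Element objects, so anything further (walking to a parent, serialising a fragment) has to go through the ElementTree API rather than DOM methods.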

I could have used libxml but that introduces a whole other world of dependencies. I wanted a library that could easily run on a variety of platforms with Python support. I started out with another more Pythonic interface to the Upcoming API and then spent most of a morning trying to shoehorn in features that I needed before getting lost in a twisty maze of __getattr__ functions and giving up.

So, this is the 80. Patches for the remaining 20 are welcome.

From: Aaron Straup Cope
Date: 2006-01-27 08:31
To: Lokesh Dhakar

Hi,

I finally got around to playing with lightbox the other day. I have little thumbnails, on my weblog, of recently taken and recently favourited photos that are scrumjaxed from Flickr.

The one problem with lightbox is (was) that it only allows you to link to the large image. I wanted (partly so as not to violate the Flickr TOS :-) to be able to click on the large image and have the browser go to the actual photo page on Flickr.

So I wrote a patch. Instead of:


 <a href="large.jpg" rel="lightbox">
  <img src="small.jpg" />
 </a>

You write:


 <a href="http://www.flickr.com" alt="large.jpg" rel="lightbox">
  <img src="small.jpg" />
 </a>

This has the advantage of making href URLs work both when lightbox is enabled and when JavaScript is disabled or otherwise borked.

I am not wed to using the alt attribute. It just seemed the most appropriate given the available options.

The patch should be backwards compatible and Just Work (tm) with people using the first style.

I neglected to make a .dist copy of the source before mucking on it and it appears that you've made additional changes since then. Based on the diff output it seems like you have just removed unrelated code.

Anyway, it works for me so I thought you might be interested in the patch.

this is aaronland

Subject: lightbox.js “pass through” URL patch