this is aaronland

ouroboros's coattails

the purpose of open data is to save us from ourselves

One day I am going to write about the problem of museum website footers...

Sometime in the last couple of weeks the digital team at the Cooper Hewitt disabled the "random object" page on its collection website as well as the corresponding cooperhewitt.objects.getRandomObject API method. This follows the disabling of the search by colour webpages last year, even though they were consistently the most popular way to browse the collection.

While this is disappointing to me personally, since I implemented these features and many of us thought they were both a fun and genuinely useful way to introduce visitors to the collection, these decisions are at the discretion of the museum and the current digital team. They are free to do whatever they want.

What is genuinely disappointing to me, though, is the fact that the API methods were disabled without any notice or warning. We have grown accustomed, in the last decade, to the practice of companies and platforms changing or disabling their services with little or no regard for the external users of those services. One only has to look at the flip-flopping around who can use which parts of the Twitter API, and under what conditions, since it was first launched in 2006.

To the extent that the cultural heritage sector, and specifically museums, have a long history of riding the coattails of the private sector when it comes to technology decisions, the fact that we're starting to see similar behaviour when it comes to our own APIs should be no surprise.

In fairness, the terms of service for the Cooper Hewitt API state:

This Application Programming Service (API) is provided as-is and you agree to play nicely and not abuse abuse it and that we, the API service provider, reserve the right to change things at our discretion with or without notice. This includes disabling access to the API if your applications starts misbehaving.

Basically: Play nicely and don't be creepy.

I wrote that text sometime in 2012 or 2013 and it really should have been revised, when the museum reopened in 2014, to offer a little more clarity and guarantees around the service. Live and learn. In 2019, speaking at the Museums and the Web conference about the work SFO Museum is doing on the Mills Field website I said:

We have changed the order of things to publish the open data representation first and then, from there, to build our own websites and services on top of that.

Everything I've described so far has been built using the same raw materials that we've made available for you to do something with. This introduces a non-zero cost in the build process for the public-facing museum efforts but we believe it's worth the cost.

But why, right?

First of all we want other people to build new interfaces and new services, new "experiences" even, on top of our collection so this is a way to keep ourselves honest. If we can't build something with this stuff why should we imagine you will?

Second, we want to ensure that the data we release and the manner in which it is published, is actually robust and flexible enough to engender a variety of interfaces and uses because we need that variety. It is important to the museum because I don't believe there is, or should be, only one master narrative in to the collection.

There is a long and celebrated tradition in the museum sector of digital teams scraping their own websites to provide unofficial API access to their own collections. While this kind of can-do spirit is laudable it is mostly, I think, a damning indictment of the sector as a whole. This sort of thing shouldn't be necessary and the fact that it is shows either that we've learned nothing since the idea of public service-level API methods became commonplace or, worse, that we were never serious about open and accessible APIs in the first place.

The immediate consequence of the Cooper Hewitt disabling the cooperhewitt.objects.getRandomObject API method is that it breaks the wunderkammer application that I've been working on since June of this year. As consequences go that's pretty low. I am probably the only person using the application. If the wunderkammer application were distributed through the app store with users, maybe even lots of users, the consequences would be pretty high.

Micah Walter's Cooper Hewitt Random Button Pinterest board

Best practice dictates that you should always write your programs defensively when using third-party APIs, assuming for one reason or another that they will eventually fail. Calling the cooperhewitt.objects.getRandomObject or any other API method should be considered a brittle operation and ideally an application will trap those failures accordingly. That's very different from suddenly introducing a deliberate and consistent failure into one or more applications which may have lengthy or burdensome update procedures and that require user action (downloading a new application) to remedy the situation.
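To make "trap those failures" a little more concrete, here is a minimal sketch in Python of what defensive calling might look like. It is not how the wunderkammer application does things. The endpoint URL, the function names and the shape of the response (a "stat" field and an "object" field) are all assumptions made for the sake of the example.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Optional

# Hypothetical endpoint, used only for illustration.
EXAMPLE_ENDPOINT = "https://api.example.org/rest/?method=objects.getRandomObject"


def fetch_remote(url: str = EXAMPLE_ENDPOINT, timeout: float = 5.0) -> dict:
    """Fetch and decode a JSON response. May raise OSError or ValueError."""
    with urllib.request.urlopen(url, timeout=timeout) as rsp:
        return json.load(rsp)


def get_random_object(
    remote: Callable[[], dict],
    fallback: Callable[[], Optional[dict]],
) -> Optional[dict]:
    """Try the remote API first and use the fallback on any failure.

    The response shape ({"stat": "ok", "object": {...}}) is an assumption.
    """
    try:
        rsp = remote()
        if isinstance(rsp, dict) and rsp.get("stat") == "ok" and "object" in rsp:
            return rsp["object"]
    except (OSError, ValueError):
        # URLError, HTTPError and timeouts are OSErrors; malformed JSON is
        # a ValueError. Either way the remote call is treated as brittle.
        pass
    # The method was disabled, errored out or returned something unexpected.
    return fallback()
```

The remote call and the fallback are both passed in as plain functions so that the failure handling can be exercised without a network connection, for example `get_random_object(fetch_remote, lambda: None)`. None of this helps, of course, if there is no fallback to fall back to.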

It's just rude. There might be a valuable debate to be had about whether or not rudeness is actually a core value in the cultural heritage sector but, at least publicly, we say it's not and I believe that same commitment should extend to the things we do digitally.

My argument here is not that I have been adversely impacted, in any serious way, by the Cooper Hewitt's decision to remove a single API method. As I mentioned, scraping museum websites and building bespoke databases is not very complicated. I had to do that recently, compiling all the image URLs for objects published in The Metropolitan Museum of Art's Open Access Initiative dataset. I think I could do the same for the Cooper Hewitt collection in an afternoon. My argument is that I shouldn't have to, especially not after the fact.

Which brings me to the point of writing this blog post in the first place: I don't have to.

Much, but not all, of the Cooper Hewitt collection is published as part of the Smithsonian's own Open Access metadata initiative. Since the wunderkammer application already supports local databases of Smithsonian objects I can simply remove all the custom code to deal with the Cooper Hewitt API and continue browsing random objects like I would any other Smithsonian collection.
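As a rough illustration of why a local database makes the random object feature so easy to keep, here is a minimal sketch in Python using SQLite. The table name and columns ("objects", "id", "title") are hypothetical and are not the schema the wunderkammer application actually uses.

```python
import sqlite3
from typing import Optional


def random_local_object(conn: sqlite3.Connection) -> Optional[dict]:
    """Return one randomly chosen row from a hypothetical "objects" table."""
    row = conn.execute(
        "SELECT id, title FROM objects ORDER BY RANDOM() LIMIT 1"
    ).fetchone()
    if row is None:
        # The table is empty so there is nothing to show.
        return None
    return {"id": row[0], "title": row[1]}
```

ORDER BY RANDOM() scans the whole table, which is fine for a collection-sized database on a single device. The point is that nothing in this code depends on a remote service still being there tomorrow.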

That's the good news. I also can't help but feel like, at the same time, this is really the bad news.

Whatever the theory says about the value of cultural heritage institutions publishing open data, the practice appears to be one where the immediate benefit is shielding people from the constant indecision, churn and lack of long-term commitment in the cultural heritage sector itself.

We have changed the order of things to publish the open data representation first and then, from there, to build our own websites and services on top of that.

We can and we should do better than having to build in safeguards to protect the things we do as a sector, for our own institutions and the public that we claim, and some of us are mandated, to serve, from ourselves. There should be no celebration in that, only disappointment.