Like many open data developers, I’m sick of scraping. Writing yet another script to extract data from thousands of pages of HTML is exhausting, made worse by the sneaking sense that I’m enabling the continuation of terrible information-sharing practices by government. Luckily, it’s becoming more common for government websites to create a sort of accidental API: populating web pages with JSON retrieved asynchronously. Because these are simply APIs, albeit without documentation, they are a far better way of obtaining data than scraping.
There is no standard term to describe this. I’ve been using the phrase “accidental API,” but that’s wrong, because it implies a lack of intent that can’t be inferred. (Perhaps the developer intended to create an API?)
Recently, I solicited suggestions for a better name for these. Here are some of my favorites:
— Bill Hunt (@krues8dr) June 6, 2015
@waldojaquith api through obscurity
— Josh Duff (@TehShrike) June 6, 2015
@waldojaquith TroPI, for Trojan API. Recall that Ajax fought the Trojans.
— V David Zvenyach (@vdavez) June 6, 2015
@waldojaquith UPI: undocumented programming interface (like a UFO)
— Tony Becker (@fortpedro) June 7, 2015
— Philip Shemella (@philshem) June 6, 2015
— Brian ಠ_ರೃ Geiger (@thefoodgeek) June 6, 2015
Undocumented API. Silent API. Public-private API.
— Waldo Jaquith (@waldojaquith) June 6, 2015
The best ones are immediately understandable and don’t ascribe intent to the developer. I suspect I’m going to find myself using Bill Hunt’s “incidental API” and my (and Tony Becker’s) “undocumented API.” I particularly like “undocumented API” because it assumes competency on the part of the developer, and assumes that the only shortcoming of the API is its documentation. I’ll try out a few of them in the coming weeks and see what sticks.
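For anybody who hasn’t used one of these, here is a minimal sketch of what consuming an undocumented API can look like, as opposed to parsing HTML. Everything specific in it is an assumption for illustration: the URL is a placeholder (in practice you would find the real one by watching the browser’s network requests while the page loads), and the `records` and `name` fields are a hypothetical response shape, not any particular government site’s format.

```python
import json
from urllib.request import Request, urlopen

# Placeholder endpoint, not a real service. The real URL is whatever the
# page itself requests in the background.
EXAMPLE_URL = "https://example.gov/api/records?page=1"


def fetch_json(url, opener=urlopen):
    """Request the same JSON that the web page requests for itself."""
    request = Request(url, headers={"Accept": "application/json"})
    with opener(request) as response:
        return json.loads(response.read().decode("utf-8"))


def record_names(payload):
    """Extract names from an assumed {"records": [{"name": ...}]} shape."""
    return [record["name"] for record in payload.get("records", [])]


# Usage, once EXAMPLE_URL points at a real endpoint:
#   names = record_names(fetch_json(EXAMPLE_URL))
```

The `opener` parameter exists only so the function can be exercised without touching the network. Since nothing about such an API is documented, the response shape can change without notice, so it is worth checking for missing fields rather than trusting them.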