“Accidental APIs”: Naming a design pattern.

Like many open data developers, I’m sick of scraping. Writing yet another script to extract data from thousands of pages of HTML is exhausting, made worse by the sneaking sense that I’m enabling the continuation of terrible information-sharing practices by government. Luckily, it’s becoming more common for government websites to create a sort of an …

Dynamic electrical pricing demands dynamic price data.

The power industry has begun its long-anticipated shift towards demand-based pricing of electricity. Dominion Power, my electric company here in Virginia, has two basic rates: winter and summer. Although the math is a bit complicated, electricity costs about 50% more in the summer than in the winter, averaging 12¢ per kilowatt-hour. (One can also pay for sustainably …

Opening up Virginia corporate data.

In Virginia, you can’t just get a list of all of the registered corporations. That’s not a thing. If you dig for a while on the State Corporation Commission’s website, you’ll find their “Business Entity Search,” where you can search for a business by name. But if you want to get a list of all …

A Virginia campaign finance API.

Last year, I wrote here that I was working on an open-source campaign finance parser for Virginia State Board of Elections data. Thanks to the good work of the folks at the SBE, who are making enormous advances in opening up their data, I’ve been able to make some great progress on this recently. That …

New site, new datasets.

Since creating Richmond Sunlight and Virginia Decoded, I’ve been building up a public trove of datasets about Virginia government: legislative video, the court system’s definitions of legal terms, court rulings, all registered dangerous dogs, etc. But they’re all scattered about on different websites. A couple of years ago, I slapped together a quick site to …

$500 speech transcription bounty claimed.

It took just 27 hours for the $500 speech transcription bounty to be claimed. Aaron Williamson produced youtube-transcription, a Python-based pair of scripts that upload video to YouTube and download the resulting machine-generated transcripts of speech. It took me longer to find the time to test it out than it did for Aaron to write …

$500 bounty for a speech transcription program.

The world needs an API to automatically generate transcript captions for videos. I am offering a $500 bounty for a program that does this via YouTube’s built-in machine transcription functionality. It should work in approximately this manner: Accepts a manifest that lists one or more video URLs and other metadata fields. The manifest may be …

Request for Awesome.

I was lucky enough to spend last week at the Aspen Institute, attending the annual Forum on Communications and Society. Thirty-odd of us spent four days talking about how to make government more open and more innovative. The guest list will leave reasonable people wondering how I got invited—Madeleine Albright, Toomas Hendrik Ilves (the President …