It took just 27 hours for the $500 speech transcription bounty to be claimed. Aaron Williamson produced youtube-transcription, a Python-based pair of scripts that upload video to YouTube and download the resulting machine-generated transcripts of speech. It took me longer to find the time to test it out than it did for Aaron to write it. But I finally did test it, and it works quite well.
There are lots of changes and features that I’d like to see, and the beauty of open source software is that those changes don’t need to be Aaron’s problem: I (and anybody else) can make whatever changes I see fit.
This will be pressed into service on Richmond Sunlight ASAP. Thanks to Matt Cutts for the idea, and to the 95 people who backed this project on Kickstarter, since they’re the ones who funded this effort.
Thanks for sharing the news with us so quickly!
I’ve been manually downloading the XML files that YouTube generates and then converting them using this: https://gist.github.com/mbarkhau/2563730
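For anyone taking the manual route, here is a minimal sketch of what that conversion step might look like in Python. It assumes the downloaded captions use a timed-text layout of `<text start="..." dur="...">` elements inside a single root element, which may not match every file YouTube produces. The function names `srt_timestamp` and `captions_xml_to_srt` are made up for illustration; this is not the code from the linked gist or from youtube-transcription.

```python
import html
import xml.etree.ElementTree as ET


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def captions_xml_to_srt(xml_text: str) -> str:
    """Convert caption XML (assumed <text start= dur=> elements) to SRT text."""
    root = ET.fromstring(xml_text)
    blocks = []
    for index, node in enumerate(root.iter("text"), start=1):
        start = float(node.get("start", "0"))
        duration = float(node.get("dur", "0"))
        # Caption text may still contain HTML entities after XML parsing,
        # so unescape it once more before writing it out.
        content = html.unescape(node.text or "").strip()
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(start + duration)}\n"
            f"{content}\n"
        )
    # SRT entries are separated by a blank line.
    return "\n".join(blocks)
```

Calling `captions_xml_to_srt` on the contents of a downloaded file returns a string that can be written straight to a `.srt` file; speaker labels, overlapping captions, and missing `dur` attributes are not handled here.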