$500 bounty for a speech transcription program.
The world needs an API to automatically generate transcript captions for videos. I am offering a $500 bounty for a program that does this via YouTube’s built-in machine transcription functionality. It should work in approximately this manner:
- Accepts a manifest that lists one or more video URLs and other metadata fields. The manifest may be in any common, reasonable format (e.g., JSON, CSV, XML).
- Retrieves each video from its URL and stores it on the filesystem.
- Uploads the video to YouTube, appending the other metadata fields to the request.
- Deletes the video from the filesystem.
- Downloads the resulting caption file, storing it with a unique name that can be connected back to a unique field contained within the manifest (e.g., a unique ID metadata field).
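As a rough illustration of the first and last steps above, here is a minimal Python sketch of reading a manifest and deriving a caption filename that maps back to it. It rests on assumptions the spec does not make: the field names `id` and `url`, the `captions` output directory, and the `.srt` extension are all hypothetical, and only JSON and CSV manifests are handled. The retrieval, upload, and caption download steps are deliberately left out.

```python
import csv
import json
import re
from pathlib import Path


def load_manifest(path):
    """Return a list of dicts, one per video, from a JSON or CSV manifest.

    Assumes (hypothetically) that every entry carries an 'id' and a 'url'
    field; any other fields are kept as-is to be passed along as metadata.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        with path.open(encoding="utf-8") as fh:
            entries = json.load(fh)  # expected: a list of objects
    elif suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as fh:
            entries = list(csv.DictReader(fh))  # header row names the fields
    else:
        raise ValueError(f"unsupported manifest format: {suffix!r}")

    seen = set()
    for entry in entries:
        if entry.get("id") in (None, "") or entry.get("url") in (None, ""):
            raise ValueError(f"manifest entry needs 'id' and 'url': {entry!r}")
        key = str(entry["id"])
        if key in seen:
            raise ValueError(f"duplicate id in manifest: {key}")
        seen.add(key)
    return entries


def caption_path(entry, out_dir="captions", ext="srt"):
    """Build a caption filename from the entry's unique ID.

    Characters that are awkward in filenames are replaced with '_'. Note
    that two different IDs could collapse to the same name this way, so a
    real program would want to check for that.
    """
    safe_id = re.sub(r"[^A-Za-z0-9._-]", "_", str(entry["id"]))
    return Path(out_dir) / f"{safe_id}.{ext}"
```

With a manifest entry whose `id` is `hearing-42`, `caption_path` would give `captions/hearing-42.srt`, so the caption file can be matched back to its manifest row by name alone.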
Rules
- Must be written in a common, non-compiled language (e.g., Python, PHP, Perl, Ruby), must require no special setup or server configuration, and must run on any standard, out-of-the-box Linux distribution.
- Must run at the command line. (It’s fine to provide additional interfaces.)
- May have additional features and options.
- May use existing open source components (of course). This is not a clean-room implementation.
- May be divided into multiple programs (e.g., one to parse the manifest and retrieve the specified videos, one to submit the video to YouTube, and one to poll YouTube for the completed transcripts), or combined as one.
- Must be released under the GPL, MIT, or Apache license. Other licenses may be considered.
- If multiple parties develop the program collaboratively, it’s up to them to determine how to divide the bounty. If they cannot come to an agreement within seven days, the bounty will be donated to the 501(c)(3) of my choosing.
- The first person to provide functioning code that meets the specifications will receive the bounty.
- Anybody who delivers incomplete code, or who delivers complete code after somebody else has already done so, will receive a firm handshake and the thanks of a grateful nation.
- If nobody delivers a completed product within 30 days then I may, at my discretion, award some or all of the bounty to whoever has gotten closest to completion.
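For the command-line and multi-program rules above, one possible shape (a sketch, not a requirement) is a single entry point with one subcommand per stage, so the stages can be run together or separately. The program name `transcribe`, the subcommand names `fetch`, `upload`, and `poll`, and the `--video-dir` and `--caption-dir` options are all hypothetical. The stages themselves are stubbed out here.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per stage (names are hypothetical)."""
    parser = argparse.ArgumentParser(
        prog="transcribe",
        description="Generate captions for the videos listed in a manifest.",
    )
    sub = parser.add_subparsers(dest="command", required=True)

    fetch = sub.add_parser("fetch", help="retrieve the videos listed in the manifest")
    fetch.add_argument("manifest")
    fetch.add_argument("--video-dir", default="videos")

    upload = sub.add_parser("upload", help="submit the retrieved videos to YouTube")
    upload.add_argument("manifest")
    upload.add_argument("--video-dir", default="videos")

    poll = sub.add_parser("poll", help="check for and download completed captions")
    poll.add_argument("manifest")
    poll.add_argument("--caption-dir", default="captions")

    return parser


def main(argv=None):
    """Parse arguments and dispatch; each stage is only a placeholder here."""
    args = build_parser().parse_args(argv)
    print(f"would run stage {args.command!r} on {args.manifest}")
    return 0
```

Calling `main()` from a `__main__` guard would let the same file serve as one combined program, while each subcommand could just as easily live in its own script.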
Participants are encouraged to develop in the open, on GitHub, and to comment here with a link to their repository, so that others may observe their work, and perhaps join in.
This bounty is funded entirely by the 95 folks who backed this Kickstarter project, though I suppose especially by those people who kept backing the project even after the goal was met. I deserve zero credit for it.