For this experiment, we've been beta-testing an app called AutoEdit, developed by Pietro Passarelli during his Mozilla Fellowship at Vox Media. The application is open source and provides an impressively full-featured front-end on top of IBM's Watson Speech-to-Text API.
Here's a video showing the application's capabilities, which include the ability to edit the text of a transcription and export an EDL with key soundbites for Premiere.
IBM's API offers speech-to-text transcription at a cost of 2¢ per minute of audio. The first 1,000 minutes of audio each month are free. That is a significant savings compared to human-powered transcription services like transcribeme.com, which cost 79¢/min and up, and even commercial hybrids like Trint, which cost 25¢/min.
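To make the price gap concrete, here is a minimal sketch that applies the per-minute rates quoted above to a monthly volume. The 1,500-minute volume and the function name are assumptions chosen for illustration, and the human-transcription figure is a lower bound because that rate is "79¢/min and up".

```python
def monthly_cost_cents(minutes, rate_cents_per_min, free_minutes=0):
    """Return the monthly cost in cents for a given number of audio minutes.

    Minutes covered by the free allowance are not billed.
    """
    billable_minutes = max(0, minutes - free_minutes)
    return billable_minutes * rate_cents_per_min


# Hypothetical monthly volume, not a figure from our testing.
minutes = 1500

# Rates as quoted in the text above.
ibm = monthly_cost_cents(minutes, 2, free_minutes=1000)
trint = monthly_cost_cents(minutes, 25)
human = monthly_cost_cents(minutes, 79)  # lower bound: "79 cents and up"

for name, cents in [("IBM", ibm), ("Trint", trint), ("Human", human)]:
    print(f"{name}: ${cents / 100:,.2f}")
```

At that volume only 500 minutes are billable on IBM, so the comparison is roughly $10 against $375 for Trint and at least $1,185 for human transcription.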
When you launch AutoEdit for the first time, you'll be asked for an IBM BlueMix username and password.
You can either obtain your own API keys from IBM BlueMix or, if you're an employee of McClatchy, use my API key for limited testing purposes.
Quality of Transcription
Here is a sample clip and the accompanying transcription.
Key: correctly transcribed text is left as-is; each error is followed by the corrected text in braces, as in Error{Corrected Text}.
IBM Speech-to-text
"hi thanks for calling voters make a {the} call you here in Africa {hear enough from us} we want to hear your thoughts on the two thousand sixteen presidential election Wheatley pressure {Please leave us your} name where you're from and why or why not you're going to go{vote} and if you're voting you{who} will you vote for and why
I'm voting for Hillary Clinton because trump is a crook either so I stir in a school in{He's a shyster and he's foolin' } a lot of the American people I am voting for trump
I don't care for Hillary pointed{for the} corruption the war crimes the fry yeah{fraud, the} foundations Russia{I'm not sure what I'm gonna do} I don't know do I have a hard time voting for trump I think he's a fool but I might vote for him just to throw the whole system in the basket"
Google Speech-to-text (TBC)
Unable to Translate First Voice {Hi thanks for calling voters make the call you hear enough from us we want to hear your thoughts on the two thousand sixteen presidential election Please leave us your name where you're from and why or why not you're going to vote and if you're voting you{who} will you vote for and why
I'm Voting for Hillary Clinton because Trump is a crook he's a shyster they still out of{and he's foolin' a lot of} the American people
I am voting for Trump I don't care for himor {Hillary, for} the corruption the war crimes the Frog{Fraud}{the} Foundation{s}. {I'm not sure what I'm going to do} I have a hard time voting for Trump and{I think he's a fool but} I might go for Injustice on {vote for him just to throw} the whole system in the basket.
Next Steps
Distribute to more people for testing
Rate accuracy
Try other speech-to-text services
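One possible way to rate accuracy is word error rate (WER): the number of word-level substitutions, insertions, and deletions needed to turn the machine transcript into the corrected one, divided by the number of words in the corrected one. We have not settled on a metric, so treat this as a sketch under that assumption; the function name is hypothetical, and the sample phrase is taken from the IBM transcript above.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between two texts, divided by reference length.

    reference: the human-corrected transcript.
    hypothesis: the machine transcript.
    Comparison is case-insensitive and splits on whitespace only.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# "a" was transcribed where the speaker said "the": 1 error in 4 words.
print(word_error_rate("voters make the call", "voters make a call"))
```

Because the comparison ignores punctuation only if both texts omit it, punctuation should be stripped from the corrected transcript first if the services are to be compared on words alone.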
Feedback
So far we haven't had anyone choose to use the application in practice.
The setup process was challenging on previous versions of AutoEdit.
The accuracy of the transcription is hit-or-miss, depending on audio quality and the speaker's accent.
It also struggles with proper nouns.
The Sandbox team had to be directly involved in previous tests, making timing a challenge.
Speech-to-text transcriptions don't handle punctuation at all.