Increasing OER discoverability by improved keyword metadata via automatic speech to text transcription

Can we use automatic speech-to-text technology to generate better cataloguing of open media content?

SPINDLE brings together the media and podcasting team at OUCS with experts in linguistics and speech technology from the Phonetics Laboratory to work on the University’s growing collection of video and audio podcasts of lectures. The project will use speech-to-text technologies to automatically create transcripts of lectures in order to generate keywords which will help with the indexing, description and discovery of the lectures.

The team will also investigate further potential uses of the transcripts, such as screen-reading and full-text search, and will build demonstrators of synchronised subtitling, as well as writing blogs on the various technical options, standards, barriers, problems, and work-arounds, to assist other institutions which may be considering developing similar or related services.
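To give a sense of what a synchronised-subtitling demonstrator involves, the sketch below turns word-level timings into a WebVTT file, the caption format read by the HTML5 track element. It is a minimal illustration, not the project's own code: the input shape (word, start, end in seconds), the function names and the fixed cue length of seven words are all assumptions made for the example.

```python
def format_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    millis = int(round(seconds * 1000))
    hours, rem = divmod(millis, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def words_to_webvtt(words, max_words=7):
    """Group timed words into subtitle cues and return WebVTT text.

    words: list of (word, start_seconds, end_seconds) tuples. This is an
    assumed input shape; real recogniser output will differ.
    max_words: hypothetical cue length, chosen only for illustration.
    """
    lines = ["WEBVTT", ""]
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = chunk[0][1]   # start time of the first word in the cue
        end = chunk[-1][2]    # end time of the last word in the cue
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(" ".join(word for word, _, _ in chunk))
        lines.append("")      # a blank line terminates each cue
    return "\n".join(lines)
```

The resulting text can be saved with a .vtt extension and referenced from a track element inside an HTML5 video or audio element. Splitting cues at sentence or pause boundaries rather than every seven words would read better, but needs punctuation or pause information that raw recogniser output may not provide.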

SPINDLE will create linguistic analysis tools to pick out uncommon spoken words from the automatically generated word-level transcriptions, which will be obtained using Large Vocabulary Continuous Speech Recognition (LVCSR) software. SPINDLE will use this analysis to generate a keyword corpus for enriching metadata, and to provide scope for indexing inside rich media content using HTML5. Intra-content indexing also widens the scope for using associated OER resources, thus turning the tool into a potential auto-remixer.
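The keyword step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not SPINDLE's actual analysis: the transcript is assumed to be a sequence of (word, start time) pairs, the tiny COMMON_WORDS set stands in for a proper reference list of frequent words drawn from a large general-language corpus, and the min_count threshold is a hypothetical parameter.

```python
from collections import defaultdict

# Hypothetical stand-in for a reference list of common English words;
# a real system would derive this from a large general-language corpus.
COMMON_WORDS = {"the", "a", "of", "and", "to", "in", "is", "that", "we", "this", "it", "on"}


def extract_keywords(transcript, common_words=COMMON_WORDS, min_count=2):
    """Return {keyword: [start times]} for uncommon words spoken repeatedly.

    transcript: iterable of (word, start_seconds) pairs, an assumed shape
    for word-level recogniser output.
    min_count: hypothetical threshold; words heard fewer times are dropped.
    """
    times = defaultdict(list)
    for word, start in transcript:
        # Normalise case and strip surrounding punctuation.
        token = word.lower().strip(".,;:!?\"'")
        if token and token not in common_words:
            times[token].append(start)
    # Keep only words that recur, since one-off words are often recognition errors.
    return {word: starts for word, starts in times.items() if len(starts) >= min_count}
```

The keys of the returned dictionary are candidate keywords for the metadata record, and the start times attached to each key give the positions inside the recording that intra-content indexing could link to.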

University of Oxford podcasts are freely available at http://podcasts.ox.ac.uk/

SPINDLE project reports and blog posts - http://blogs.oucs.ox.ac.uk/openspires/category/spindle/


A. Oxford Free Speech Debate lecture with Jimmy Wales (software used: Adobe Premiere Speech to Text plugin)

External tutorials on synchronised subtitling in HTML5 (IE10 and Chrome) - http://www.html5rocks.com/en/tutorials/track/basics/

