How to Run OpenAI Whisper in Google Colab
Ross O'Connell
OpenAI's Whisper is an exciting new model for automatic speech recognition (ASR). It features a simple architecture based on transformers, the same technology that drove recent advancements in natural language processing (NLP), and was trained on 680,000 hours of audio from a wide range of languages. The result is a new leader in open-source solutions for ASR.
The researchers at Deepgram have enjoyed testing Whisper and seeing how it works, and we wanted to make it as easy as possible for you to try it out too. One of the things we've learned in our experiments is that, as with many deep-learning tools, Whisper performs best when it has access to a GPU. While downloading and installing Whisper may be straightforward, configuring it to properly utilize a GPU (if you have one!) is a potential roadblock.
Google Colab provides a great preconfigured environment for trying out new tools like Whisper, so we've set up a simple notebook there to let you see what Whisper can do. We set up the notebook so that you don't need anything extra to run it, you can just click through and go. The notebook will:
Install Whisper
Download audio from YouTube
Transcribe that audio with Whisper