Build real-time speech2text web apps using OpenAI's Whisper: https://openai.com/blog/whisper/

bcxbb/whisper

 
 



Whisper Playground

Instantly build speech2text apps in 99 languages using OpenAI's Whisper


Demo video: Whisper.Playground.mp4

Contribution ideas

  • Stream audio using WebSockets instead of the current approach of incrementally sending audio chunks
  • Implement diarization (speaker identification) using pyannote-audio (example)

Setup

  1. Whisper requires the command-line tools ffmpeg and portaudio to be installed on your system; both are available from most package managers:
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg
sudo apt install portaudio19-dev

# on Arch Linux
sudo pacman -S ffmpeg
sudo pacman -S portaudio

# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg
brew install portaudio

# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg

# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg
  2. Clone or fork this repository
  3. Install the backend and frontend environments: `sh install_playground.sh`
  4. Run the backend: `cd backend && source venv/bin/activate && flask run --port 8000`
  5. In a different terminal, run the React frontend: `cd interface && yarn start`
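Under the current design, the backend transcribes audio that the browser sends in small increments. The buffering logic behind that approach can be sketched as follows; this is a minimal illustration assuming 16 kHz mono 16-bit PCM (the sample rate Whisper models expect), and every name in it (`ChunkBuffer`, `window_seconds`, and so on) is hypothetical rather than the repository's actual API:

```python
from typing import Optional

SAMPLE_RATE = 16_000   # Whisper expects 16 kHz mono audio
BYTES_PER_SAMPLE = 2   # 16-bit PCM

class ChunkBuffer:
    """Accumulate raw PCM chunks until there is enough audio for one transcription pass."""

    def __init__(self, window_seconds: float = 5.0):
        self._buf = bytearray()
        self._window_bytes = int(window_seconds * SAMPLE_RATE * BYTES_PER_SAMPLE)

    def add(self, chunk: bytes) -> Optional[bytes]:
        """Append one incoming chunk; return a full window of audio once one is ready."""
        self._buf.extend(chunk)
        if len(self._buf) >= self._window_bytes:
            window = bytes(self._buf[:self._window_bytes])
            del self._buf[:self._window_bytes]  # keep any overflow for the next window
            return window
        return None

# usage: feed 1-second chunks; a window pops out after 5 seconds of audio
buf = ChunkBuffer(window_seconds=5.0)
one_second = b"\x00" * (SAMPLE_RATE * BYTES_PER_SAMPLE)
windows = [w for _ in range(5) if (w := buf.add(one_second)) is not None]
```

The window length trades latency against context: shorter windows feel more "real time" but give the model less audio to work with per pass.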

License

This repository and the code and model weights of Whisper are released under the MIT License.

