myeze/WhisperBuildTrimmer

My Changes

In this fork, I changed the transcription so that the audio is processed in sections of 1 minute each. This addresses the issue where longer audio files would return an error because of their length. The first section covers the first minute (0:00 - 1:00), the second section covers the following minute (1:00 - 2:00), and so on. The rest of this README describes the Whisper ASR Webservice as a whole.
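As a rough illustration of the sectioning described above, the sketch below computes 1-minute section boundaries for a given audio duration. It is not the code used in this repository: the function names (`minute_sections`, `format_timestamp`, `section_labels`) are hypothetical, and the handling of a shorter final section is an assumption.

```python
def minute_sections(duration_seconds, section_seconds=60):
    """Return (start, end) pairs in seconds covering the whole duration.

    Hypothetical helper: the final section is assumed to be shorter when
    the duration is not a multiple of section_seconds.
    """
    if duration_seconds < 0 or section_seconds <= 0:
        raise ValueError("duration must be >= 0 and section length > 0")
    sections = []
    start = 0
    while start < duration_seconds:
        end = min(start + section_seconds, duration_seconds)
        sections.append((start, end))
        start = end
    return sections


def format_timestamp(seconds):
    """Format a whole number of seconds as M:SS, e.g. 60 -> '1:00'."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


def section_labels(duration_seconds):
    """Build labels such as '0:00 - 1:00' for each section."""
    return [
        f"{format_timestamp(start)} - {format_timestamp(end)}"
        for start, end in minute_sections(duration_seconds)
    ]


# A 2.5-minute file yields two full sections and one shorter one.
print(section_labels(150))  # ['0:00 - 1:00', '1:00 - 2:00', '2:00 - 2:30']
```

Each section could then be transcribed on its own and the results joined in order, which keeps every individual request short.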

Whisper ASR Webservice

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitask model that can perform multilingual speech recognition as well as speech translation and language identification. For more details: github.com/openai/whisper

Features

The current release (v1.4.1) supports the following Whisper models:

Quick Usage

CPU

docker run -d -p 9000:9000 -e ASR_MODEL=base -e ASR_ENGINE=openai_whisper onerahmet/openai-whisper-asr-webservice:latest

GPU

docker run -d --gpus all -p 9000:9000 -e ASR_MODEL=base -e ASR_ENGINE=openai_whisper onerahmet/openai-whisper-asr-webservice:latest-gpu
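Once a container is running, a client can send audio to it over HTTP. The sketch below builds such a request using only the Python standard library. It assumes that the service exposes a `/asr` endpoint accepting a multipart form field named `audio_file` and the query parameters `task` and `output`, and that it listens on `localhost:9000` as in the commands above. None of these details are stated in this README, so check them against the upstream documentation. `build_asr_request` is a hypothetical helper.

```python
import urllib.parse
import urllib.request
import uuid


def build_asr_request(audio_bytes, filename,
                      base_url="http://localhost:9000",
                      task="transcribe", output="json"):
    """Build a multipart POST request for the assumed /asr endpoint.

    Assumptions (not confirmed by this README): the endpoint path, the
    'audio_file' form field name, and the 'task' / 'output' parameters.
    """
    query = urllib.parse.urlencode({"task": task, "output": output})
    url = f"{base_url}/asr?{query}"

    # Hand-rolled multipart/form-data body containing a single file part.
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio_file"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")

    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(
        url, data=head + audio_bytes + tail, headers=headers, method="POST"
    )


def transcribe(path):
    """Send a local audio file to the running container; return the reply."""
    with open(path, "rb") as handle:
        request = build_asr_request(handle.read(), filename=path)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `transcribe("sample.wav")`, where `sample.wav` is a placeholder file name, would return the response body as text if the assumptions above hold.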

For more information:

Documentation

Explore the documentation by clicking here.

Credits

  • This software uses libraries from the FFmpeg project under the LGPLv2.1
