
My Changes

In this fork, I changed transcription so that long audio is split into sections, with each section transcribing one minute of audio. This addresses the issue where longer audio files would return an error due to their length. The first section covers the first minute (0:00 - 1:00), the second section covers the next minute (1:00 - 2:00), and so on. Below is information regarding the Whisper ASR Webservice as a whole.
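A minimal sketch of this one-minute chunking approach, assuming ffmpeg is on the PATH and using the openai-whisper Python package directly; the helper names here are illustrative, not the service's actual code:

```python
import glob
import subprocess
import tempfile

import whisper  # openai-whisper package

CHUNK_SECONDS = 60  # each section covers one minute of audio


def split_into_minutes(path: str) -> list[str]:
    """Split the input file into 60-second WAV chunks with ffmpeg (hypothetical helper)."""
    tmp_dir = tempfile.mkdtemp()
    pattern = f"{tmp_dir}/chunk_%03d.wav"
    subprocess.run(
        ["ffmpeg", "-i", path, "-f", "segment",
         "-segment_time", str(CHUNK_SECONDS), "-c:a", "pcm_s16le", pattern],
        check=True,
    )
    return sorted(glob.glob(f"{tmp_dir}/chunk_*.wav"))


def transcribe_in_sections(path: str, model_name: str = "base") -> list[dict]:
    """Transcribe each one-minute section separately and label it with its time range."""
    model = whisper.load_model(model_name)
    results = []
    for i, chunk in enumerate(split_into_minutes(path)):
        text = model.transcribe(chunk)["text"]
        results.append({"section": f"{i}:00 - {i + 1}:00", "text": text})
    return results
```

Transcribing each chunk independently keeps every call to the model short, so a long recording no longer fails as a single oversized request.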

Whisper ASR Webservice

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitask model that can perform multilingual speech recognition as well as speech translation and language identification. For more details: github.com/openai/whisper

Features

The current release (v1.4.1) supports the following Whisper models:

Quick Usage

CPU

```sh
docker run -d -p 9000:9000 -e ASR_MODEL=base -e ASR_ENGINE=openai_whisper onerahmet/openai-whisper-asr-webservice:latest
```

GPU

```sh
docker run -d --gpus all -p 9000:9000 -e ASR_MODEL=base -e ASR_ENGINE=openai_whisper onerahmet/openai-whisper-asr-webservice:latest-gpu
```
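Once the container is running, you can post an audio file to the service. The sketch below assumes the transcription endpoint is `POST /asr` with a multipart `audio_file` field and the query parameters shown; check the documentation linked below for the exact API:

```python
import requests

# Assumed endpoint and field names; verify against the service documentation.
URL = "http://localhost:9000/asr"

with open("speech.mp3", "rb") as f:
    response = requests.post(
        URL,
        params={"task": "transcribe", "output": "txt"},  # illustrative query parameters
        files={"audio_file": f},
    )

response.raise_for_status()
print(response.text)  # transcribed text
```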

For more information, see the documentation below.

Documentation

Explore the documentation by clicking here.

Credits

  • This software uses libraries from the FFmpeg project under the LGPLv2.1