
My Changes

In this fork, I changed transcription so that long audio is split into sections, with each section transcribing one minute of audio. This addresses the issue where longer audio files would return an error due to their length. The first section covers the first minute (0:00 - 1:00), the second section covers the next minute (1:00 - 2:00), and so on. Below is information regarding the Whisper ASR Webservice as a whole.
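A minimal sketch of this one-minute chunking approach, assuming ffmpeg is on the PATH and using the openai-whisper Python package directly; the helper names here are illustrative, not the service's actual code:

```python
import glob
import subprocess
import tempfile

import whisper  # openai-whisper package

CHUNK_SECONDS = 60  # each section covers one minute of audio


def split_into_minutes(path: str) -> list[str]:
    """Split the input file into 60-second WAV chunks with ffmpeg (hypothetical helper)."""
    tmp_dir = tempfile.mkdtemp()
    pattern = f"{tmp_dir}/chunk_%03d.wav"
    subprocess.run(
        ["ffmpeg", "-i", path, "-f", "segment",
         "-segment_time", str(CHUNK_SECONDS), "-c:a", "pcm_s16le", pattern],
        check=True,
    )
    return sorted(glob.glob(f"{tmp_dir}/chunk_*.wav"))


def transcribe_in_sections(path: str, model_name: str = "base") -> list[dict]:
    """Transcribe each one-minute section separately and label it with its time range."""
    model = whisper.load_model(model_name)
    results = []
    for i, chunk in enumerate(split_into_minutes(path)):
        text = model.transcribe(chunk)["text"]
        results.append({"section": f"{i}:00 - {i + 1}:00", "text": text})
    return results
```

Transcribing each chunk independently keeps every call to the model short, so a long recording no longer fails as a single oversized request.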

Whisper ASR Webservice

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitask model that can perform multilingual speech recognition as well as speech translation and language identification. For more details: github.com/openai/whisper

Features

The current release (v1.4.1) supports the following Whisper models:

Quick Usage

CPU

```sh
docker run -d -p 9000:9000 -e ASR_MODEL=base -e ASR_ENGINE=openai_whisper onerahmet/openai-whisper-asr-webservice:latest
```

GPU

```sh
docker run -d --gpus all -p 9000:9000 -e ASR_MODEL=base -e ASR_ENGINE=openai_whisper onerahmet/openai-whisper-asr-webservice:latest-gpu
```
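Once the container is running, you can post an audio file to the service. The sketch below assumes the transcription endpoint is `POST /asr` with a multipart `audio_file` field and the query parameters shown; check the documentation linked below for the exact API:

```python
import requests

# Assumed endpoint and field names; verify against the service documentation.
URL = "http://localhost:9000/asr"

with open("speech.mp3", "rb") as f:
    response = requests.post(
        URL,
        params={"task": "transcribe", "output": "txt"},  # illustrative query parameters
        files={"audio_file": f},
    )

response.raise_for_status()
print(response.text)  # transcribed text
```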

For more information, see the documentation below.

Documentation

Explore the documentation by clicking here.

Credits

  • This software uses libraries from the FFmpeg project under the LGPLv2.1