Currently Faster-Whisper only allows you to specify a single language or attempt to detect the language out of a pool of 94 languages. I would like to be able to limit which languages can be detected. Something like the following would limit autodetection to only English, Spanish and French:
model.transcribe("audio.mp3", beam_size=5, language=["en", "es", "fr"])
You can already do this. The detect_language function returns the probabilities of all languages, so you can exclude all languages except these three, choose the one with the highest probability, and pass it manually to transcribe.
I see, I don't think the version of Faster-Whisper I was using (1.0.3) allowed you to return language probabilities like this. I wrote some code to return the desired languages. It works fine, but I still think it would be simpler for the user if you could just pass a language list to the transcribe function. I'll let you decide whether to close this issue or not.
```python
from faster_whisper import WhisperModel
from scipy.io import wavfile


def limit_languages(audio, allowed_languages):
    # Assumes the WAV file is 16 kHz mono, which is what Whisper models expect.
    sampling_rate, audio_data = wavfile.read(audio)
    model = WhisperModel("large-v2", device="cpu", compute_type="int8")
    language, language_probability, all_language_probs = model.detect_language(audio_data)

    # Keep the highest-probability language among the allowed ones.
    score = 0
    detected_language = None  # returned as-is if no allowed language is found
    for language_code, language_prob in all_language_probs:
        if language_code in allowed_languages and language_prob > score:
            score = language_prob
            detected_language = language_code
    return detected_language
```
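The selection step can also be separated from the model call, which makes it easy to check without loading a model. The following is a minimal sketch, not part of Faster-Whisper: `pick_allowed_language` is a hypothetical helper name, and it assumes `all_language_probs` is a list of `(language_code, probability)` pairs, as in the loop above. The probabilities in the example are made up for illustration.

```python
from typing import Iterable, Optional, Sequence, Tuple


def pick_allowed_language(
    all_language_probs: Iterable[Tuple[str, float]],
    allowed_languages: Sequence[str],
) -> Optional[str]:
    """Return the most probable language among allowed_languages, or None.

    Hypothetical helper: assumes all_language_probs holds
    (language_code, probability) pairs.
    """
    allowed = set(allowed_languages)
    candidates = [(prob, code) for code, prob in all_language_probs if code in allowed]
    if not candidates:
        return None
    # max() on (probability, code) tuples picks the highest probability.
    best_prob, best_code = max(candidates)
    return best_code


# Example with made-up probabilities (not real model output).
fake_probs = [("de", 0.55), ("en", 0.30), ("es", 0.10), ("fr", 0.05)]
print(pick_allowed_language(fake_probs, ["en", "es", "fr"]))  # prints: en
```

The returned code can then be passed manually to transcribe, for example `model.transcribe("audio.mp3", beam_size=5, language=detected_language)`, which is the step suggested earlier in the thread. Returning `None` when nothing matches leaves the caller free to fall back to normal autodetection.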