Using Decoder on single audio file #907
Comments
@ishan-modi: Only |
Thank you for the response, got it! I am now running the Flashlight backend models (link), and I want to recreate beam search decoding for a single audio file.
@ishan-modi: Take a look at the instructions about how to prepare data for training (and testing). Also, if disk space and internet bandwidth are not a problem, try running the data preparation scripts for one of the recipes. That will download the Librispeech data and lay it out in the format that the Train/Test/Decode binaries expect (including .wav and .lst files). You may also want to edit the subject title of this post for the benefit of others.
Just a quick answer on the list file: the expected format has 3 or 4 columns, separated by tabs or spaces.
I have some related doubts about that thread.
Thank you so much for the response. The issue is resolved!
Answers
-> 1 /home/../1.wav 1234.34 hello world
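To make the column layout concrete, here is a minimal sketch of a parser for one list-file line. It assumes the columns are, in order, a sample ID, an audio path, a numeric size, and an optional transcription, which matches the example line above; the names `ListEntry` and `parse_list_line` are hypothetical and not part of wav2letter.

```python
from typing import NamedTuple, Optional


class ListEntry(NamedTuple):
    sample_id: str
    audio_path: str
    size: float                 # third column, e.g. 1234.34 in the example
    transcript: Optional[str]   # fourth column, may be absent


def parse_list_line(line: str) -> ListEntry:
    """Split one list-file line into its 3 or 4 columns."""
    # Columns may be separated by tabs or spaces. The transcription, if
    # present, is everything after the third column and may contain spaces.
    parts = line.strip().split(None, 3)
    if len(parts) < 3:
        raise ValueError(f"expected 3 or 4 columns, got {len(parts)}: {line!r}")
    sample_id, audio_path, size = parts[:3]
    transcript = parts[3] if len(parts) == 4 else None
    return ListEntry(sample_id, audio_path, float(size), transcript)


entry = parse_list_line("1 /home/../1.wav 1234.34 hello world")
print(entry)
```

Because this sketch splits on any whitespace, an audio path that itself contains spaces would be misparsed. A list file that contains only the audio path, with no ID or size column, is rejected with a `ValueError`.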
Hi @ishan-modi
@Adportas Inference is done purely on CPU (in a streaming fashion), while decode.cpp runs on either CPU or GPU for any network and then does beam search decoding on CPU. Inference currently works only with conv-type networks. The decoder takes a list file and predicts transcriptions, so you don't need real targets. However, decode.cpp currently computes WER in every case. If you provide placeholder targets (people have reported a bug with empty targets, so please put fake text there), you still obtain predictions and a WER, and you can simply ignore the WER. So please just use decode.cpp with some fake transcripts (or even try empty strings there)!
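As a rough illustration of the advice above, the following sketch writes a one-line list file for a single WAV file with a fake transcript. It relies on several assumptions not confirmed in this thread: that the third column is the audio duration in milliseconds, that tab-separated columns are accepted, and that any placeholder text works as the target. The helper names `wav_duration_ms` and `write_single_file_list` are hypothetical.

```python
import tempfile
import wave
from pathlib import Path


def wav_duration_ms(wav_path: str) -> float:
    """Return the duration of a WAV file in milliseconds."""
    with wave.open(wav_path, "rb") as wav:
        return 1000.0 * wav.getnframes() / wav.getframerate()


def write_single_file_list(wav_path, list_path, sample_id="1",
                           transcript="placeholder"):
    """Write a list file with one entry: id, path, size, fake transcript."""
    abs_path = Path(wav_path).resolve()
    duration = wav_duration_ms(str(abs_path))  # assumed unit: milliseconds
    line = f"{sample_id}\t{abs_path}\t{duration:.2f}\t{transcript}\n"
    Path(list_path).write_text(line, encoding="utf-8")
    return line


# Demo on a generated one-second silent WAV so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    demo_wav = str(Path(tmp) / "demo.wav")
    with wave.open(demo_wav, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)          # 16-bit samples
        out.setframerate(16000)
        out.writeframes(b"\x00\x00" * 16000)
    print(write_single_file_list(demo_wav, str(Path(tmp) / "test.lst")))
```

The resulting file would then be the one passed to the decoder through `--test`. Whether the placeholder words need to appear in the lexicon is not stated in this thread, so treat the transcript value as something to adjust.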
How to use the decoder?
I have downloaded the model files as mentioned in
https://github.com/facebookresearch/wav2letter/wiki/Inference-Run-Examples#download-the-example-trained-models-from-aws-s3
Now I want to use the decoder to decode an audio file.
I have created a decoder.cfg:
--am=path to acoustic_model.bin
--test=path to train.lst
--show
--showletters
--uselexicon=true
--lm=path to language_model.bin
--lmtype=kenlm
--decodertype=wrd
--lmweight=2.5
--wordscore=1
--beamsize=500
--beamthreshold=25
--silweight=-0.5
--nthread_decoder=4
--smearing=max
--show=true
The train.lst contains the path to my audio file.
I am a bit new to this framework, so please guide me through and correct me if I am wrong.