Hey guys!
From experience with your models, I know that with an external scorer enabled (an n-gram LM), the model will never predict anything that is not in the vocabulary of the LM. But I was wondering how this actually works, because I could not quite find that mechanism in the code.
`make_ngram` will convert a prefix into a list of words (STT/native_client/ctcdecode/scorer.cpp, lines 369 to 396 in bb75afb).
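If I read it right, the idea is roughly the following. This is a simplified sketch over a plain character string; as far as I can tell, the real `make_ngram` walks the decoder's prefix trie instead:

```cpp
#include <string>
#include <vector>

// Simplified sketch of the idea (not the actual scorer.cpp code):
// split the decoded character prefix at spaces and keep the last
// `order` words as n-gram context for the language model.
std::vector<std::string> make_ngram_sketch(const std::string& prefix,
                                           size_t order) {
  std::vector<std::string> words;
  std::string current;
  for (char c : prefix) {
    if (c == ' ') {
      if (!current.empty()) {
        words.push_back(current);
        current.clear();
      }
    } else {
      current += c;
    }
  }
  // The trailing, possibly still incomplete, word is included too,
  // since it is the one the scorer is about to judge.
  if (!current.empty()) {
    words.push_back(current);
  }
  // Keep at most `order` words of n-gram context.
  if (words.size() > order) {
    words.erase(words.begin(), words.end() - order);
  }
  return words;
}
```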
The scorer will give a harsh `OOV_SCORE` of -1000.0 whenever a word is not in the language model vocabulary (STT/native_client/ctcdecode/scorer.cpp, lines 329 to 331 in bb75afb).
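So the check itself seems to be nothing more than a vocabulary lookup plus a constant penalty, something like this minimal sketch (a plain set stands in for the KenLM vocabulary lookup):

```cpp
#include <string>
#include <unordered_set>

// The -1000.0 is the OOV_SCORE cited above; everything else here is
// a stand-in for the actual KenLM-backed lookup in scorer.cpp.
constexpr double kOovScore = -1000.0;

double score_last_word(const std::unordered_set<std::string>& lm_vocab,
                       const std::string& word,
                       double lm_log_prob) {
  // An out-of-vocabulary word is not rejected outright; it just gets a
  // log score so low that in-vocabulary alternatives dominate the beam.
  if (lm_vocab.find(word) == lm_vocab.end()) {
    return kOovScore;
  }
  return lm_log_prob;
}
```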
But what happens if a character of a word is silent and cannot really be predicted by the acoustic model? That character would have a very low probability and is possibly pruned in the decoding step (e.g., the silent "b" in "doubt": if it is pruned, the hypothesis becomes "dout", which is out of vocabulary).
If the character is missing, how can the n-gram LM still reconstruct a full word and not always return OOV and discard the prefix?
Or is it the job of the acoustic model to also output characters that are silent?
Coqui with the LM scorer seems to behave like "find the closest valid hypothesis made of in-vocabulary words", but how is this behaviour enforced?
Replies: 1 comment

You got it, it's not properly enforced, just heavily downscored. If you have lax enough pruning parameters, the decoder will still explore low-probability AM labels, which will then get boosted by the LM scores of valid words.
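To make that concrete, here is a minimal sketch of the usual shallow-fusion combination, assuming the common total = AM + alpha·LM + beta·word_count form. `alpha` and `beta` correspond to Coqui's lm_alpha / lm_beta parameters; the function itself is illustrative, not the actual decoder code:

```cpp
// Illustrative per-hypothesis score in the usual shallow-fusion form.
double hypothesis_score(double am_log_prob,  // sum of the chosen label log-probs
                        double lm_log_prob,  // LM score of the word sequence
                        int word_count,      // number of completed words
                        double alpha,        // LM weight (Coqui: lm_alpha)
                        double beta) {       // word insertion bonus (Coqui: lm_beta)
  return am_log_prob + alpha * lm_log_prob + beta * word_count;
}
```

A silent character drags `am_log_prob` down, but if the beam width and pruning cutoffs are lax enough for the hypothesis to survive to the word boundary, a strong `lm_log_prob` for the completed in-vocabulary word can lift it back above its rivals; with aggressive pruning it is dropped before the LM ever sees the full word.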