
Phrase hints in inference calls #1821

Open
pvanickova opened this issue Jan 7, 2019 · 9 comments

@pvanickova

It would be helpful to provide phrase hints (context words) at inference time to boost the probability of certain domain-specific phrases in the transcription.

E.g. when passing audio to the Python API, the user could pass a list of phrases that are likely in the context:

```python
phrase_hints = ['transverse compound fracture', 'high bp', 'per os']
ds.enableDecoderWithLM(args.alphabet, args.lm, args.trie, LM_ALPHA, LM_BETA, phrase_hints)
```
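To make the request concrete, here is a toy re-ranking sketch of the idea. This is not the DeepSpeech API; `rerank_with_hints` and `HINT_BONUS` are invented names used only to illustrate how hint phrases could bias candidate scoring:

```python
# Toy sketch only: NOT the DeepSpeech API. Illustrates how hint phrases
# could bias scoring toward domain-specific wordings.

HINT_BONUS = 2.0  # assumed log-space bonus per matched hint phrase

def rerank_with_hints(candidates, phrase_hints, bonus=HINT_BONUS):
    """Re-rank (transcript, score) pairs, boosting hinted phrases."""
    def boosted(item):
        text, score = item
        matches = sum(1 for phrase in phrase_hints if phrase in text)
        return score + bonus * matches
    return sorted(candidates, key=boosted, reverse=True)

candidates = [
    ("the patient has high b p", -12.0),   # better base score
    ("the patient has high bp", -12.5),    # contains the hinted phrase
]
best = rerank_with_hints(candidates, ["high bp"])[0][0]
# best == "the patient has high bp": the hint outweighs the score gap
```

In a real decoder the bonus would be applied inside the beam search rather than as a post-hoc re-rank, but the effect is the same: hinted phrases survive with slightly worse acoustic scores.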

@kdavis-mozilla
Contributor

Couldn't this be addressed by a custom language model?

@pvanickova
Author

The context may change dynamically: something that is context for one inference wouldn't be context for another. For example, different departments use different terminology, different shops have different inventory, and different parts of an app may have different context options.

Rebuilding the language model for each case would mean a lot of language models and very frequent updates of the models with new phrases.

Plus, updating a general English language model with just a few high-probability phrases would require generating a lot of dummy text to assign each phrase enough probability (just guessing about this one).

@kdavis-mozilla
Contributor

Good point.

One of the things we are thinking about is the ability to dynamically change language models, see #1678 (Allow use of several decoders (language models) with a single model in the API). Would that be a close enough fit to your use case? (I know you'd still have to create several language models, which may be too much of a pain.)

The reason I'm asking is we are trying to decide how to best add just this functionality.

@pvanickova
Author

I've added my comments for the multiple language model feature in its thread.

Having the option to provide a list of expected phrases for the context would still be very useful in my scenario (pulling a subset of hint phrases from a frequently updated dictionary based on the source of the call).

Once there's a good way to combine probability from multiple language models, this might be implemented as an additional on-the-fly generated mini language model with high probabilities of the injected phrases perhaps?
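A minimal sketch of this mini-LM idea, assuming log-space LM scores and simple linear interpolation in probability space. None of these functions exist in DeepSpeech; they are invented to show the mechanics:

```python
import math

# Hypothetical sketch: build a tiny phrase "LM" from the injected hints on
# the fly, then interpolate it with a base LM in probability space.

def make_phrase_lm(phrase_hints, boost_logprob=-1.0, floor_logprob=-20.0):
    """Return a scorer that assigns hinted phrases a high log-probability."""
    hints = set(phrase_hints)
    return lambda phrase: boost_logprob if phrase in hints else floor_logprob

def interpolate(base_logprob, phrase_logprob, lam=0.3):
    """Linearly mix two LM probabilities; lam weights the phrase mini-LM."""
    mixed = (1 - lam) * math.exp(base_logprob) + lam * math.exp(phrase_logprob)
    return math.log(mixed)

base_lm = lambda phrase: -8.0            # stand-in for a general English LM
mini_lm = make_phrase_lm(["per os"])     # on-the-fly mini LM from the hints

hinted = interpolate(base_lm("per os"), mini_lm("per os"))
other = interpolate(base_lm("by mouth"), mini_lm("by mouth"))
# hinted > other: the injected phrase gets a higher combined score
```

Because the mini LM is tiny, it could be rebuilt per call without the cost of retraining the general model, which is the attraction of the approach.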

@kdavis-mozilla
Contributor

Thanks!

@axchanda

axchanda commented Feb 4, 2019

> Once there's a good way to combine probability from multiple language models, this might be implemented as an additional on-the-fly generated mini language model with high probabilities of the injected phrases perhaps?

@pvanickova Have you got the required phrase hints working? I am also looking for the same. Please help me out! Thanks!

@SephVelut

Even with dynamic models, it's more accurate to provide context in the form of phrase hints at inference time. A language model built with those phrase hints would apply to every inference, whereas you often want certain phrases to apply only to certain inferences during a session, not all of them.

@nmstoker
Contributor

nmstoker commented Sep 4, 2019

If #432 is completed, people will be able to experiment with ways of handling hints and context assistance more easily (possibly with a view to then including the more broadly applicable, successful approaches in the API).

I like the hints idea, but I think it would be valuable to gather together the distinct kinds of scenarios people want to solve. In some cases distinct LMs make sense (switching between them or combining them, e.g. to extend the vocabulary), and in others hints of specific words, or potentially classes of word, make sense (e.g. if you expect a number reply, it could be handy to bias in favour of numbers whilst still coping with other kinds of response).
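The number-reply scenario above can be sketched as a toy class-based bias, with all names invented for illustration:

```python
# Illustrative sketch of class-based biasing: when the application expects a
# numeric reply, candidates containing number words receive a score bonus,
# while other sessions are left unbiased.

NUMBER_WORDS = {"zero", "one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten"}

def class_bias(transcript, expected_class, bonus=1.5):
    """Bonus per number word, applied only when a number reply is expected."""
    if expected_class != "number":
        return 0.0
    return bonus * sum(1 for w in transcript.split() if w in NUMBER_WORDS)

def pick_best(candidates, expected_class=None):
    """candidates: (transcript, score) pairs from the decoder."""
    def total(item):
        text, score = item
        bias = class_bias(text, expected_class) if expected_class else 0.0
        return score + bias
    return max(candidates, key=total)[0]

candidates = [("won", -3.0), ("one", -3.4)]
print(pick_best(candidates))            # prints "won" (no context)
print(pick_best(candidates, "number"))  # prints "one" (number expected)
```

The key property is that the bias is a per-call argument, so the same acoustic model and base LM serve every session unchanged.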

@MrityunjoyS

I'm also trying to use hinting and substitution methods to rectify errors and improve recognition. I'm using a DeepSpeech model only as the ASR. I've used the DeepSpeech 2 model to build my own pbm and scorer, as I'm trying to improve the ASR for the Hindi language. I'm facing issues like: while saying "Haa", the model only catches "a". I need to rectify that; can you please suggest how I can implement 'hints' or 'substitution' for it?
