Language Model Redesign #268
Conversation
…into lm_adding_ngram
…d of phrase-level averages. Added variation and min/max to timing calculations.
…into lm_adding_ngram
…into lm_adding_ngram
…ded HuggingFace model to the eval script
…into lm_adding_ngram
…t does. Added it as an option to the Mixture model
…s. Instead just keep a list of log-probs for each character possibility. No changes in prediction probabilities with a slight decrease in prediction time.
… extending hypotheses
…tswith check since now we force that to be true. Only convert single space characters to SPACE_CHAR, not all whitespace.
… language/main. Improved efficiency of list appends in causal model. Pass current sequence string in the tuple instead of rebuilding each time.
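The commits above mention keeping a list of log-probs for each character while extending hypotheses, rather than converting to linear probabilities per hypothesis. A minimal sketch of that bookkeeping (hypothetical names; the actual CausalLanguageModel internals are not shown in this thread):

    from collections import defaultdict
    import math

    def combine_char_logprobs(hypotheses):
        # Gather the log-probs proposed for each next character across
        # all hypotheses (e.g., different tokenizations of the context).
        char_logprobs = defaultdict(list)
        for next_char, logprob in hypotheses:
            char_logprobs[next_char].append(logprob)

        # Combine each character's list with a log-sum-exp, which is
        # numerically stabler than summing linear probabilities.
        combined = {}
        for char, lps in char_logprobs.items():
            m = max(lps)
            combined[char] = m + math.log(sum(math.exp(lp - m) for lp in lps))
        return combined

    # Example: two hypotheses agree on "A", one proposes "B".
    print(combine_char_logprobs([("A", -1.2), ("A", -2.3), ("B", -0.9)]))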
This is a lot of work! Thanks for your effort on this. The main things I would like to revisit are the parameters and maybe the location of symbols. See my comments below.
Force-pushed from 027901b to 28d65e4
…with custom exception. Added install instructions to language module README
Force-pushed from 74b9a8d to ff65d67
…into lm_adding_ngram
…irectly from json
Some small cleanups are needed, and there is a typo in lm params? Otherwise, this is ready to merge and iterate on! @lawhead will want to take another look.
"kenlm": { | ||
"model_file": { | ||
"description": "Name of the pretrained model file", | ||
"value": "lm_dec19_char_large_12gram.kenlm", |
*.arpa?
We've shifted to using the binary .kenlm files instead of the .arpa files since they have faster model load times. I believe I already updated all the documentation to point to the proper .kenlm file location.
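For reference, the kenlm Python bindings load either format through the same call; the binary file is memory-mapped, which is what makes its load time so much faster. A minimal sketch, assuming the model file named in the parameter snippet above is on disk:

    import kenlm

    # Binary .kenlm file named in the parameters; kenlm.Model() also
    # accepts a text .arpa file, but the binary loads much faster.
    model = kenlm.Model("lm_dec19_char_large_12gram.kenlm")

    # For a character-level model, tokens are single characters
    # separated by spaces; score() returns a log10 probability.
    context = " ".join("HELLO")
    print(model.score(context, bos=True, eos=False))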
Overview
This PR reworks the existing GPT-2 language model implementation, which had several critical bugs. The new CausalLanguageModel class fixes these bugs and allows the insertion of any causal model from HuggingFace (or a locally trained one). This PR also includes the KenLMLanguageModel class, which implements an n-gram model, and the MixtureLanguageModel class, which allows two or more other models to be mixed.
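The thread does not spell out the MixtureLanguageModel interface, but a weighted linear interpolation of the component models' per-symbol distributions is the standard way to mix such predictors. A minimal sketch with hypothetical names:

    from typing import Dict, List

    def mix_predictions(component_probs: List[Dict[str, float]],
                        weights: List[float]) -> Dict[str, float]:
        # Weighted linear interpolation of per-symbol probabilities.
        symbols = set().union(*component_probs)
        mixed = {s: sum(w * p.get(s, 0.0)
                        for w, p in zip(weights, component_probs))
                 for s in symbols}
        # Renormalize in case the components cover different symbol sets.
        total = sum(mixed.values())
        return {s: p / total for s, p in mixed.items()}

    # Example: a causal model and an n-gram model, weighted 0.6 / 0.4.
    causal = {"A": 0.5, "B": 0.3, "C": 0.2}
    ngram = {"A": 0.4, "B": 0.4, "C": 0.2}
    print(mix_predictions([causal, ngram], [0.6, 0.4]))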
Ticket
https://www.pivotaltracker.com/story/show/183975978
https://www.pivotaltracker.com/story/show/184017969
https://www.pivotaltracker.com/story/show/184589162
https://www.pivotaltracker.com/story/show/184365440
Contributions
Test
Documentation
Changelog