-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Support for Non-OpenAI Models in Token Trimming #1605
Conversation
THANK YOU for doing this! It has been wrong since the beginning lol so much appreciated! This is an important fix! |
6ca727a
to
fc638e8
Compare
8f5441a
to
af97657
Compare
Updated. So now, if people don't specify |
Great! |
But I'm not sure if we want to make it a service. If we can resolve the onnxruntime-node issue, maybe we could move it to the core. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - thanks for creating a thoughtful solution to this. It could use some more thorough testing - given lots of callsites are changed, but fundamentally looks good. It could be moved into core as well and probably should be
We should consider adding another wrapper trimTokensByModelClass(context, modelClass) which under the hood gets the model and the maxInputTokens from it and use that in generateObject etc wherever we are doing it based off of the model defaults and not a configured value.
updated. moved it to core |
Test Scenarios:
Results:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is well done.
Great work!
related:
#1439
#1565
need to be tested: #1439 (comment)