Add new evaluation metrics #934
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/934
Note: links to docs will display an error until the docs builds have been completed.
✅ No failures as of commit d0743e1 with merge base 72d2518. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 5a1803e to 130bfc4
Not sure if all eval metrics are relevant for llama quant, cc @HDCharles to take a look.
We can already specify any number of tasks with the --tasks argument. Normally, if I wanted to make it easier to run a small set of hard-coded experiments, I would write an sh file that specified these things rather than modify the lower-level eval runner to have multiple ways to specify the tasks. The current solution is to specify all tests explicitly in benchmarks.sh, and we made evals.sh to do the same, though I haven't added all the tests there yet. If you want to make it easier to run those sets, I'd maybe add them there. Is this a larger suggestion that the benchmarks we list in the README should be changed as well?
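For illustration, a minimal sketch of the sh-wrapper pattern described above, assuming the runner is the repo's eval.py and that --tasks and --quantization flags are accepted as in benchmarks.sh; the checkpoint path, flag spellings, and task list here are assumptions, not this PR's actual contents.

```sh
# Hypothetical wrapper that pins down a fixed set of experiments so the
# lower-level eval runner itself stays unchanged. Checkpoint path, flag
# names, quantization names, and task names are illustrative assumptions.
export CHECKPOINT_PATH=checkpoints/meta-llama/Llama-2-7b-chat-hf/model.pth

# run the same task set for each quantization scheme of interest
for QUANT in int8wo int4wo-64; do
    python eval.py \
        --checkpoint_path "$CHECKPOINT_PATH" \
        --quantization "$QUANT" \
        --tasks wikitext hellaswag winogrande
done
```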
see comment
I think we need more eval metrics to test the different techniques on llama, though that could be done by updating eval.sh instead. We can also add more benchmarks to the README in the future.
Force-pushed from 5776bc0 to c87a56d
Force-pushed from c87a56d to d0743e1
@HDCharles just to confirm, this is what you have in mind, right?
this looks better
Added new tests to llama/eval.sh for more extensive testing over a larger set of metrics.
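For context, a hedged sketch of the kind of lines such a change might add to llama/eval.sh, mirroring the explicit per-configuration style of benchmarks.sh; the exact flags, quantization names, and task names are assumptions rather than the PR's actual diff.

```sh
# Hypothetical eval.sh additions: the same broader task set run per
# quantization scheme; flag and task names are illustrative assumptions.
python eval.py --checkpoint_path "$CHECKPOINT_PATH" --tasks wikitext winogrande arc_easy
python eval.py --checkpoint_path "$CHECKPOINT_PATH" --tasks wikitext winogrande arc_easy --quantization int8wo
python eval.py --checkpoint_path "$CHECKPOINT_PATH" --tasks wikitext winogrande arc_easy --quantization int4wo-64
```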