Fix cost_estimate.py #1810
base: main
Conversation
Fixes several bugs to get cost_estimate.py working again. Bugs include:

```
python scripts/cost_estimate.py
...
  File "/data/steven_test/workspace/lm-evaluation-harness/scripts/cost_estimate.py", line 91, in main
    task_dict={taskname: tasks.get_task(taskname)()},
AttributeError: module 'lm_eval.tasks' has no attribute 'get_task'
```

```
Traceback (most recent call last):
...
  File "/data/steven_test/workspace/lm-evaluation-harness/lm_eval/utils.py", line 288, in _wrapper
    return fn(*args, **kwargs)
TypeError: simple_evaluate() got an unexpected keyword argument 'lm'
```

```
Traceback (most recent call last):
...
  File "/data/steven_test/workspace/lm-evaluation-harness/lm_eval/utils.py", line 288, in _wrapper
    return fn(*args, **kwargs)
TypeError: simple_evaluate() got an unexpected keyword argument 'task_dict'
```

```
  File "/data/steven_test/workspace/lm-evaluation-harness/scripts/cost_estimate.py", line 24, in loglikelihood
    for ctx, cont in requests:
TypeError: cannot unpack non-iterable Instance object
```
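For context on the last traceback: in the current harness API, requests arrive as `lm_eval.api.instance.Instance` objects, and the `(context, continuation)` pair lives in each instance's `args` tuple. A minimal sketch of that kind of fix follows; the surrounding dry-run LM attributes (`tokenizer`, `token_count`) are illustrative, not the PR's exact diff.

```python
def loglikelihood(self, requests):
    res = []
    for instance in requests:
        # was: `for ctx, cont in requests` -- an Instance is not iterable
        ctx, cont = instance.args
        # count the tokens we would have sent to the API (illustrative attributes)
        self.token_count += len(self.tokenizer.encode(ctx + cont))
        res.append((0.0, False))  # dummy (logprob, is_greedy) so evaluation can proceed
    return res
```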
@xksteven @haileyschoelkopf @lintangsutawika I wonder whether it may be better to consider this functionality as a token count estimator (rather than a token cost estimator); users could then work out their cost from the current pricing of the model they intend to use. Yes, this still runs up against point 1, but it may still be useful as an estimate.
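A minimal sketch of that idea, assuming `tiktoken` for tokenization; the encoding name and the pricing figure are placeholders the user would replace with the current values for their model:

```python
import tiktoken

def estimate_tokens(texts, encoding_name="cl100k_base"):
    """Count tokens across a batch of strings under the given encoding."""
    enc = tiktoken.get_encoding(encoding_name)
    return sum(len(enc.encode(t)) for t in texts)

tokens = estimate_tokens(["example prompt", "example continuation"])
price_per_million = 0.50  # placeholder: fill in the provider's current rate
print(f"{tokens} tokens ~= ${tokens / 1_000_000 * price_per_million:.6f}")
```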
@sepiatone I am not sure I have enough time to make a proper abstraction from this into what you're suggesting. I wanted the base code to work so I could then edit the model pricing myself and get a cost estimate. After doing that, I made the PR just to get the previous code working again. I'm happy to have someone else take over and convert it into a general cost estimator for running different models. Also, as a personal preference, I like small PRs as they're easier to review.
Perhaps I should've been clearer - I'm not suggesting anything specific for this PR beyond what has already been done. What is being done in this PR is required whatever further direction is taken. Perhaps it's better to take this discussion (having a token count estimator) to a separate issue.
@xksteven @sepiatone At the time we wrote this code, there were just the OpenAI ada/babbage/curie/davinci APIs and computing the costs was quite easy. Now this is a much bigger lift, and I think it would probably make sense to delete it, because anything that works would need to be very bespoke and could be broken by policy changes at companies. Maybe we can add a warning somewhere that says "this might be really expensive, enter y if you are sure you want to do it" whenever someone runs a sufficiently large eval task.
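A rough sketch of that proposed guardrail (the threshold, function name, and wording are made up here, not part of the PR):

```python
def confirm_large_eval(estimated_tokens: int, threshold: int = 10_000_000) -> bool:
    """Prompt for confirmation before launching a sufficiently large eval run."""
    if estimated_tokens <= threshold:
        return True
    answer = input(
        f"This run is estimated at {estimated_tokens:,} tokens and might be "
        "really expensive. Enter y if you are sure you want to do it: "
    )
    return answer.strip().lower() == "y"
```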
Needed to add this; otherwise wikitext evaluation fails and crashes the program.
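The added code itself isn't shown above; one plausible reading is that the dry-run LM also needed a `loglikelihood_rolling` handler, since wikitext is scored with rolling loglikelihood. A hedged sketch under that assumption (again with illustrative `tokenizer`/`token_count` attributes, not the PR's exact diff):

```python
def loglikelihood_rolling(self, requests):
    for instance in requests:
        (string,) = instance.args  # rolling requests carry a single string
        self.token_count += len(self.tokenizer.encode(string))
    return [0.0 for _ in requests]  # dummy loglikelihoods so evaluation can proceed
```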