Adding new task: Boxes #1557
Thanks for the PR! I've left some comments. I see that the dataset is still local here, which needs to be addressed.
Would you be able to try to replicate a number from the paper and post the result here?
dataset_path: json
dataset_name: null
dataset_kwargs:
  data_files: {'test': 'test-subsample-states-t5.jsonl'}
This points to a local file?
Yes, this points to a file inside the dataset directory stored locally. I downloaded the dataset from Git; it is not on Hugging Face. How should I handle this without uploading the data to Hugging Face?
I can try to replicate only the Flan-T5-XL results (since GPT-3, GPT-3.5, and GPT-4 are not an option). Please let me know if I understood this correctly :) Thank you in advance.
Flan-T5 results would be great!
Could you open an issue on the authors' GitHub repo and ask them if they would be alright with uploading the dataset to Hugging Face as a gated repo?
Hi Hailey, apologies for the delay.
I asked Dr. Sebastian Schuster and created this gated repo: Boxes task.
Further, I am attaching results for Flan-T5 base and XL, which are almost the same as the results presented in the paper.
Here I tried to produce results broken down by the number of operations affecting the box state, just as reported in the paper. The task does not necessarily need this breakdown in general, so the implementation is kept the same.
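For anyone who wants to reproduce that breakdown, a minimal sketch of grouping accuracy by the number of operations affecting a box might look like the following. The field names `num_ops` and `correct` are hypothetical, and the records are toy values for illustration only, not results from this PR or the paper.

```python
from collections import defaultdict

def accuracy_by_num_ops(records):
    """Return {num_ops: accuracy} for a list of per-example records.

    Each record is assumed (hypothetically) to carry:
      - "num_ops": how many operations affected the queried box
      - "correct": whether the model's predicted box contents matched
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for record in records:
        n = record["num_ops"]
        totals[n] += 1
        hits[n] += int(record["correct"])
    # Sort the buckets so the output reads 0, 1, 2, ... operations.
    return {n: hits[n] / totals[n] for n in sorted(totals)}

# Toy records, made up purely to exercise the function.
toy_records = [
    {"num_ops": 0, "correct": True},
    {"num_ops": 0, "correct": True},
    {"num_ops": 1, "correct": True},
    {"num_ops": 1, "correct": False},
    {"num_ops": 2, "correct": False},
]
breakdown = accuracy_by_num_ops(toy_records)
print(breakdown)
```

Keeping this as a post-processing step over per-example outputs leaves the task definition itself unchanged.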
A task probing to what extent a language model can infer the final state of an entity given an English description of the initial state and a series of state-changing operations is presented.
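To make the task concrete, here is a small sketch of the state tracking a model is implicitly asked to perform: start from an initial assignment of objects to boxes, apply a sequence of state-changing operations, and read off the final contents. The operation encoding (`put`, `remove`, `move` tuples) and the helper name `apply_operations` are assumptions made for illustration; the actual dataset describes states and operations in English text.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding, not the dataset's format:
#   ("put", obj, box)
#   ("remove", obj, box)
#   ("move", obj, src_box, dst_box)
Operation = Tuple[object, ...]

def apply_operations(
    initial: Dict[int, Set[str]], ops: List[Operation]
) -> Dict[int, Set[str]]:
    """Return the final box contents after applying ops to a copy of initial."""
    # Copy each set so the caller's initial state is left untouched.
    state = {box: set(contents) for box, contents in initial.items()}
    for op in ops:
        kind = op[0]
        if kind == "put":
            _, obj, box = op
            state[box].add(obj)
        elif kind == "remove":
            _, obj, box = op
            state[box].discard(obj)
        elif kind == "move":
            _, obj, src, dst = op
            state[src].discard(obj)
            state[dst].add(obj)
        else:
            raise ValueError(f"unknown operation: {kind!r}")
    return state

initial_state = {0: {"apple"}, 1: set(), 2: {"key", "book"}}
operations = [
    ("move", "apple", 0, 1),
    ("remove", "key", 2),
    ("put", "coin", 0),
]
final_state = apply_operations(initial_state, operations)
print(final_state)
```

An evaluation would then compare a model's stated contents for a box against the corresponding entry of `final_state`.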