Thanks for your great work.
I tried to reproduce the same case in the tabular math word problems; the picture name in the test set is 28034.png. I set temperature=0,
but we got a different result.
It seems like your input text differs slightly from the original input text, which may influence the output. Could you give it another try with the following input text:
{
  "prompt": "Based on this table about 'Children's weights (lbs)', solve the following problem. In the end, output your final answer using the JSON format: {\"answer\": \"<YOUR ANSWER>\"}.As part of a statistics project, a math class weighed all the children who were willing to participate. How many children weighed exactly 31 pounds? (Unit: children)"
}
When we manually drew the PPT for the case study in the paper, some special tokens like spaces and '\n' in the input text were omitted to save space. Thus the rendered input text is slightly different from the original one. You can access the whole set of test samples, MMTab-eval_test_data_49K_llava_jsonl_format.jsonl, from HuggingFace.
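To avoid the whitespace discrepancy entirely, one can read the prompt directly from the JSONL file instead of retyping it from the paper. The sketch below is a minimal, hedged example: the field names (`image`, and the record layout generally) are assumptions about the LLaVA-style JSONL schema and may need adjusting to the actual file.

```python
import json

def find_sample(jsonl_path, image_name):
    """Scan a JSONL file line by line and return the first record whose
    'image' field matches image_name (key name is an assumption about
    the LLaVA-format schema; adjust if the actual file differs)."""
    with open(jsonl_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            if record.get("image") == image_name:
                return record
    return None
```

Using the record returned for `28034.png` as the prompt guarantees the exact spaces and '\n' characters of the original input are preserved.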
But I think your discovery confirms an interesting and important defect of MLLMs, i.e., they are sometimes vulnerable to adversarial perturbations as tiny as a small disturbance in the input text or image.