Inconsistent image placeholder and wrong images #5
Comments
Thanks for pointing out this issue! We already ran a post-processing step to detect possibly inconsistent placeholders. I will double-check the data to see whether we uploaded the wrong version of the multimodal questions. Thanks again.
@IsakZhang are there any updates? I found multiple inconsistencies:
Also, in one sample for Portuguese ( How did you handle this?
Update: I investigated further and found more issues with the Portuguese data, where images are missing from the samples. In most cases those images contained a space between the
Dear authors,
Thanks for open-sourcing this! It seems that the correct image placeholder would be `(image)[image-x.png]` or `(image)[image-x.jpg]`. However, I find a lot of inconsistent image placeholders in the multimodal split, such as `(image)[image-1]`, `[image-3.png]`, `(image)[image-1.png]`, `(image)[image-4.png.]`, `(image)[image2.jpg]` and `(image)[image-30. jpg]`... Moreover, in most of these cases, the image is wrong or missing from the downloaded dataset.
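For anyone who wants to reproduce the check, here is a minimal sketch of how the malformed placeholders could be flagged. It assumes the canonical form is exactly `(image)[image-<number>.png]` or `(image)[image-<number>.jpg]`, as guessed above. The function name `find_malformed_placeholders` and both regular expressions are hypothetical and not taken from the dataset's own preprocessing script.

```python
import re

# Assumed canonical form (from the issue text): (image)[image-<n>.png] or .jpg
CANONICAL = re.compile(r"\(image\)\[image-\d+\.(?:png|jpg)\]")

# Loose pattern: anything that looks like an image reference, well-formed or not.
LOOSE = re.compile(r"(?:\(image\))?\[image[^\]]*\]")


def find_malformed_placeholders(text):
    """Return placeholder-like spans that do not match the assumed canonical form."""
    return [
        m.group(0)
        for m in LOOSE.finditer(text)
        if not CANONICAL.fullmatch(m.group(0))
    ]


if __name__ == "__main__":
    sample = (
        "A (image)[image-1] B [image-3.png] C (image)[image-2.png] "
        "D (image)[image-4.png.] E (image)[image2.jpg] F (image)[image-30. jpg]"
    )
    for span in find_malformed_placeholders(sample):
        print(span)
```

Running this over every question string in the multimodal split would give a list of samples to inspect by hand. It only checks the placeholder syntax, not whether the referenced image file exists or is the right one.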
Is there something wrong with the preprocessing script, or am I misunderstanding something?
Thanks again if you can help me!