Full per-sample metadata for the 400m and CC2.5B training sets #10

Open
vishaal27 opened this issue Oct 13, 2023 · 1 comment

Comments

@vishaal27

Hi, thanks for your great work and for releasing both the metadata entries and the trained CLIP model weights. Would it be possible to release the per-sample metadata (URL, text caption, etc.) for both datasets you released models for (400m and CC2.5B), similar to how the laion-2b-en and datacomp1b splits are released?
If this is in the pipeline, please let me know; if it has already been released, please point me to it.
Thanks!

@howardhsu
Contributor

Thanks for your interest. We are working on that.
