GitHub - XLearning-SCU/2021-NeurIPS-NCR

PyTorch implementation for Learning with Noisy Correspondence for Cross-modal Matching (NeurIPS 2021 Oral).

Update

2022-10-17: We provide the image URLs of CC152K, taken from Conceptual Captions (CC), which may be helpful to your research.


|-- cc152k
|   |-- dev_caps_img_urls.csv
|   |-- test_caps_img_urls.csv
|   `-- train_caps_img_urls.csv

Use img2dataset to download images from the csv files. More details
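As a rough illustration of the download step, the sketch below inspects a CSV header and assembles an img2dataset command line. The column names of the provided CSV files are not stated here, so `url_col` and `caption_col` are left as arguments you must fill in after checking the header; the helper names and the output folder are hypothetical, and the flags should be verified against the img2dataset documentation for your installed version.

```python
import csv
import shlex


def read_csv_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f))


def build_img2dataset_command(csv_path, url_col, caption_col, output_folder):
    """Assemble an img2dataset invocation as an argument list.

    url_col and caption_col are NOT known from this README; read them
    from the CSV header first (see read_csv_header).
    """
    return [
        "img2dataset",
        "--url_list", csv_path,
        "--input_format", "csv",
        "--url_col", url_col,
        "--caption_col", caption_col,
        "--output_format", "files",
        "--output_folder", output_folder,
    ]


def to_shell(cmd):
    """Render an argument list as a shell-safe string (Python 3.7 compatible)."""
    return " ".join(shlex.quote(part) for part in cmd)
```

A typical flow would be to call `read_csv_header("cc152k/train_caps_img_urls.csv")`, pick the URL and caption columns from the result, and then print `to_shell(build_img2dataset_command(...))` to get a command you can review before running.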

Introduction

NCR framework

Requirements

Python 3.7
PyTorch ~1.7.1
numpy
scikit-learn
Punkt Sentence Tokenizer:

import nltk
nltk.download('punkt')  # non-interactive equivalent of running nltk.download() and entering "d punkt"

Datasets

MS-COCO and Flickr30K

We follow SCAN to obtain image features and vocabularies.

CC152K

We use a subset of Conceptual Captions (CC), named CC152K. CC152K contains 150,000 training samples from the CC training split, plus 1,000 validation samples and 1,000 testing samples from the CC validation split. We follow the pre-processing steps in SCAN to obtain the image features and vocabularies.

Download Dataset

Training and Evaluation

Training new models from scratch

Modify the data_path and vocab_path, then train and evaluate the model(s):


# CC152K
python ./NCR/run.py --gpu 0 --workers 2 --warmup_epoch 10 --data_name cc152k_precomp --data_path data_path --vocab_path vocab_path

# MS-COCO: noise_ratio = {0, 0.2, 0.5}
python ./NCR/run.py --gpu 0 --workers 2 --warmup_epoch 10 --data_name coco_precomp --num_epochs 20 --lr_update 10 --noise_ratio 0.2 --data_path data_path --vocab_path vocab_path

# Flickr30K: noise_ratio = {0, 0.2, 0.5}
python ./NCR/run.py --gpu 0 --workers 2 --warmup_epoch 5 --data_name f30k_precomp --noise_ratio 0.2 --data_path data_path --vocab_path vocab_path

Each command trains the model from scratch and then evaluates the best model.
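To run all three noise ratios listed in the comments above (0, 0.2, 0.5) for one dataset, the commands can be generated rather than typed by hand. The following is a minimal sketch that only uses the flags shown in the commands above; the function names are hypothetical, and `data_path` and `vocab_path` remain placeholders for your own locations.

```python
import shlex

NOISE_RATIOS = (0, 0.2, 0.5)  # the settings listed for MS-COCO and Flickr30K


def build_train_command(data_name, noise_ratio, data_path, vocab_path,
                        warmup_epoch, extra_args=()):
    """Build one run.py invocation as an argument list."""
    cmd = ["python", "./NCR/run.py", "--gpu", "0", "--workers", "2",
           "--warmup_epoch", str(warmup_epoch), "--data_name", data_name]
    cmd += list(extra_args)  # e.g. the extra MS-COCO schedule flags
    cmd += ["--noise_ratio", str(noise_ratio),
            "--data_path", data_path, "--vocab_path", vocab_path]
    return cmd


def sweep_commands(data_name, data_path, vocab_path, warmup_epoch,
                   extra_args=()):
    """Return one shell-safe command string per noise ratio."""
    return [
        " ".join(shlex.quote(part) for part in build_train_command(
            data_name, ratio, data_path, vocab_path, warmup_epoch, extra_args))
        for ratio in NOISE_RATIOS
    ]


if __name__ == "__main__":
    # MS-COCO uses the additional schedule flags from the command above.
    for line in sweep_commands("coco_precomp", "data_path", "vocab_path", 10,
                               extra_args=("--num_epochs", "20",
                                           "--lr_update", "10")):
        print(line)
```

Printing the commands first, instead of launching them directly, makes it easy to check paths and flags before committing GPU time.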

Pre-trained models and evaluation

The pre-trained models are available here:

CC152K model Download
MS-COCO 0% noise model Download
MS-COCO 20% noise model Download
MS-COCO 50% noise model Download
F30K 0% noise model Download
F30K 20% noise model Download
F30K 50% noise model Download

Set model_path, data_path, and vocab_path in evaluation.py, then run it:

python ./NCR/evaluation.py

For MS-COCO, set split to testall, and set fold5 to false for the 5K evaluation or to true for the five-fold 1K evaluation.
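The difference between the two MS-COCO protocols can be illustrated with a small sketch. It assumes that five-fold 1K evaluation means scoring each consecutive block of 1,000 test items separately and averaging the results, while 5K evaluation scores the whole test set at once. This is not the code in evaluation.py: the function names are hypothetical, and for brevity each query is paired with exactly one ground-truth candidate, whereas MS-COCO and Flickr30K provide five captions per image.

```python
def recall_at_k(sims, k):
    """Recall@K in percent for a square similarity matrix.

    sims[i][j] is the similarity between query i and candidate j, and
    candidate i is assumed to be the only correct match for query i.
    """
    hits = 0
    for i, row in enumerate(sims):
        # Rank of the true match = number of candidates scored strictly higher.
        rank = sum(1 for score in row if score > row[i])
        hits += rank < k
    return 100.0 * hits / len(sims)


def fold_averaged_recall(sims, k, fold_size=1000):
    """Average Recall@K over consecutive diagonal blocks of fold_size items."""
    scores = []
    for start in range(0, len(sims), fold_size):
        stop = start + fold_size
        block = [row[start:stop] for row in sims[start:stop]]
        scores.append(recall_at_k(block, k))
    return sum(scores) / len(scores)
```

Under this reading, `recall_at_k(sims, k)` on the full 5,000-item matrix corresponds to the fold5 = false setting and `fold_averaged_recall(sims, k)` to fold5 = true. The fold-averaged numbers are typically higher because each query competes against fewer candidates.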

Experiment Results:

Citation

If NCR is useful to your research, please cite the following paper:

@article{huang2021learning,
  title={Learning with Noisy Correspondence for Cross-modal Matching},
  author={Huang, Zhenyu and Niu, Guocheng and Liu, Xiao and Ding, Wenbiao and Xiao, Xinyan and Wu, Hua and Peng, Xi},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}

License

Apache License 2.0

Acknowledgements

The code is based on SGRAF and SCAN, licensed under Apache 2.0.
