sjiang95/semclDataset

SemCLDataset

This program converts the VOC2012, Cityscapes, ADE20K, and COCO datasets into anchor and non-anchor pairs for SemCL pretraining.

[Figure: an example pair. Columns: anchor (plane), non-anchor (non-plane).]

Prerequisites

Prepare datasets

Download whichever of VOC2012, Cityscapes, ADE20K, or COCO you need. For VOC2012, also download SegmentationClassAug from the Semantic Boundaries Dataset and Benchmark (SBD); see items 2 and 3 of the linked instructions.

For VOC2012, download and extract the dataset to any location. Its directory structure should match the listing below.

$ tree /path/to/VOCdevkit/VOC2012 -d
├── Annotations
├── ImageSets
│   ├── Action
│   ├── Layout
│   ├── Main
│   ├── Segmentation
│   └── SegmentationAug # SBD
│       ├── test.txt
│       ├── train_aug.txt
│       ├── train.txt
│       ├── trainval_aug.txt
│       ├── trainval.txt
│       └── val.txt
├── JPEGImages
├── SegmentationClass
├── SegmentationClassAug # SBD
└── SegmentationObject

For Cityscapes, download and extract leftImg8bit_trainvaltest.zip (raw images) and gtFine_trainvaltest.zip (labels).

$ tree /path/to/cityscapes -d -L 2
├── gtFine
│   ├── test
│   ├── train
│   └── val
└── leftImg8bit
    ├── test
    ├── train
    └── val

For COCO, download the training set train2017.zip and the corresponding segmentation labels stuffthingmaps_trainval2017.zip, and extract both into the same folder. The latter is a set of grayscale segmentation annotations provided by cocostuff.

$ tree /path/to/coco -d
├── stuffthingmaps_trainval2017
│   ├── train2017
│   └── val2017
└── train2017

4 directories
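To confirm that the two COCO archives were extracted side by side, the images can be matched against their label maps. The following is a minimal sketch, not part of this repository. It assumes images are stored as `<stem>.jpg` under `train2017` and labels as `<stem>.png` under `stuffthingmaps_trainval2017/train2017`; the function name `pair_coco_images_with_labels` is hypothetical.

```python
from pathlib import Path


def pair_coco_images_with_labels(coco_root, split="train2017"):
    """Return (image, label) path pairs that share a file stem.

    Assumption: images live in <root>/<split>/<stem>.jpg and labels in
    <root>/stuffthingmaps_trainval2017/<split>/<stem>.png.
    """
    root = Path(coco_root)
    image_dir = root / split
    label_dir = root / "stuffthingmaps_trainval2017" / split
    pairs = []
    for image_path in sorted(image_dir.glob("*.jpg")):
        label_path = label_dir / (image_path.stem + ".png")
        # Keep only images that actually have a label map next to them.
        if label_path.is_file():
            pairs.append((image_path, label_path))
    return pairs
```

If the returned list is much shorter than the number of images, one of the archives was probably extracted to a different folder.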

For ADE20K, download the full dataset (you may need an account) and extract it to your desired path. Note that ADE20K_2021_17_01 appears to be a date-based folder name, so if the ADE20K maintainers release an updated dataset, change the corresponding relative path in dataset_conv.cpp.

$ tree /path/to/ADE20K_2021_17_01 -d -L 3
└── images
    └── ADE
        ├── training
        └── validation
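Before building, it can help to check that each extracted dataset matches the listings above. The sketch below is an illustration only: the names `EXPECTED_DIRS` and `missing_dirs` are hypothetical, and the check covers just the directories shown in the listings, without knowing which of them dataset_conv actually reads.

```python
from pathlib import Path

# Subdirectories copied from the tree listings, relative to each dataset root.
# For "voc12" the root is the VOC2012 folder itself, not VOCdevkit.
EXPECTED_DIRS = {
    "voc12": [
        "Annotations",
        "ImageSets/SegmentationAug",
        "JPEGImages",
        "SegmentationClass",
        "SegmentationClassAug",
        "SegmentationObject",
    ],
    "city": [
        "gtFine/test", "gtFine/train", "gtFine/val",
        "leftImg8bit/test", "leftImg8bit/train", "leftImg8bit/val",
    ],
    "coco": [
        "train2017",
        "stuffthingmaps_trainval2017/train2017",
        "stuffthingmaps_trainval2017/val2017",
    ],
    "ade": ["images/ADE/training", "images/ADE/validation"],
}


def missing_dirs(dataset, root):
    """Return the expected subdirectories that are absent under root."""
    base = Path(root)
    return [rel for rel in EXPECTED_DIRS[dataset] if not (base / rel).is_dir()]
```

For example, `missing_dirs("city", "/path/to/cityscapes")` returns an empty list when the Cityscapes layout is complete.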

Build

cd semclDataset
mkdir build;cd build
cmake -GNinja ..
ninja

Although we use ninja here, any other available generator works. For example, if you prefer make:

cmake -G "Unix Makefiles" ..
make -j$(nproc)

This generates an executable named dataset_conv.

Usage

Pass the path of each dataset you want to convert: VOC2012, Cityscapes, ADE20K, or COCO.

/path/to/dataset_conv --voc12 [path/to/VOCdevkit contains `VOC2012`] --aug --coco [/path/to/coco] --ade [/path/to/ADE20K_2021_17_01] --city [/path/to/cityscapes contains `gtFine` and `leftImg8bit`] --output_dir [desired output directory (default to current dir)]
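When the conversion is driven from a script, the argument list can be assembled from whichever datasets are available. This is a minimal sketch using only the flags shown in the command above. It assumes that flags for datasets you do not have can simply be omitted and that `--aug` is a switch without a value; the helper name `build_dataset_conv_args` is hypothetical.

```python
def build_dataset_conv_args(executable, voc12=None, aug=False, coco=None,
                            ade=None, city=None, output_dir=None):
    """Build the argv list for dataset_conv from the paths that are provided."""
    args = [str(executable)]
    if voc12 is not None:
        args += ["--voc12", str(voc12)]
    if aug:
        # Assumption: --aug takes no value, as in the command line above.
        args.append("--aug")
    # Remaining flags follow the order used in the README command.
    for flag, value in (("--coco", coco), ("--ade", ade),
                        ("--city", city), ("--output_dir", output_dir)):
        if value is not None:
            args += [flag, str(value)]
    return args
```

The resulting list can be passed to `subprocess.run(args, check=True)` to start the conversion.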

Outputs are written to a ContrastivePairs directory under the path given by --output_dir.

$ tree /path/to/ContrastivePairs -L 1
├── ade20k
├── ADE_ImgList.txt
├── cityscapes
├── Cityscapes_ImgList.txt
├── coco
├── COCO_ImgList.txt
├── voc
└── VOC_ImgList.txt

4 directories, 4 files

The *_ImgList.txt files are read by the dataloader in the training program.
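A quick sanity check after conversion is to count the entries in each list file. The sketch below is not the training program's dataloader. The line format of the list files is not described here, so it only assumes that each non-empty line is one entry; the names `LIST_FILES` and `count_list_entries` are hypothetical.

```python
from pathlib import Path

# File names taken from the ContrastivePairs listing above.
LIST_FILES = [
    "ADE_ImgList.txt",
    "Cityscapes_ImgList.txt",
    "COCO_ImgList.txt",
    "VOC_ImgList.txt",
]


def count_list_entries(pairs_dir):
    """Map each *_ImgList.txt found in pairs_dir to its number of non-empty lines."""
    counts = {}
    for name in LIST_FILES:
        path = Path(pairs_dir) / name
        if path.is_file():
            with path.open() as handle:
                # Assumption: one entry per non-empty line.
                counts[name] = sum(1 for line in handle if line.strip())
    return counts
```

List files for datasets that were not converted are simply absent from the result.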

Citation

@InProceedings{quan2023semantic,
    author    = {Quan, Shengjiang and Hirano, Masahiro and Yamakawa, Yuji},
    title     = {Semantic Information in Contrastive Learning},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {5686-5696}
}
