
YOLO-UniOW: Efficient Universal Open-World Object Detection

The official implementation of YOLO-UniOW [arXiv]


Zero-shot Performance on LVIS Dataset

YOLO-UniOW-S/M/L have been pre-trained from scratch and evaluated on the LVIS minival split. The pre-trained weights can be downloaded from the links provided below.

Model          #Params   AP (mini)   AP (rare)   AP (common)   AP (freq)   FPS (V100)
YOLO-UniOW-S   7.5M      26.2        24.1        24.9          27.7        98.3
YOLO-UniOW-M   16.2M     31.8        26.0        30.5          34.0        86.2
YOLO-UniOW-L   29.4M     34.6        30.0        33.6          36.3        64.8

Experiment Setup

Data Preparation

For preparing open-vocabulary and open-world data, please refer to docs/data.

Installation

Our model is built with CUDA 11.8 and PyTorch 2.1.2. Refer to the official PyTorch documentation for guidance on setting up the environment, and see docs/installation for detailed instructions on installing mmcv.

conda create -n yolouniow python=3.9
conda activate yolouniow
pip install torch==2.1.2 torchvision==0.16.2 --index-url https://download.pytorch.org/whl/cu118
pip install mmcv==2.1.0 -f https://download.openmmlab.com/mmcv/dist/cu118/torch2.1/index.html
pip install -r requirements.txt
pip install -e .
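
After installation, a quick sanity check can confirm that the pinned versions were installed and that CUDA is visible to PyTorch. This is an optional verification step, not part of the official setup:

# Verify PyTorch/torchvision/mmcv versions and CUDA availability
python -c "import torch, torchvision, mmcv; print(torch.__version__, torchvision.__version__, mmcv.__version__, torch.cuda.is_available())"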

Training & Evaluation

For open-vocabulary model training and evaluation, please refer to run_ovod.sh.

# Train Open-Vocabulary Model
./tools/dist_train.sh configs/pretrain/yolo_uniow_s_lora_bn_5e-4_100e_8gpus_obj365v1_goldg_train_lvis_minival.py 8 --amp

# Evaluate Open-Vocabulary Model
./tools/dist_test.sh configs/pretrain/yolo_uniow_s_lora_bn_5e-4_100e_8gpus_obj365v1_goldg_train_lvis_minival.py \
    pretrained/yolo_uniow_s_lora_bn_5e-4_100e_8gpus_obj365v1_goldg_train_lvis_minival.pth 8
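
The evaluation command above expects the downloaded checkpoint under pretrained/. A minimal placement sketch, where /path/to/ stands for wherever the weights were downloaded:

# Create the directory the evaluation command reads from and move the downloaded S checkpoint into it
mkdir -p pretrained
mv /path/to/yolo_uniow_s_lora_bn_5e-4_100e_8gpus_obj365v1_goldg_train_lvis_minival.pth pretrained/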

For open-world model training and evaluation, please follow the steps provided in run_owod.sh. Ensure that the model is trained before proceeding with the evaluation.

# 1. Extract text/wildcard features from pretrained model
python tools/owod_scripts/extract_text_feats.py --config $CONFIG --ckpt $CHECKPOINT --save_path $EMBEDS_PATH

# 2. Fine-tune wildcard features
./tools/dist_train.sh $OBJ_CONFIG 8 --amp

# 3. Extract fine-tuned wildcard features
python tools/owod_scripts/extract_text_feats.py --config $OBJ_CONFIG --save_path $EMBEDS_PATH --extract_tuned

# 4. Train all owod tasks
python tools/owod_scripts/train_owod_tasks.py MOWODB $OW_CONFIG $CHECKPOINT

# 5. Evaluate all owod tasks
python tools/owod_scripts/test_owod_tasks.py MOWODB $OW_CONFIG --save
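
The steps above rely on several shell variables. A sketch of how they might be set for the S model is shown below; $CONFIG and $CHECKPOINT follow the open-vocabulary commands above, while $EMBEDS_PATH, $OBJ_CONFIG, and $OW_CONFIG are hypothetical placeholders whose actual values are defined in run_owod.sh:

# Hypothetical variable setup for the S model; see run_owod.sh for the actual values
CONFIG=configs/pretrain/yolo_uniow_s_lora_bn_5e-4_100e_8gpus_obj365v1_goldg_train_lvis_minival.py
CHECKPOINT=pretrained/yolo_uniow_s_lora_bn_5e-4_100e_8gpus_obj365v1_goldg_train_lvis_minival.pth
EMBEDS_PATH=embeddings/owod_text_feats.pth   # output path for the extracted features (hypothetical)
OBJ_CONFIG=...                               # wildcard fine-tuning config; defined in run_owod.sh
OW_CONFIG=...                                # open-world training config; defined in run_owod.sh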

To train and evaluate on specific datasets and tasks, use the commands below:

# Train owod task
DATASET=$DATASET TASK=$TASK THRESHOLD=$THRESHOLD SAVE=$SAVE \
./tools/dist_train_owod.sh $CONFIG 8 --amp

# Evaluate owod task
DATASET=$DATASET TASK=$TASK THRESHOLD=$THRESHOLD SAVE=$SAVE \
./tools/dist_test.sh $CONFIG $CHECKPOINT 8
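
For instance, training and evaluating the first task of the MOWODB split might look like the following; the task name, threshold, and save directory are illustrative assumptions, so check run_owod.sh for the values actually used:

# Hypothetical example: first task of the MOWODB split
DATASET=MOWODB TASK=t1 THRESHOLD=0.05 SAVE=work_dirs/mowodb_t1 \
./tools/dist_train_owod.sh $CONFIG 8 --amp

DATASET=MOWODB TASK=t1 THRESHOLD=0.05 SAVE=work_dirs/mowodb_t1 \
./tools/dist_test.sh $CONFIG $CHECKPOINT 8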

Acknowledgement

This project builds upon YOLO-World, YOLOv10, FOMO, and OVOW. We sincerely thank the authors for their excellent implementations!

Citation

If our code or models help your work, please cite our paper and YOLOv10:

@article{liu2024yolouniow,
  title={YOLO-UniOW: Efficient Universal Open-World Object Detection},
  author={Lihao Liu and Juexiao Feng and Hui Chen and Ao Wang and Lin Song and Jungong Han and Guiguang Ding},
  journal={arXiv preprint arXiv:2412.20645},
  year={2024}
}

@article{wang2024yolov10,
  title={YOLOv10: Real-Time End-to-End Object Detection},
  author={Wang, Ao and Chen, Hui and Liu, Lihao and Chen, Kai and Lin, Zijia and Han, Jungong and Ding, Guiguang},
  journal={arXiv preprint arXiv:2405.14458},
  year={2024}
}
