1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation
Mingqi Gao1,4,+, Jingnan Luo2,+, Jinyu Yang1,*, Jungong Han3,4, Feng Zheng1,2,*
1 Tapall.ai 2 Southern University of Science and Technology 3 University of Sheffield 4 University of Warwick
+ Equal Contributions, * Corresponding Authors
📃 Technical Report 🔖 Awesome Work List in Video Object Segmentation
We tested the code in the following environment (other versions may also be compatible): Python 3.9, PyTorch 1.10.1, CUDA 11.3.
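For example, a fresh environment can be created as follows (a minimal sketch assuming conda and the official PyTorch wheel index for CUDA 11.3):
conda create -n mevis python=3.9 -y
conda activate mevis
pip install torch==1.10.1+cu113 torchvision==0.11.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html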
pip install -r requirements.txt
pip install 'git+https://github.com/facebookresearch/fvcore'
pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
# Build the CUDA ops for multi-scale deformable attention (requires a working nvcc).
cd models/ops
python setup.py build install
cd ../..
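To check that the CUDA ops compiled correctly, you can try importing them (a sketch assuming the ops follow Deformable DETR's layout, where the built extension is named MultiScaleDeformableAttention):
python -c "import MultiScaleDeformableAttention; print('ops OK')"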
- Download MUTR's checkpoint from HERE (Swin-L, jointly trained on the Ref-COCO series and Ref-YouTube-VOS).
- Run the following command to fine-tune MUTR on MeViS:
# Set --nproc_per_node to the number of GPUs used for training.
python -m torch.distributed.launch \
--nproc_per_node 1 \
--master_port 10010 \
--use_env train.py \
--with_box_refine \
--binary \
--dataset_file mevis \
--epochs 2 \
--lr_drop 1 \
--resume [MUTR checkpoint] \
--output_dir [output path] \
--mevis_path [MeViS path] \
--backbone swin_l_p4w7
Please note that different numbers of GPUs lead to different scores (as discussed HERE).
Our checkpoint is available on Google Drive.
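Both the fine-tuning and inference commands expect [MeViS path] to point to the dataset root. A sketch of the expected layout, assuming the directory structure released with the official MeViS dataset:
mevis
├── train
│   ├── JPEGImages
│   ├── mask_dict.json
│   └── meta_expressions.json
├── valid_u
│   ├── JPEGImages
│   ├── mask_dict.json
│   └── meta_expressions.json
└── valid
    ├── JPEGImages
    └── meta_expressions.json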
python inference_mevis.py \
--with_box_refine \
--binary \
--output_dir [output path] \
--resume [checkpoint path] \
--ngpu 1 \
--batch_size 1 \
--backbone swin_l_p4w7 \
--mevis_path [MeViS path] \
--split valid \
--sub_video_len 30
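Here --sub_video_len presumably caps the number of frames per sub-video when long videos are split for inference. A hypothetical Python sketch of that splitting logic (the function and model interface are illustrative assumptions, not the repo's actual code):
# Hypothetical sketch: process a long video in clips of at most
# `sub_video_len` frames and concatenate the per-clip mask predictions.
def segment_long_video(frames, model, expression, sub_video_len=30):
    masks = []
    for start in range(0, len(frames), sub_video_len):
        clip = frames[start:start + sub_video_len]
        masks.extend(model(clip, expression))  # one mask per frame in the clip
    return masks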
If you find our solution useful for your research, please consider citing it with the following BibTeX:
@misc{gao20241st,
title={1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation},
author={Mingqi Gao and Jingnan Luo and Jinyu Yang and Jungong Han and Feng Zheng},
year={2024},
eprint={2406.07043},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Our solution is built upon MUTR and MeViS. Many thanks to their authors for their efforts.