This repo contains the code for our paper SeMask: Semantically Masked Transformers for Semantic Segmentation. It is based on Mask2Former and FaPN-full.
- † denotes backbones pretrained on ImageNet-22k with 384x384 resolution images.
- Pre-trained models can be downloaded following the instructions given under tools.
Method | Backbone | Crop Size | mIoU | mIoU (ms+flip) | #Params | Config | Checkpoint |
---|---|---|---|---|---|---|---|
SeMask-L Mask2Former FaPN | SeMask Swin-L† | 640x640 | 56.88 | 58.25 | 227M | config | checkpoint |
SeMask-L Mask2Former MSFaPN | SeMask Swin-L† | 640x640 | 57.00 | 58.25 | 224M | config | checkpoint |
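Once a checkpoint is downloaded, a quick sanity check is to inspect its state dict with plain PyTorch. A minimal sketch, where the file name is a placeholder for whichever checkpoint you fetched, and the `"model"` key reflects the detectron2-style checkpoint layout used by Mask2Former:

```python
# Hypothetical sketch: sanity-check a downloaded checkpoint with plain PyTorch.
import torch

ckpt = torch.load("semask_checkpoint.pth", map_location="cpu")  # placeholder file name
state = ckpt.get("model", ckpt)  # detectron2-style checkpoints nest weights under "model"
n_params = sum(t.numel() for t in state.values() if torch.is_tensor(t))
print(f"{len(state)} tensors, ~{n_params / 1e6:.0f}M parameters")
```

The printed parameter count should roughly match the #Params column above.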
- Build the DCNv2 module, which is compatible with PyTorch v1.7.1 (see the environment-check sketch after this list).
- Follow the installation instructions for Mask2Former.
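Before compiling DCNv2, it can help to confirm that the environment matches what the module targets. A minimal sketch, assuming a CUDA-enabled PyTorch install:

```python
# Minimal environment check before building DCNv2 (assumes a CUDA build of PyTorch).
import torch

# The DCNv2 module in this repo targets PyTorch v1.7.1.
assert torch.__version__.startswith("1.7"), (
    f"expected PyTorch v1.7.x for DCNv2, found {torch.__version__}"
)
assert torch.cuda.is_available(), "DCNv2 compiles against CUDA; no GPU/CUDA found"
print("torch", torch.__version__,
      "| CUDA", torch.version.cuda,
      "| cuDNN", torch.backends.cudnn.version())
```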
See Preparing Datasets for Mask2Former.
See Getting Started with Mask2Former.
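After installation, a quick qualitative test is single-image inference through detectron2's `DefaultPredictor`. The sketch below follows the usual Mask2Former demo setup; the config path, checkpoint path, and image file are placeholders, and `add_maskformer2_config` is the config hook from the Mask2Former codebase this repo builds on:

```python
# Hedged sketch: single-image semantic segmentation via detectron2's DefaultPredictor.
# Paths below are placeholders; point them at a real config, checkpoint, and image.
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.projects.deeplab import add_deeplab_config
from mask2former import add_maskformer2_config  # config hook from the Mask2Former codebase

cfg = get_cfg()
add_deeplab_config(cfg)    # Mask2Former configs extend the DeepLab config schema
add_maskformer2_config(cfg)
cfg.merge_from_file("path/to/semask_mask2former_config.yaml")  # placeholder
cfg.MODEL.WEIGHTS = "path/to/checkpoint.pth"                   # placeholder
predictor = DefaultPredictor(cfg)

image = cv2.imread("demo.jpg")              # BGR image, as detectron2 expects
outputs = predictor(image)
sem_seg = outputs["sem_seg"].argmax(dim=0)  # (H, W) tensor of per-pixel class ids
print(sem_seg.shape)
```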
If you find SeMask useful in your work, please cite:

```BibTeX
@article{jain2021semask,
  title={SeMask: Semantically Masking Transformer Backbones for Effective Semantic Segmentation},
  author={Jitesh Jain and Anukriti Singh and Nikita Orlov and Zilong Huang and Jiachen Li and Steven Walton and Humphrey Shi},
  journal={arXiv},
  year={2021}
}
```