@misc{CV2018,
author = {Donny You},
howpublished = {\url{https://github.com/donnyyou/PyTorchCV}},
year = {2018}
}
This repository provides source code for several deep-learning-based computer vision problems. We do our best to keep it up to date. If you find a problem with this repository, please raise an issue and we will fix it immediately.
- Image Classification
  - VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition
  - ResNet: Deep Residual Learning for Image Recognition
  - DenseNet: Densely Connected Convolutional Networks
  - ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
  - ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
- Semantic Segmentation
  - DeepLabV3: Rethinking Atrous Convolution for Semantic Image Segmentation
  - PSPNet: Pyramid Scene Parsing Network
  - DenseASPP: DenseASPP for Semantic Segmentation in Street Scenes
- Object Detection
  - SSD: Single Shot MultiBox Detector
  - Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
  - YOLOv3: An Incremental Improvement
  - FPN: Feature Pyramid Networks for Object Detection
- Pose Estimation
  - CPM: Convolutional Pose Machines
  - OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
- Instance Segmentation
  - Mask R-CNN
- ResNet: Deep Residual Learning for Image Recognition
- PSPNet: Pyramid Scene Parsing Network
| Model | Backbone | Training data | Testing data | mIoU (%) | Pixel Acc (%) | Setting |
|---|---|---|---|---|---|---|
| PSPNet Origin | 3x3-ResNet101 | ADE20K train | ADE20K val | 41.96 | 80.64 | - |
| PSPNet Ours | 7x7-ResNet101 | ADE20K train | ADE20K val | 44.18 | 80.91 | PSPNet |
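
For readers unfamiliar with the two segmentation metrics in the table, here is a minimal sketch of how mean IoU and pixel accuracy are commonly derived from a confusion matrix. It is an illustration only and not necessarily the evaluation code used in this repository; the function names and the `ignore_index=255` convention are assumptions.

```python
def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count label pairs; rows are ground-truth classes, columns are predictions.

    ignore_index=255 is an assumed convention for unlabeled pixels.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip pixels without a valid label
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of labeled pixels whose predicted class is correct."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Average of per-class intersection-over-union, skipping absent classes."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        gt = sum(cm[c])                             # pixels labeled c
        predicted = sum(cm[r][c] for r in range(n))  # pixels predicted as c
        union = gt + predicted - tp
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example with flattened label maps (the last pixel is ignored).
target = [0, 0, 1, 1, 255]
pred = [0, 1, 1, 1, 0]
cm = confusion_matrix(pred, target, num_classes=2)
print(pixel_accuracy(cm), mean_iou(cm))
```

The table reports both values as percentages, so the fractions returned here would be multiplied by 100.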
- SSD: Single Shot MultiBox Detector
| Model | Backbone | Training data | Testing data | mAP | FPS | Setting |
|---|---|---|---|---|---|---|
| SSD-300 Origin | VGG16 | VOC07+12 trainval | VOC07 test | 0.772 | - | - |
| SSD-300 Ours | VGG16 | VOC07+12 trainval | VOC07 test | 0.786 | - | SSD300 |
| SSD-512 Origin | VGG16 | VOC07+12 trainval | VOC07 test | 0.798 | - | - |
| SSD-512 Ours | VGG16 | VOC07+12 trainval | VOC07 test | 0.808 | - | SSD512 |
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
| Model | Backbone | Training data | Testing data | mAP | FPS | Setting |
|---|---|---|---|---|---|---|
| Faster R-CNN Origin | VGG16 | VOC07 trainval | VOC07 test | 0.699 | - | - |
| Faster R-CNN Ours | VGG16 | VOC07 trainval | VOC07 test | 0.706 | - | Faster R-CNN |
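
The mAP values in the detection tables are reported on VOC07 test. As a rough illustration, the sketch below computes average precision with the 11-point interpolation traditionally used for VOC2007, then averages it over classes. Whether this repository's evaluation uses exactly this variant is an assumption, and all function names are hypothetical.

```python
def precision_recall(is_true_positive, num_gt):
    """Build recall/precision curves for one class.

    is_true_positive: one flag per detection, sorted by descending score.
    num_gt: number of ground-truth boxes of this class.
    """
    tp = 0
    recalls, precisions = [], []
    for rank, hit in enumerate(is_true_positive, start=1):
        if hit:
            tp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / rank)
    return recalls, precisions


def voc07_ap(recalls, precisions):
    """11-point interpolated AP: average, over recall thresholds
    0.0, 0.1, ..., 1.0, of the best precision at recall >= threshold."""
    total = 0.0
    for i in range(11):
        threshold = i / 10.0
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        total += max(candidates) if candidates else 0.0
    return total / 11.0


def mean_ap(per_class_aps):
    """mAP is the plain mean of the per-class AP values."""
    return sum(per_class_aps) / len(per_class_aps)


# Toy example: three ranked detections, two ground-truth boxes.
recalls, precisions = precision_recall([True, False, True], num_gt=2)
ap = voc07_ap(recalls, precisions)
print(ap)
```

Matching detections to ground-truth boxes (typically by an IoU threshold of 0.5 on VOC) is omitted here; the flags are taken as given.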
- YOLOv3: An Incremental Improvement
- OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
- Mask R-CNN
Take PSPNet as an example. ("tag" can be any string, including an empty one.)
- Training
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh train tag
- Resume Training
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh train tag
- Validation
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh val tag
- Testing
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh test tag
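
If you prefer to launch these phases from Python, for example to queue several runs, the commands above can be wrapped as in the following sketch. The helper names `build_command` and `run_phase` are hypothetical and not part of this repository; only the script directory, script name, and the `train`/`val`/`test` phases are taken from the commands above. It assumes you run it from the repository root.

```python
import subprocess

# Paths and phases copied from the shell commands above.
SCRIPT_DIR = "scripts/seg/cityscapes/"
SCRIPT = "run_fs_pspnet_cityscapes_seg.sh"
PHASES = ("train", "val", "test")


def build_command(phase, tag=""):
    """Return the argument list for one phase; tag may be any string, even empty."""
    if phase not in PHASES:
        raise ValueError("unknown phase: %r" % (phase,))
    return ["bash", SCRIPT, phase, tag]


def run_phase(phase, tag="", dry_run=True):
    """Run one phase from SCRIPT_DIR, or just return the command when dry_run is set."""
    cmd = build_command(phase, tag)
    if not dry_run:
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(cmd, cwd=SCRIPT_DIR, check=True)
    return cmd


# Dry run: show what would be executed for validation with tag "exp1".
print(run_phase("val", "exp1"))
```

Passing `dry_run=False` executes the script, which requires a checkout of the repository with its datasets and dependencies in place.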