Support CUDA 11 and CUDA 10 with some Clean Up #687

Open: wants to merge 4 commits into base: master
2 changes: 2 additions & 0 deletions .dockerignore
@@ -0,0 +1,2 @@
+datasets/*
+logs/*
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,4 +1,4 @@

datasets/*
logs

# compilation and distribution
13 changes: 5 additions & 8 deletions docker/Dockerfile → Dockerfile_CUDA10
@@ -9,20 +9,17 @@ RUN apt-get update && apt-get install -y \
 RUN ln -sv /usr/bin/python3 /usr/bin/python

 # create a non-root user
-ARG USER_ID=1000
-RUN useradd -m --no-log-init --system --uid ${USER_ID} appuser -g sudo
-RUN echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
-USER appuser
+COPY . /home/appuser
 WORKDIR /home/appuser

 # https://github.com/facebookresearch/detectron2/issues/3933
 ENV PATH="/home/appuser/.local/bin:${PATH}"
 RUN wget https://bootstrap.pypa.io/pip/3.6/get-pip.py && \
-python3 get-pip.py --user && \
+python3 get-pip.py && \
 rm get-pip.py

 # install dependencies
 # See https://pytorch.org/ for other options if you use a different version of CUDA
-RUN pip install --user tensorboard cmake # cmake from apt-get is too old
-RUN pip install --user torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/cu101/torch_stable.html
-RUN pip install --user -i https://pypi.tuna.tsinghua.edu.cn/simple tensorboard opencv-python cython yacs termcolor scikit-learn tabulate gdown gpustat faiss-gpu ipdb h5py
+RUN pip install tensorboard cmake # cmake from apt-get is too old
+RUN pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/cu101/torch_stable.html
+RUN pip install -i https://pypi.tuna.tsinghua.edu.cn/simple tensorboard opencv-python cython yacs termcolor scikit-learn tabulate gdown gpustat faiss-gpu ipdb h5py
28 changes: 28 additions & 0 deletions Dockerfile_CUDA11
@@ -0,0 +1,28 @@
+# FROM nvidia/cuda:10.1-cudnn7-devel
+FROM nvidia/cuda:11.1.1-cudnn8-devel-ubuntu18.04
+
+# https://github.com/NVIDIA/nvidia-docker/issues/1632
+#RUN rm /etc/apt/sources.list.d/cuda.list
+#RUN rm /etc/apt/sources.list.d/nvidia-ml.list
+ENV DEBIAN_FRONTEND noninteractive
+RUN apt-get update && apt-get install -y \
+python3-opencv ca-certificates python3-dev git wget sudo ninja-build
+RUN ln -sv /usr/bin/python3 /usr/bin/python
+
+# create a non-root user
+COPY . /home/appuser
+WORKDIR /home/appuser
+
+# https://github.com/facebookresearch/detectron2/issues/3933
+ENV PATH="/home/appuser/.local/bin:${PATH}"
+RUN wget https://bootstrap.pypa.io/pip/3.6/get-pip.py && \
+python3 get-pip.py && \
+rm get-pip.py
+
+# install dependencies
+# See https://pytorch.org/ for other options if you use a different version of CUDA
+RUN pip install tensorboard cmake # cmake from apt-get is too old
+RUN pip install torch==1.10 torchvision==0.11.1 -f https://download.pytorch.org/whl/cu111/torch_stable.html
+# RUN pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/cu101/torch_stable.html
+# RUN pip install -i https://pypi.tuna.tsinghua.edu.cn/simple tensorboard opencv-python cython yacs termcolor scikit-learn tabulate gdown gpustat faiss-gpu ipdb h5py
@KleinYuan (Contributor, Author) commented on Jan 27, 2023:

I removed the https://pypi.tuna.tsinghua.edu.cn/simple mirror because it kept timing out. Do we need it? @L1aoXingyu

+RUN pip install tensorboard opencv-python cython yacs termcolor scikit-learn tabulate gdown gpustat faiss-gpu ipdb h5py
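
On the mirror question raised in the comment above, one possible middle ground is to make the package index a build argument, so the default build uses PyPI and the mirror stays available on request. The fragment below is only a sketch and not part of this PR: the `PIP_INDEX_URL` argument name and its default value are assumptions chosen for illustration, and the package list is copied from the line above.

```dockerfile
# Hypothetical build argument (not in this PR): defaults to the official PyPI index.
ARG PIP_INDEX_URL=https://pypi.org/simple

# Pass the chosen index explicitly to pip; override it at build time to use a mirror.
RUN pip install --index-url "${PIP_INDEX_URL}" \
    tensorboard opencv-python cython yacs termcolor scikit-learn tabulate gdown gpustat faiss-gpu ipdb h5py
```

With this in place, anyone who prefers the Tsinghua mirror could opt in with `docker build --build-arg PIP_INDEX_URL=https://pypi.tuna.tsinghua.edu.cn/simple -t=fastreid:v0 -f Dockerfile_CUDA11 .`, while a plain `docker build` would avoid the timeouts.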
27 changes: 17 additions & 10 deletions docker/README.md
@@ -1,22 +1,29 @@
 # Use the container

+### Build Container
+
 ```shell script
-cd docker/
-# Build:
-docker build -t=fastreid:v0 .
-# Launch (requires GPUs)
-nvidia-docker run -v server_path:docker_path --name=fastreid --net=host --ipc=host -it fastreid:v0 /bin/sh
+# Build with the corresponding CUDA version
+
+# CUDA 10
+docker build -t=fastreid:v0 -f Dockerfile_CUDA10 .
+# CUDA 11
+docker build -t=fastreid:v0 -f Dockerfile_CUDA11 .
 ```

-## Install new dependencies
+### Run Container

-Add the following to `Dockerfile` to make persist changes.
 ```shell script
-RUN sudo apt-get update && sudo apt-get install -y vim
-```
+# Launch (requires GPUs)
+nvidia-docker run -v ${PWD}:/home/appuser --name=fastreid --net=host --ipc=host -it fastreid:v0
+```
+
+### Run Training
+
+Next, follow the [Get Started Doc](https://github.com/JDAI-CV/fast-reid/blob/master/GETTING_STARTED.md#compile-with-cython-to-accelerate-evalution).

 Or run them in the container to make temporary changes.

 ## A more complete docker container

-If you want to use a complete docker container which contains many useful tools, you can check my development environment [Dockerfile](https://github.com/L1aoXingyu/fastreid_docker)
+If you want to use a complete docker container which contains many useful tools, you can check my development environment [Dockerfile](https://github.com/L1aoXingyu/fastreid_docker)