GPU available but not used #232

Closed
alevangel opened this issue Apr 14, 2022 · 12 comments · Fixed by #234
Labels: Bug (Something isn't working), Config, Model

Comments

@alevangel

alevangel commented Apr 14, 2022

When running the CFLOW model on Colab, PyTorch Lightning prints this:

GPU available: True, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

GPU available but not used. Set "accelerator" and "devices" using Trainer(accelerator='gpu', devices=1).

This happens even though I'm setting gpus=1 in the config.yml of cflow.
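
For reference, the form that the log hint points to can be sketched as below. This is a minimal sketch, assuming PyTorch Lightning 1.6 or later is installed. `gpu_trainer_kwargs` and `make_trainer` are hypothetical helper names, not part of anomalib or Lightning, and this is not necessarily how anomalib builds its Trainer.

```python
from typing import Any, Dict


def gpu_trainer_kwargs(num_devices: int = 1) -> Dict[str, Any]:
    """Return Trainer arguments in the form the log message suggests.

    Hypothetical helper for illustration only.
    """
    return {"accelerator": "gpu", "devices": num_devices}


def make_trainer(**extra: Any):
    """Build a Trainer that explicitly requests the GPU accelerator.

    Lightning is imported lazily so that gpu_trainer_kwargs() can be
    used (and inspected) on machines where it is not installed.
    """
    from pytorch_lightning import Trainer  # assumes pytorch-lightning>=1.6

    return Trainer(**gpu_trainer_kwargs(), **extra)
```

Calling `make_trainer(max_epochs=1)` would then request one GPU explicitly instead of relying on `gpus=1`.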

To Reproduce
train.py --model cflow

GPU Model
GPU 0: Tesla T4

Versions

pytorch-lightning             1.6.1
torch                         1.10.0+cu111
torchaudio                    0.10.0+cu111
torchmetrics                  0.7.3
torchsummary                  1.5.1
torchtext                     0.11.0
torchvision                   0.11.1+cu111
@samet-akcay
Contributor

samet-akcay commented Apr 14, 2022

Hi @alevangel, which GPU model do you have on Colab?

@alevangel
Author

> Hi @alevangel, which GPU model do you have on Colab?

GPU 0: Tesla T4

@ashwinvaidya17
Collaborator

@alevangel Can you share your torch version?

@alevangel
Author

> @alevangel Can you share your torch version?

pytorch-lightning             1.6.1
torch                         1.10.0+cu111
torchaudio                    0.10.0+cu111
torchmetrics                  0.7.3
torchsummary                  1.5.1
torchtext                     0.11.0
torchvision                   0.11.1+cu111

@alevangel
Author

I solved it by rolling back to pytorch-lightning==1.5.9

@andriy-onufriyenko

I have the same problem with the CFLOW model on my local machine.

GPU available: True, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

GPU available but not used. Set `accelerator` and `devices` using `Trainer(accelerator='gpu', devices=1)`.

python tools/train.py --model cflow

GPU:
NVIDIA GeForce GTX 1650 Ti

pytorch-lightning 1.6.1

@alevangel
Author

@andriy-onufriyenko use pytorch-lightning==1.5.9

@andriy-onufriyenko

andriy-onufriyenko commented Apr 14, 2022

> @andriy-onufriyenko use pytorch-lightning==1.5.9

@alevangel I installed pytorch-lightning==1.5.9 and now have another problem:

ERROR: pip's dependency resolver does not currently take into account all the packages that are
installed. This behaviour is the source of the following dependency conflicts.
anomalib 0.2.6 requires pytorch-lightning>=1.6.0, but you have
pytorch-lightning 1.5.9 which is incompatible.

ModuleNotFoundError: No module named 'torchtext.legacy'
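
A conflict like the one pip reports here can be checked before training starts. The following is a rough sketch using only the standard library. `version_tuple`, `meets_minimum`, and `installed_version` are hypothetical helper names, and the version parsing is deliberately simplistic (it ignores pre-release tags), so treat it as an illustration rather than a replacement for pip's resolver.

```python
from importlib import metadata
from typing import Optional, Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '1.10.0+cu111' into (1, 10, 0), dropping the local '+cu111' part."""
    core = version.split("+", 1)[0]
    return tuple(int(part) for part in core.split(".") if part.isdigit())


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum` (numeric parts only)."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.6' and '1.6.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

With the versions from this thread, `meets_minimum("1.5.9", "1.6.0")` is false, which matches the `pytorch-lightning>=1.6.0` requirement that pip complains about.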

samet-akcay reopened this on Apr 14, 2022
@samet-akcay
Contributor

@andriy-onufriyenko, can you add torch>=1.8.1 above this line, and do pip install -e again?

@andriy-onufriyenko

andriy-onufriyenko commented Apr 14, 2022

> @andriy-onufriyenko, can you add torch>=1.8.1 above this line, and do pip install -e again?

@samet-akcay

albumentations>=1.1.0
einops>=0.3.2
kornia>=0.5.6
omegaconf>=2.1.1
opencv-python>=4.5.3.56
pandas>=1.1.0
pytorch-lightning>=1.6.0
torch>=1.8.1
torchvision>=0.9.1
torchtext>=0.9.1
wandb==0.12.9
matplotlib>=3.4.3

It's working but without GPU

GPU available: True, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

GPU available but not used. Set `accelerator` and `devices` using `Trainer(accelerator='gpu', devices=1)`

I tried editing config.yaml:

# PL Trainer Args. Don't add extra parameter here.
trainer:
  accelerator: 'gpu'

and got:

ValueError: Unsupported accelerator found: gpu. Should be one of [null, ddp]

Setting auto_select_gpus: true had no effect.
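
The ValueError suggests that the accelerator value is checked against an allow-list before it reaches the Trainer. The sketch below only illustrates that kind of check. It is a hypothetical reconstruction based on the error text, not anomalib's actual code, and `SUPPORTED_ACCELERATORS` and `check_accelerator` are made-up names.

```python
from typing import Optional

# Hypothetical allow-list mirroring the values named in the error message
# (YAML `null` becomes Python `None`).
SUPPORTED_ACCELERATORS = (None, "ddp")


def check_accelerator(accelerator: Optional[str]) -> Optional[str]:
    """Reject any accelerator value that is not on the allow-list."""
    if accelerator not in SUPPORTED_ACCELERATORS:
        raise ValueError(
            f"Unsupported accelerator found: {accelerator}. "
            "Should be one of [null, ddp]"
        )
    return accelerator
```

Under a check of this shape, `accelerator: 'gpu'` fails even though Lightning itself accepts it, so the allow-list would need updating along with the config.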

@samet-akcay
Contributor

> I solved it by rolling back to pytorch-lightning==1.5.9

I found the problem. PyTorch Lightning deprecated and/or modified some of the Trainer configs after v1.6.0, which is why the GPU was not used despite being available.

The issue has been fixed in PR #234
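
One way to picture this kind of config change is a small migration from the older `gpus` key to the `accelerator`/`devices` pair named in the log hint. This is a sketch under assumptions: `migrate_trainer_config` is a hypothetical helper, it only handles a positive integer `gpus` value, and it is not a description of what PR #234 actually changed.

```python
from typing import Any, Dict


def migrate_trainer_config(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `cfg` with `gpus: N` rewritten as accelerator/devices.

    Only a positive integer `gpus` is translated; any other value is left
    untouched so that nothing is silently dropped.
    """
    new_cfg = dict(cfg)
    gpus = new_cfg.get("gpus")
    if isinstance(gpus, int) and not isinstance(gpus, bool) and gpus > 0:
        del new_cfg["gpus"]
        new_cfg["accelerator"] = "gpu"
        new_cfg["devices"] = gpus
    return new_cfg
```

For example, `migrate_trainer_config({"gpus": 1, "max_epochs": 5})` yields a dict with `accelerator` set to `"gpu"`, `devices` set to `1`, and `max_epochs` unchanged.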

@andriy-onufriyenko

> > I solved it by rolling back to pytorch-lightning==1.5.9
>
> I found the problem. PyTorch Lightning deprecated and/or modified some of the Trainer configs after v1.6.0, which is why the GPU was not used despite being available.
>
> The issue has been fixed in PR #234

@samet-akcay

I checked, and it works for me:

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

I just replaced all 11 files in my project with the versions from PR #234.

samet-akcay changed the title from "CFLOW: GPU available but not used" to "GPU available but not used" on Apr 18, 2022

4 participants