Different behaviour of Keras 2.0.9 and 2.0.8 #8353
Comments
I am seeing the same behavior. The following code works in 2.0.8 but not in 2.0.9.
Maybe this was implemented:
I encounter the same problem. The code here https://github.com/fchollet/keras/blob/master/keras/backend/tensorflow_backend.py#L50 attempts to allocate GPU resources. Not sure whether this is expected behavior.
This is a known issue. The specific method call re-registers all the GPUs/resources instead of just counting the number of available devices. I intend to send a patch over the weekend.
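The helper in question is roughly the following (a paraphrase for illustration, not the exact Keras source; the key point is that TF's `device_lib.list_local_devices()` initializes every visible GPU as a side effect rather than merely counting them):

```python
# Rough paraphrase of the Keras TF-backend helper near tensorflow_backend.py#L50.
# Assumes the TF 1.x-era API that Keras 2.0.x was built against.
from tensorflow.python.client import device_lib

def _get_available_gpus():
    # list_local_devices() does not just enumerate devices: it *creates*
    # them, which allocates memory on every visible GPU as a side effect.
    local_devices = device_lib.list_local_devices()
    return [d.name for d in local_devices if d.device_type == 'GPU']
```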
Reproducing the problem is tricky, as you also need more than one GPU. Here is a pure TF example which shows the problem:
As we can see, it has registered only GPU1. The result can be confirmed using nvidia-smi:
On the same Python shell, let's call the method that gets the list of available GPUs:
Oops! It just registered both GPUs! Let's confirm with nvidia-smi:
As we can see, the process has also acquired GPU0 and is using all of its available resources.
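The snippets referenced above did not survive in this thread; a hypothetical reconstruction from the narration (assuming a two-GPU machine and the TF 1.x API) would look like this:

```python
# Hypothetical reconstruction of the repro described above (TF 1.x, 2 GPUs).
import tensorflow as tf
from tensorflow.python.client import device_lib

# Step 1: open a session restricted to GPU1 only.
config = tf.ConfigProto()
config.gpu_options.visible_device_list = '1'
sess = tf.Session(config=config)
# nvidia-smi at this point shows the process on GPU1 only.

# Step 2: in the same shell, list the available devices.
print(device_lib.list_local_devices())
# nvidia-smi now shows the process on BOTH GPUs: list_local_devices()
# initialized GPU0 as well and grabbed its free memory, because it does
# not honor the per-session visible_device_list restriction.
```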
To work around this temporarily, you can make only specific GPUs visible before any Keras import:
This way, Keras will only use the GPU with ID 1.
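The workaround code itself was elided above; a minimal sketch, assuming the standard `CUDA_VISIBLE_DEVICES` mechanism, is:

```python
import os

# Must be set BEFORE importing keras/tensorflow; once TF has initialized
# its GPU devices, changing this variable has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # expose only the GPU with ID 1

# import keras  # safe now: only GPU1 is visible to TF
print(os.environ["CUDA_VISIBLE_DEVICES"])  # prints: 1
```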
This affects Horovod as well. Unfortunately,
Please take a look at the outstanding fix: #8377
Closing as this is resolved.
Original issue: With a pip installation of Keras 2.0.9 (the latest as of now) using the TF backend, all available GPU resources are allocated immediately after importing Keras. In 2.0.8, GPU allocation did not happen immediately after import. Is this an expected behaviour in 2.0.9?