Could not create cudnn handle

Hi everyone,

I keep getting “could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR” when calling the deeplabcut.train_network function.

CUDA: 10.0
tensorflow-gpu: 1.13.1
cuDNN: 7.4.1.5
GPU: RTX 2080
OS: Ubuntu 18.04

I also tried CUDA 10.1, with no luck. I have no idea what to try next.

Any help will be appreciated.

Thanks,

Lingling

I have not used CUDA 10 myself yet (I use this Docker container for Linux: https://github.com/MMathisLab/Docker4DeepLabCut2.0), but Google Colab uses CUDA 10 now, and it definitely works. I would check that your cuDNN, CUDA, and TF versions are compatible: https://stackoverflow.com/questions/50622525/which-tensorflow-and-cuda-version-combinations-are-compatible
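As a quick way to see what the installed TensorFlow build reports before comparing against the compatibility table, something like the following may help. It is only a sketch: the function name tf_gpu_report is made up for illustration, and it only reports the TensorFlow side (it does not read the CUDA or cuDNN versions installed on the system).

```python
def tf_gpu_report():
    """Return the TensorFlow version and GPU status, or a stub if TF is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return {"tensorflow": None}
    return {
        "tensorflow": tf.__version__,
        # True if this TensorFlow binary was compiled with CUDA support.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # Deprecated in TF 2.x, but present in the 1.x releases discussed here.
        "gpu_available": tf.test.is_gpu_available(),
    }
```

Calling print(tf_gpu_report()) inside the DeepLabCut environment shows whether the build has CUDA support and whether it can see a GPU at all.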

I finally succeeded with CUDA 10.0, using tensorflow-gpu 1.13 that I compiled myself.

My configuration is

GPU: RTX 2080
OS: ubuntu 18.04
CUDA: 10.0.130
cudnn: 7.4.2
python: 3.6.7

Now I can proceed with my DeepLabCut project. Thanks, MWMathis, for the great toolbox.

Lingling

Hi, I am facing the same issue. Could you please share the link to the pre-compiled tensorflow-gpu? The link provided above is not working.

Thank you, I appreciate any help I can get.

Hi,

Sorry, the pre-compiled tensorflow-gpu was accidentally deleted and I cannot find it now.

There are two ways you can try.

  1. (Easy way): use conda to install tensorflow-gpu and configure the whole DeepLabCut environment.

  2. Compile your own tensorflow-gpu. Here is my documentation about how to compile a tensorflow-gpu version.

Hope everything goes well.

Lingling

Hi,
I followed the instructions at the link you provided, but I am still getting the same error.

Error:

INFO:tensorflow:global_step/sec: 0
2019-05-17 13:04:53.550306: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-05-17 13:04:53.565467: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
INFO:tensorflow:Error reported to Coordinator: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

Thank you

Assuming that CUDA and cuDNN are installed correctly, I sometimes get this error if my GPU memory is full because a process hasn’t terminated. Might be worth checking nvidia-smi.
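For anyone who wants to script that check, here is a rough sketch that asks nvidia-smi for used and total memory per GPU. The helper names (parse_memory_csv, gpu_memory) are made up for illustration, and it assumes nvidia-smi prints plain integers in MiB for these fields, which may not hold on every GPU or driver.

```python
import shutil
import subprocess

# nvidia-smi query for per-GPU memory in MiB, one "used, total" line per GPU.
QUERY = ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"]


def parse_memory_csv(text):
    """Parse 'used, total' lines into a list of (used_MiB, total_MiB) tuples."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def gpu_memory():
    """Return per-GPU (used_MiB, total_MiB), or [] if nvidia-smi is not on PATH."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return parse_memory_csv(result.stdout)
```

If gpu_memory() shows a GPU nearly full before training starts, a leftover process is probably still holding the memory; running plain nvidia-smi lists the processes using the GPU.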

Yes, I checked the memory as well. It uses almost all of the memory while running, and after the error it goes back down to normal.

Hi,

What is your configuration? TensorFlow version, CUDA version, cuDNN version, GPU, NVIDIA driver, etc.?

You can also use conda to configure DeepLabCut.

Here is another piece of my documentation you may refer to

Please find my system configuration below:
tensorflow version = 1.13.1
cuda version = 10.0
cudnn version = 7.4
GPU = RTX 2070
nvidia driver = 410.104

How large are your images?

And you may need to update cuDNN…

My exact cuDNN version is 7.4.2.

Should I install version 7.5?

I am just running the sample tutorial code, but all the memory is allocated to Python and I get the error below:
Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

Sample code used:

import tensorflow as tf
mnist = tf.keras.datasets.mnist

# Load MNIST and scale pixel values to [0, 1]
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

The precompiled tensorflow-gpu 1.13 also worked for a while, until Ubuntu updated.

Then I switched to a conda installation. tensorflow-gpu 1.12 works well when installed with conda.

Hi, thanks for your help. I managed to fix the GPU memory issue by setting TF_FORCE_GPU_ALLOW_GROWTH=true as an environment variable, as mentioned in the document below:

https://www.tensorflow.org/alpha/guide/using_gpu
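The same setting can also be applied from inside a Python script, which may be handy for testing before changing anything system-wide. This is a sketch under assumptions: the variable has to be set before TensorFlow initializes the GPU, and the helper make_growth_session is a made-up name showing the TF 1.x session option (gpu_options.allow_growth) that can be tried if the installed TensorFlow version does not honor the environment variable.

```python
import os

# Set this before TensorFlow is imported so it is in place when the GPU is initialized.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"


def make_growth_session():
    """TF 1.x only: return a session whose GPU memory allocation grows on demand."""
    import tensorflow as tf  # imported here so the variable above is already set

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    return tf.Session(config=config)
```

With allow-growth enabled, TensorFlow starts with a small GPU allocation and grows it as needed instead of reserving nearly all GPU memory up front.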

I tried this but I am still getting the same error.

Thanks. I tried this, and it solves the ‘Could not create cudnn handle’ problem.

Just to add to the answer: open the file with ‘sudo gedit /etc/environment’ (it is a file, so no trailing slash)

and add the line

TF_FORCE_GPU_ALLOW_GROWTH=true

To make sure it worked, run

echo $TF_FORCE_GPU_ALLOW_GROWTH

You should see true.

Otherwise, log out and back in, or run

source /etc/environment

thanks a lot Manoj!!!