I am running into a problem that has been raised a few times on this forum, but I cannot find a fix for it. I get the following errors when I run the train_network command in my dlc-windowsGPU environment:
Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
My setup:
- CUDA: 10.0 (according to the cudatoolkit package in my dlc Anaconda environment and the conda list cudnn command, although the nvidia-smi command reports 10.2)
- tensorflow-gpu: 1.13.1
- cudnn: 7.6.5
- GPU: GeForce GTX 1650 SUPER
- NVIDIA driver: 442.19
- OS: Windows 10
According to the compatibility table (https://www.tensorflow.org/install/source_windows#gpu), tf 1.13 is compatible with CUDA 10 and cuDNN 7.4. I have cuDNN 7.6.5 installed on this PC, but the GPU version of dlc runs without problems on my office computer with the same versions of CUDA, cuDNN and tf (the only differences being that the second machine runs Windows 7 and has a different GPU and driver). This is what confuses me: it suggests that the problem is not necessarily a CUDA/cuDNN/tf incompatibility, since the same versions work together fine on the other machine.
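To make the comparison explicit, here is a minimal sketch of the check I did by hand. It only compares the versions listed above against the tested configuration quoted from the table; it does not query the GPU or TensorFlow. The names TESTED, INSTALLED, major_minor and compare are my own, and I am assuming that "CUDA 10" in the table means 10.0.

```python
# Tested configuration for tf 1.13 as quoted from the compatibility table.
# Assumption: "CUDA 10" in the table means CUDA 10.0.
TESTED = {"cuda": (10, 0), "cudnn": (7, 4)}

# Versions reported by conda list in my dlc-windowsGPU environment.
INSTALLED = {"cuda": "10.0", "cudnn": "7.6.5"}


def major_minor(version):
    """Return the (major, minor) part of a dotted version string."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def compare(installed, tested):
    """Map each component to True if its major.minor matches the tested one."""
    return {name: major_minor(installed[name]) == tested[name] for name in tested}


if __name__ == "__main__":
    for name, ok in compare(INSTALLED, TESTED).items():
        status = "matches" if ok else "differs from"
        wanted = ".".join(str(n) for n in TESTED[name])
        print(f"{name} {INSTALLED[name]} {status} the tested version {wanted}")
```

For my setup this flags only cuDNN (7.6 installed against 7.4 tested), which is the same mismatch that exists on the office machine where everything works.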
Any advice you could give here would be much appreciated as I have really hit a wall.