Noise2Void for Fiji - Windows GPU support

[EDIT by @frauzufall: these are posts extracted from Noise2Void for Fiji regarding Windows GPU support]

Hi @frauzufall

I have another question. N2V is only running on my CPU despite having TF up and running. @haesleinhuepf’s CLIJ will run on the GPU. Any idea why N2V would not be running on the GPU?

If you open Edit > Options > TensorFlow, which version does it report as active? CLIJ does not use TensorFlow, so CLIJ running on the GPU does not indicate that you have a working GPU TensorFlow version.

It would also be interesting to open the Console window after starting the command - at the beginning it prints out which TF version is used.
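To compare the reported version with what is installed, the console line can be picked apart with a small script. This is only a sketch: it assumes the line has the format quoted later in this thread ("TF 1.14.0 GPU (CUDA 10.0, CuDNN >= 7.4.1)"). The function name is made up for illustration, and the format of a CPU line is a guess.

```python
import re

# Pattern assumed from the console lines quoted in this thread, e.g.
# "[INFO] Using native TensorFlow version: TF 1.14.0 GPU (CUDA 10.0, CuDNN >= 7.4.1)"
# The CPU variant (no parenthesised CUDA part) is a guess.
_TF_LINE = re.compile(
    r"TF (?P<tf>\d+\.\d+\.\d+) (?P<device>GPU|CPU)"
    r"(?: \(CUDA (?P<cuda>[\d.]+), CuDNN >= (?P<cudnn>[\d.]+)\))?"
)


def parse_tf_console_line(line):
    """Return a dict with tf, device, cuda and cudnn, or None if the line does not match."""
    match = _TF_LINE.search(line)
    return match.groupdict() if match else None
```

The "cuda" entry is the CUDA version this TF build expects, which is the one to compare against your CUDA installation.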

Good to know the differences! It shows the same in Console when I start the command as in the attached pic

Hi @Heather_BrownHarding, I’m sorry, I have very little experience with handling native libraries on Windows. My guess is that another TF library is linked in your environment variables (System > Advanced system settings > Advanced > Environment Variables...) and that the version printed to the console and displayed in the TF command is simply wrong. Maybe I can sit down with @haesleinhuepf at some point to figure this out.

For CUDA installation I found this very useful:
https://docs.nvidia.com/cuda/cuda-quick-start-guide/index.html

Also, when CUDA installs, it integrates with the version of Visual Studio that is installed on your computer. It will also list all the versions it has not been integrated with, i.e., the versions that you have not installed.

You’re right, Visual Studio isn’t needed. This is our install protocol for CUDA on Windows. Hope it helps.
https://c4science.ch/w/bioimaging_and_optics_platform_biop/computers-servers/software/tensorflow-gpu/#prerequisites

We need the build tools only for Stardist if we’re running it on Python.

Hi, now the website works OK. I chose TF with GPU and Fiji downloaded the required libraries, but when starting the training I get:

[INFO] Load TensorFlow..
[WARNING] Could not load native TF library TF 1.14.0 GPU (CUDA 10.0, CuDNN >= 7.4.1) D:\Software\Fiji\lib\win64\tensorflow_jni.dll: Can't find dependent libraries
[INFO] D:\Software\Fiji\lib\win64\tensorflow_jni.dll: Can't find dependent libraries

What am I overlooking here?

Did you install CUDA and cuDNN? I tested it now on a VM, here’s what I did:

  • In Fiji, opened Edit > Options > TensorFlow... and switched to TF 1.14.0 GPU
  • Restarted Fiji; opening this plugin again at this point shows your error
  • Installed CUDA 10.0 from here
  • Downloaded cuDNN 7.6.5 from here (you need to register / log in)
  • Copied the cuDNN files from the ZIP into CUDA as described here
  • Added /PATH/TO/Fiji.app/lib/win64/tensorflow_jni.dll to PATH (here’s a guide on how to set system variables)

This was already enough for the error to disappear when opening Edit > Options > TensorFlow.., but I can’t test if it would actually work because my graphics card is not present in the VM. Will find a way soon.
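A quick way to see whether the CUDA and cuDNN libraries can be found at all is to search the directories on PATH for them. The sketch below does that. The DLL names are assumptions for CUDA 10.0 and cuDNN 7 on Windows, so check them against the files in your own CUDA bin directory, and note that the function name is made up for illustration.

```python
import os

# DLL names assumed for CUDA 10.0 / cuDNN 7 on Windows; verify against your install.
ASSUMED_DLLS = ["cudart64_100.dll", "cublas64_100.dll", "cudnn64_7.dll"]


def locate_on_path(file_names, path_value=None, sep=os.pathsep):
    """Map each file name to the first PATH directory containing it, or None."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    directories = [d for d in path_value.split(sep) if d]
    found = {}
    for name in file_names:
        found[name] = next(
            (d for d in directories if os.path.isfile(os.path.join(d, name))),
            None,
        )
    return found


if __name__ == "__main__":
    for name, directory in locate_on_path(ASSUMED_DLLS).items():
        print(f"{name}: {directory or 'NOT FOUND on PATH'}")
```

Any library reported as not found is a candidate for the "Can't find dependent libraries" message, although this check alone cannot confirm that.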

I think I did all of those “correctly”, but I will cross-check and see if I can get it working.

I checked and repeated the described steps … but no progress.

The only difference is that I already had CUDA 10.1 installed. The cuDNN version is 7.6.5.

Any other information that I could gather to find the problem?

Isn’t your path pointing to CUDA 10.1 instead of 10.0? Can you try to set it to 10.0? Maybe this guide helps with that.
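To see which CUDA versions your PATH actually refers to, something like the following could help. It is a sketch that assumes the default NVIDIA directory layout ("...\CUDA\v10.0\bin"); the function name is made up for illustration.

```python
import re

# Matches directory names such as "...\CUDA\v10.0\bin" (default NVIDIA layout, assumed).
_CUDA_DIR = re.compile(r"[\\/]CUDA[\\/]v(\d+\.\d+)(?:[\\/]|$)", re.IGNORECASE)


def cuda_versions_in_path(path_value, sep=";"):
    """Return the CUDA versions referenced in PATH, in search order, without duplicates."""
    versions = []
    for entry in path_value.split(sep):
        match = _CUDA_DIR.search(entry)
        if match and match.group(1) not in versions:
            versions.append(match.group(1))
    return versions
```

On Windows you could call it as cuda_versions_in_path(os.environ["PATH"]) after importing os. If "10.0" is missing from the result, the CUDA 10.0 directories are not on your PATH.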

Hi @frauzufall, a big thumbs up for bringing this amazing piece of software to Fiji! And a bummer that you couldn’t attend Neubias2020…

I installed the necessary programs for TensorFlow support, made sure the environment variable was set, and got N2V running, in principle.
However, it seems that processing happens exclusively on the CPU (which indeed takes ages :cry:), while my NVIDIA GeForce GTX 850M is sitting there doing absolutely nothing:
image

Fiji -> Options -> TensorFlow seems all set:
image

and the Fiji console tells me:
[INFO] Using native TensorFlow version: TF 1.14.0 GPU (CUDA 10.0, CuDNN >= 7.4.1)

Any clues to help me out?
Thanks!
Bram

Hi @bramvdbroek,

alright, let’s make this work! Can you try adding /PATH/TO/Fiji.app/lib/win64/tensorflow_jni.dll to your PATH as well (check what exactly the DLL is called in your installation)?

Best, Deborah

That solved the issue. I already had 10.1 installed. After adding version 10.0 and modifying the CUDA_PATH it works fine now. Thx.

Hi Deborah,
Yes! That did the trick!
(BTW: It only worked after a restart of the computer. I thought that for paths this is normally not necessary, so in the end it might also be something else that did it.)

Thanks!
Bram

Hi @frauzufall,

I’m also having the same issue where I have TF and everything installed, but N2V is only using the CPU on my Windows machine. I’m trying to add the tensorflow_jni.dll to my PATH as directed but am unsure exactly where to set it. Is it Path, CUDA_PATH, or CUDA_PATH_V10_0 in the attached picture or somewhere else?


TensorFlow

By the way, the plugin works great through the CPU on small image volumes, but I’d like to use it on much larger volumes.

Thanks,
Brian

Hi @bglancy,

You have to add tensorflow_jni.dll to ‘Path’.
Actually, whether or not this is required, it did not solve everything:
At first a Windows restart seemed to be required. But when I ran N2V again tonight, it was again only processing on the CPU (even though the path was still correctly set). Another restart fixed this issue.
That said, I had first tried some N2V settings that resulted in Java errors and crashes (primarily the 3D model), so it might be that the ‘GPU connection’ was messed up; I sometimes see the same with @haesleinhuepf’s (amazing) CLIJ.

I hope this somewhat helps.
Best regards,
Bram

I get these errors on Linux too, especially when using bigger images or image parameters, and sometimes only a restart helps. Sometimes it complains about missing memory, sometimes cuDNN cannot be initialized. It’s an early release and might very well still contain memory leaks. I hope to get it more stable in the future, but in general, running it on a device with more GPU memory might help.

Sure @frauzufall, we appreciate it!

In the spirit of reporting bugs: I cannot get the 3D model to work. No matter what settings, it throws me this error:

EDIT/CORRECTION: I could fix that by setting patch and batch dimension lengths smaller than the number of slices. It detects and adapts one of the two, but apparently not the other.
However, now I get this error:


The same happens for 35x35x35 patch sizes; then the console says Input 1 has shape [100 17 17 17 64] and doesn't match input 0 with shape [100 16 16 16 64]. Looks like some sort of indexing mistake, right?

Another issue (unrelated) that I’m facing is the following:
We have an Acquifer HIVE system with a massive GPU card that runs Windows Server 2019. The lowest CUDA version available for this OS is 10.1. Am I right in my assumption that TensorFlow 1.14.0 will not be amused if I try this? (And if so, do you / anyone know a solution, or should I just wait for updates?)

Thanks,
Bram

Hi Bram,

regarding the first issue: patch dimension length needs to be a multiple of 4 at the moment. Sorry for not documenting all of that yet. batch dimension length must be at least as large as patch dimension length. The command cuts the test data into pieces of batch dimension length x batch dimension length and, for each training step, chooses a random subpatch of patch dimension length x patch dimension length (in the case of 2D training). I will align these parameters better with the Python conventions in the future, so expect them to be renamed at some point… and documented! :slight_smile:
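The two rules can be checked before starting a training run, and the multiple-of-4 rule can be illustrated with a small shape trace. This is a sketch only: the function names are made up, and the trace assumes a U-Net with two 2x pooling steps and matching 2x upsampling steps, which is an assumption about the network and not something confirmed in this thread.

```python
def check_n2v_dimensions(patch_length, batch_length):
    """Return a list of problems with the two lengths (empty if both rules hold)."""
    problems = []
    if patch_length % 4 != 0:
        problems.append(f"patch length {patch_length} is not a multiple of 4")
    if batch_length < patch_length:
        problems.append(
            f"batch length {batch_length} is smaller than patch length {patch_length}"
        )
    return problems


def trace_skip_connections(length, depth=2):
    """Return (skip_size, upsampled_size) pairs along one axis, deepest level first.

    Assumes `depth` 2x poolings that round down, followed by 2x upsamplings.
    """
    encoder_sizes = [length]
    for _ in range(depth):
        encoder_sizes.append(encoder_sizes[-1] // 2)  # 2x pooling, rounding down
    pairs = []
    current = encoder_sizes[-1]
    for skip in reversed(encoder_sizes[:-1]):
        current *= 2  # 2x upsampling
        pairs.append((skip, current))
        current = skip  # continue as if the concatenation had succeeded
    return pairs
```

Under these assumptions, trace_skip_connections(35) starts with the pair (17, 16), which matches the 17 vs. 16 mismatch reported above for 35x35x35 patches, while trace_skip_connections(32) only yields matching pairs.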

The default advanced parameters should work for 2D and 3D, but of course there’s the memory limit…
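For the 2D case, the tiling and random subpatch selection described above could look roughly like this. It is a sketch of the idea, not the plugin's code: the function names are made up, and dropping incomplete tiles at the image border is an assumption.

```python
import random


def tile_origins(height, width, batch_length):
    """Top-left corners of the full, non-overlapping tiles that fit in the image.

    Incomplete tiles at the border are dropped (an assumption).
    """
    return [
        (y, x)
        for y in range(0, height - batch_length + 1, batch_length)
        for x in range(0, width - batch_length + 1, batch_length)
    ]


def random_subpatch(tile_origin, batch_length, patch_length, rng=random):
    """Pick a random patch_length x patch_length window inside one tile.

    Returns (y0, x0, y1, x1) in image coordinates, end-exclusive.
    """
    if patch_length > batch_length:
        raise ValueError("patch length must not exceed batch length")
    max_offset = batch_length - patch_length
    y0 = tile_origin[0] + rng.randint(0, max_offset)
    x0 = tile_origin[1] + rng.randint(0, max_offset)
    return (y0, x0, y0 + patch_length, x0 + patch_length)
```

A larger batch dimension length gives each training step more positions to draw the subpatch from, at the cost of fewer tiles per image.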

I am in the process of adding TensorFlow 1.15.0 as a choice in the TensorFlow options command; it is possible that it will then also run on CUDA 10.1. I am not 100% sure, since the version is not listed on TensorFlow’s Java page… If you don’t want to wait, remove anything with tensorflow in its name from Fiji.app/lib/win64/ and unzip these TF 1.15.0 GPU Win64 Java bindings there.
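If you want to see what would be removed before deleting anything, a small helper like this could list the files first. It is a sketch: the function names are made up, and you need to pass the path of your own Fiji.app/lib/win64/ folder.

```python
from pathlib import Path


def find_tensorflow_files(lib_dir):
    """Return the files in lib_dir whose name contains 'tensorflow' (case-insensitive)."""
    return sorted(
        p for p in Path(lib_dir).iterdir()
        if p.is_file() and "tensorflow" in p.name.lower()
    )


def remove_tensorflow_files(lib_dir, dry_run=True):
    """List the matching files; delete them only when dry_run is False."""
    matches = find_tensorflow_files(lib_dir)
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

With the default dry_run=True nothing is deleted, so you can check the returned list and make a backup before running it again with dry_run=False.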
