Noise2Void for Fiji

Hi all,

there’s a new Jug-Lab Fiji plugin for Noise2Void available now (originally implemented in Python). Any volunteers for testing an early version?

It’s a deep learning method for content-aware denoising. Training can be done on single noisy images. This is the N2V paper: https://arxiv.org/abs/1811.10980 … The SEM data is described in a reference book chapter; it’s the same data as in the n2v Python SEM training example.

Here’s the update site: https://sites.imagej.net/N2V. More details + source code on our github page: https://github.com/juglab/N2V_fiji/

During training, a loss plot and a preview window are displayed to track progress. Be aware that N2V will only remove pixel-wise independent noise.

The train plugins provide you with a link to a ZIP file of the trained model, which you can use to run the predict plugin on other images of the same type without retraining.

There’s a script template for batch prediction: open a new script and go to Templates > ImageJ2 > N2V > BatchPredict (python). When you run it, it asks for a folder containing the input images, an empty folder where the predicted output images will be saved, and the model file (the ZIP from training).
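For reference, here is a minimal sketch of what such a batch script could look like in Jython. The predict command class and its parameter names ("prediction", "modelFile", "output") are my assumptions for illustration; the actual template shipped with the update site is the authoritative version.

#@ File (style = "directory") inputFolder
#@ File (style = "directory") outputFolder
#@ File modelFile
#@ CommandService command
#@ DatasetIOService io

import os

# Denoise every readable image in inputFolder with the trained model
# and save the result under the same name in outputFolder.
for name in os.listdir(inputFolder.getAbsolutePath()):
    path = os.path.join(inputFolder.getAbsolutePath(), name)
    if not io.canOpen(path):
        continue
    img = io.open(path)
    # Command and parameter names are assumptions; check the real template.
    module = command.run("de.csbdresden.n2v.command.N2VPredictCommand", True,
                         "prediction", img, "modelFile", modelFile).get()
    io.save(module.getOutput("output"),
            os.path.join(outputFolder.getAbsolutePath(), name))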

For GPU support (which is pretty much necessary; even with a GPU, training still takes hours) you should have CUDA 10 and a matching cuDNN version installed. Follow the official NVIDIA guides to set the environment variables for both libraries. After adding the N2V update site, open Edit > Options > TensorFlow... in Fiji and install TF 1.13.1 GPU.

Enjoy and please let us know if there are issues :slight_smile:

21 Likes

Hello,
thanks for posting it. Do I understand correctly that the Fiji plugin handles both training and execution? No more need of installing Python?

Thanks

Antonio

1 Like

Would the inference work with CPU only?

1 Like

@manerotoni precisely. You don’t need to install Python.

@sebi06 it also works on CPU, but training will take ages.

3 Likes

Could you comment on what data and data sizes you tested it on?
Also, is there a minimum data size for training that gives sensible results in your experience?

From the parameters one can adjust, it seems possible to train on small data very fast… but I guess that would defeat the point…

Can or should I train on different images from the same system? At the moment I see I can only train on one loaded image.

1 Like

Can it be scripted via Python?

So I tried it now on a small STED dataset, 1024x1024, 8-bit, with 50 batches and only a few epochs, since I only have a laptop at the moment. So I guess take it with a grain of salt.

The results are immediately impressive. I guess I have to find a way to validate them to convince myself and the people here.

In terms of usage: the batch size will depend on the image dimensions and the patch dimensions, so you could compute that automatically instead of throwing an error once the user finds out the batch number is insufficient.

Maybe display warnings if you think the user is trying to load an insufficient number of batches or the data is not suitable.

Maybe have a “batch” training option to train on a number of different images from the same setup if that is something the training could profit from.

Maybe, maybe… "batch" and "patch" are a confusingly similar naming scheme, also with respect to other "batch" functions in Fiji… but I’m complaining on a high level here :wink:

5 Likes

In the video I used the training image from this notebook. It’s 1690x2500 px; 10% of it is used for validation, not for training.

I can’t give you an exact number for how big the data has to be. The command will create batches and choose a random subset of each batch for the training. If there are not enough batches for the whole process, it starts again at the beginning of the batches and chooses new random subsets. If the data is too small, this will create problematic results. I have to add documentation to make it clearer what the parameters exactly do, sorry that’s not there yet.
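To make that concrete, here is a rough Python sketch of how I picture the sampling (my mental model, not the plugin’s actual code; tiles is assumed to be a list of 2D NumPy arrays, e.g. 180x180 each):

import random

def training_patches(tiles, patch_size, n_steps):
    """Cycle through the tiles; each step crops a fresh random
    patch_size x patch_size subset from the current tile."""
    for step in range(n_steps):
        tile = tiles[step % len(tiles)]  # wrap around when tiles run out
        h, w = tile.shape
        y = random.randint(0, h - patch_size)
        x = random.randint(0, w - patch_size)
        yield tile[y:y + patch_size, x:x + patch_size]

With too few tiles, the same tiles get revisited many times with new random subsets, which is why very small data becomes problematic.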

You can also use the train on folder command to run the training on multiple images of the same type. At the end of the training it will provide you with links to two ZIP files. One is called bestTrainedModelPath, which is the trained model saved at the point where the validation loss was lowest; the other is latestTrainedModelPath, the model from the last training step. For the train + predict command we use the latestTrainedModelPath ZIP file. You can save this ZIP somewhere and apply it to other images using the predict command.

I hope that made sense - let me know if you see a way to simplify this. I will add more documentation regarding parameters / input / output soon.

3 Likes

Thanks for trying it out and for the notes! The general answer for judging the results is: train longer! Looking at your images, this definitely applies :wink:

For the result I posted above, I trained for ~6h: 200 epochs, 200 steps per epoch, batch size 64, batch dimension length 180 (meaning each batch is 180x180 px) and patch dimension length 60 (meaning the training picks a random 60x60 subset from each batch).
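For scale, a quick back-of-the-envelope on how many patches that run sees:

epochs, steps_per_epoch, batch_size = 200, 200, 64
print(epochs * steps_per_epoch * batch_size)  # 2,560,000 random 60x60 patches in ~6h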

2 Likes

Do you mean like a Fiji script? That should work, and the recorder should work too, but I have not tried it yet. In case you do, let me know if there are issues! I still have to add a parameter that lets you hide the progress window, though.
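For what it’s worth, a minimal Jython sketch of what calling the train command from a script might look like. The class name is real (it appears in the stack traces above) and latestTrainedModelPath is the output name mentioned earlier in this thread, but the input parameter names here are my guesses; use the recorder to get the real ones.

#@ CommandService command
#@ Dataset training

from de.csbdresden.n2v.command import N2VTrainCommand

# Input parameter names below are assumptions.
module = command.run(N2VTrainCommand, True,
                     "training", training,
                     "numEpochs", 200,
                     "numStepsPerEpoch", 200,
                     "batchSize", 64).get()
print(module.getOutput("latestTrainedModelPath"))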

If you want to stay in python completely, you can just use the original python n2v library :slight_smile:

4 Likes

Did I miss it or is noise2void for sCMOS out yet?

1 Like

I don’t think so. Ping @tibuch @fjug

1 Like

Hi,
I think the N2V results will strongly depend on the particular chip.
To get rid of fixed patterns, you can record a stack of dark images, average them, and then subtract that average from your data before training and prediction.
However, there are also other types of patterned noise I have seen in CMOS cameras, which may or may not screw up your results.
We are working on finding a solution.
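As a concrete example, a dark-frame correction along those lines could look like this in Python (file names are placeholders; assuming TIFF stacks and the tifffile package):

import numpy as np
import tifffile

# Average the dark stack and subtract it from the noisy data
# before N2V training and prediction.
dark = tifffile.imread("dark_stack.tif").astype(np.float32)  # shape (N, H, W)
dark_mean = dark.mean(axis=0)

noisy = tifffile.imread("noisy.tif").astype(np.float32)
tifffile.imwrite("noisy_darksub.tif", noisy - dark_mean)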

7 Likes

I’m getting this in my log, any suggestions?

[INFO] Load TensorFlow..
[INFO] Using native TensorFlow version: TF 1.12.0
Using 10% of training data for validation
[INFO] Tile training and validation data..
[INFO] Generated 10656 tiles of shape [180, 180]
[INFO] Create session..
[INFO] Import graph..
[INFO] Normalizing..
[INFO] mean: 751.098
[INFO] stdDev: 1860.8287

java.util.concurrent.ExecutionException: java.lang.ClassCastException: net.imglib2.type.numeric.real.FloatType cannot be cast to net.imglib2.type.numeric.integer.UnsignedShortType
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at de.csbdresden.n2v.train.N2VTraining.train(N2VTraining.java:157)
	at de.csbdresden.n2v.command.N2VTrainCommand.mainThread(N2VTrainCommand.java:110)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Caused by: java.lang.ClassCastException: net.imglib2.type.numeric.real.FloatType cannot be cast to net.imglib2.type.numeric.integer.UnsignedShortType
	at net.imglib2.type.numeric.integer.UnsignedShortType.sub(UnsignedShortType.java:50)
	at de.csbdresden.n2v.util.N2VUtils.lambda$normalize$0(N2VUtils.java:37)
	at net.imglib2.loops.LoopBuilder$RunnableFactory$BiConsumerRunnable.run(LoopBuilder.java:263)
	at net.imglib2.loops.LoopUtils$LineProcessor.run(LoopUtils.java:147)
	at net.imglib2.loops.LoopUtils$LineProcessor.run(LoopUtils.java:147)
	at net.imglib2.loops.LoopBuilder.forEachPixel(LoopBuilder.java:135)
	at de.csbdresden.n2v.util.N2VUtils.normalize(N2VUtils.java:35)
	at de.csbdresden.n2v.util.N2VUtils.normalize(N2VUtils.java:178)
	at de.csbdresden.n2v.train.N2VTraining.normalize(N2VTraining.java:383)
	at de.csbdresden.n2v.train.N2VTraining.mainThread(N2VTraining.java:206)
	... 5 more
3 Likes

Hey @Heather_BrownHarding,

can you (just for testing) convert your image to 16-bit?

2 Likes

@haesleinhuepf

I did that and it unfortunately didn’t work. I’ve also updated to TF 1.13.1 GPU (CUDA 10.0, CuDNN 7.4) and that hasn’t helped. I’ll post an update if I find anything.

2 Likes

If anyone runs into this problem, try running “train + predict” instead of just train.

2 Likes

Hi @Heather_BrownHarding, nice to hear from you again :slight_smile: And thank you for reporting the issue! I forgot to convert the input images to float type for the training plugin. I fixed it; if you update Fiji, it should work now.

2 Likes

11 posts were split to a new topic: Noise2Void for Fiji - Windows GPU support

Hi, thanks for posting this tool. I am getting the following error in the log when I reach the training phase (train + predict, TensorFlow 1.13.1 GPU (CUDA 10, CuDNN 7.6.4)). It looks to me like a memory allocation issue, but I’m too much of a newbie to start fixing it - any suggestions?

Thanks, Mike

[INFO] Load TensorFlow..
[INFO] Using native TensorFlow version: TF 1.13.1 GPU (CUDA 10.0, CuDNN 7.4)
Using 10% of training data for validation
[INFO] Tile training and validation data..
[INFO] Generated 104 tiles of shape [180, 180]
[INFO] Create session..
[INFO] Import graph..
[INFO] Normalizing..
[INFO] mean: 0.19899796
[INFO] stdDev: 1.5125033
[INFO] Augment tiles..
[INFO] Prepare training batches...
57 blind-spots will be generated per training patch of size [60, 60].
[INFO] Prepare validation batches..
518 blind-spots will be generated per training patch of size [180, 180].
[INFO] Start training..
[INFO] Epoch 1/300 
java.util.concurrent.ExecutionException: java.lang.IllegalStateException: OOM when allocating tensor with shape[180,64,60,60] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node concatenate_2/concat}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[{{node metrics/n2v_mse/Mean}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at de.csbdresden.n2v.train.N2VTraining.train(N2VTraining.java:157)
	at de.csbdresden.n2v.command.N2VTrainPredictCommand.mainThread(N2VTrainPredictCommand.java:155)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: OOM when allocating tensor with shape[180,64,60,60] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node concatenate_2/concat}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[{{node metrics/n2v_mse/Mean}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	at org.tensorflow.Session.run(Native Method)
	at org.tensorflow.Session.access$100(Session.java:48)
	at org.tensorflow.Session$Runner.runHelper(Session.java:314)
	at org.tensorflow.Session$Runner.run(Session.java:264)
	at de.csbdresden.n2v.train.N2VTraining.runTrainingOp(N2VTraining.java:404)
	at de.csbdresden.n2v.train.N2VTraining.mainThread(N2VTraining.java:285)
	... 5 more
1 Like