CARE training - questions

Hi all,

I have been using CARE for a while now for 3D image restoration, and I love how user-friendly and efficient it is!! However, I still have a couple of questions about the training part that I hope you can answer:

  1. I have multiple data sets containing the same images acquired at different laser powers (for example, the same microscope image at 5%, 20%, 50%, and 100% laser power). Assuming that from 50% upward we consider the images to be of ground-truth quality, does CARE cope well with mixed-quality pairs in the training? For example, pairing:
  • low:5% -> GT:50%
  • low:5% -> GT:100%
  • low:20% -> GT:50%
  • low:20% -> GT:100%
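To spell out what I mean, the pairing scheme above can be enumerated programmatically; a minimal sketch in Python (the percentage values are just the ones from my example, not anything CARE-specific):

```python
from itertools import product

# Laser powers (in %) used as low-quality inputs and as ground-truth targets
low_powers = [5, 20]
gt_powers = [50, 100]

# Every (low, GT) combination becomes one candidate training pair
pairs = list(product(low_powers, gt_powers))
print(pairs)  # [(5, 50), (5, 100), (20, 50), (20, 100)]
```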

I would argue that it could increase the robustness of the network, but I don’t have a deep enough understanding of machine learning to confirm or refute that.

  2. I played a bit with the training parameters to see whether different values gave better restorations. However, I have some trouble understanding what unet_n_depth and unet_n_first represent exactly in the U-Net, especially since these were the parameters that changed the performance most significantly. I am guessing that unet_n_depth is the depth of the network, but I didn’t find a way to confirm that.

I hope my questions were clear enough and thank you for taking the time to read them.
In any case, thank you for this great tool and for all the hard work that was put into making it.


Hi @mitoGuy,

Thanks for your kind words. Credit for the good usability of the code goes to @uschmidt83, and of course also to @mweigert, @frauzufall, and everyone else who helped and gave feedback.

Now to your questions:
ad 1) Your idea of using multiple noise levels as input during training makes a lot of sense. Martin actually did this for the training runs we reported in the original paper. Using various noise levels as “ground truth” also turns out to be a sensible idea. Are you familiar with the Noise2Noise work by Lehtinen et al.? They show that you can train denoising networks even without clean targets. Cool, eh?

ad 2) They correspond to the U-Net parameters ‘n_depth’ and ‘n_first’.
Here is what they mean:

  • n_depth (int) – number of resolution levels of U-Net architecture (just as you have expected).
  • n_first (int) – number of convolution filters for first U-Net resolution level (value is doubled after each downsampling operation).
    You can find this information, for example, almost at the bottom of this page.
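To make the doubling concrete, here is a small sketch (plain Python for illustration, not actual CSBDeep code; the helper function is hypothetical) of how the filter counts follow from n_depth and n_first:

```python
def filters_per_level(n_depth, n_first):
    # n_first convolution filters at the first (finest) resolution level;
    # the count doubles after each downsampling, one level per depth step.
    return [n_first * 2 ** level for level in range(n_depth)]

# e.g. n_depth=3 resolution levels starting from n_first=32 filters
print(filters_per_level(3, 32))  # [32, 64, 128]
```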

I hope this helps,



Thank you very much for taking the time to read and answer my questions, things are much clearer now!

> Are you familiar with the Noise2Noise work by Lehtinen et al.?

Yes, I am familiar with it! I was actually at one of the seminars you gave, and it is indeed very impressive!

> You can find this information, for example, almost at the bottom of this page.

It seems that I didn’t go far enough… thank you for pointing me there!

Again, thank you and your whole team for the incredible work!