# Best approach to Normalize Phase Contrast Images

Hi Masters of Image Processing!

I’ve recently wanted to use CARE on phase contrast (PH) images, and I am wondering what the best way to normalize these images is.

The built-in quantile-based way works fine but I have found that due to the uneven nature of the background in PH, creating patches for training benefits from including the background.

But when there are background patches with no signal, quantile-based normalization exaggerates the background.

I tried a fixed-value normalization (simply dividing by an arbitrary value) to ensure the data lies in [0, 1], but I think I could do better…

One idea is to use a mode- or median-based normalization. Since more than 50% of each of my images will always be background, I can:

1. Smooth the image slightly
2. Compute the mode (or median)
3. Divide the image by this value times an arbitrary value
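
A minimal sketch of the three steps above, using NumPy only. The function names, the 3x3 box blur (standing in for "smooth slightly"), the histogram-based mode estimate and the default `factor=10.0` are all illustrative assumptions, not part of CARE or CSBDeep.

```python
import numpy as np

def box_smooth(img, radius=1):
    """Slightly smooth a 2D image with a (2*radius+1)^2 box mean (assumed smoothing)."""
    img = np.asarray(img, dtype=np.float32)
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="reflect")
    out = np.zeros_like(img)
    # Sum the k*k shifted copies of the image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def background_level(img, method="median", bins=256):
    """Estimate the background intensity as the median or the histogram mode."""
    if method == "median":
        return float(np.median(img))
    if method == "mode":
        counts, edges = np.histogram(img, bins=bins)
        i = int(np.argmax(counts))
        # Centre of the most populated bin approximates the mode.
        return float(0.5 * (edges[i] + edges[i + 1]))
    raise ValueError("method must be 'median' or 'mode'")

def normalize_by_background(img, method="median", factor=10.0, radius=1):
    """Divide the image by (background level * factor); factor is an arbitrary choice."""
    bg = background_level(box_smooth(img, radius), method)
    if bg <= 0:
        raise ValueError("background estimate must be positive")
    return np.asarray(img, dtype=np.float32) / (bg * factor)
```

With this scaling the background ends up near `1 / factor` in every image, regardless of whether a given patch contains any signal.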

I looked around different papers working on phase contrast images, but most normalize the data after some initial processing of the PH image (after a Laplacian or Hessian or Gradient) and none mention how the PH image could be normalized.

Do you think that the median-based approach makes sense? What other methods have you used/know of?

Thanks for any input!

[EDIT : Post was sent before it was completed]

Hi @oburri,

Makes sense.

The CARE data generation always computes the normalization from the entire image, and then applies it to all extracted patches from that image. Hence, as long as you don’t have purely background images, the normalization should work for generating the training patches.
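
To illustrate the difference between per-image and per-patch statistics, here is a small NumPy sketch. It is not the CSBDeep implementation: `percentile_normalize`, the `stats_from` argument and the percentiles 2 and 99.8 are assumptions chosen for the example.

```python
import numpy as np

def percentile_normalize(x, pmin=2.0, pmax=99.8, stats_from=None, eps=1e-20):
    """Rescale x so that the pmin/pmax percentiles of `stats_from` map to 0 and 1.

    If stats_from is None, the percentiles are computed from x itself.
    """
    ref = x if stats_from is None else stats_from
    lo, hi = np.percentile(ref, [pmin, pmax])
    return (np.asarray(x, dtype=np.float32) - lo) / (hi - lo + eps)

# Synthetic image: flat noisy background plus one bright object.
rng = np.random.default_rng(0)
image = 100.0 + rng.normal(0.0, 2.0, (64, 64))
image[10:20, 10:20] += 300.0

# A patch that contains background only.
patch = image[40:56, 40:56]

per_image = percentile_normalize(patch, stats_from=image)  # statistics from the whole image
per_patch = percentile_normalize(patch)                    # statistics from the patch itself

print("std with per-image stats:", float(per_image.std()))
print("std with per-patch stats:", float(per_patch.std()))
```

With per-patch statistics the background noise is stretched to fill roughly the whole [0, 1] range, whereas with per-image statistics the same patch stays compressed near 0.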

I don’t quite understand. The mode (=max?) is the 100th percentile and the median is the 50th percentile.

Can you upload an example image and point out the specific problem with the normalization?

Best,
Uwe

Hi @uschmidt83,

I saw that after I started looking into the CSBDeep code. The thing is that I get slightly better results when I normalize the phase contrast data by dividing it by a single value rather than using the quantile-based approach, maybe because the background is close to the mode (the value that appears most often), which in turn is close to the median.
But rather than dividing by an arbitrary value, I wanted to divide by, say, 10 times the mode.

You can grab the dataset from Zenodo here (500 MB). I’ll send you the notebook as well (or, if you create an account on renkulab.io, you can get it directly):

https://renkulab.io/projects/528/

Thanks for taking the time to reply!

Hi @oburri,

I had a look at the data and noticed the uneven illumination. This likely makes it harder for the neural network to do its job. Hence, I suggest correcting for the uneven illumination (e.g., see below) and then using CARE with default normalization settings.

Best,
Uwe

I just had a look at your Jupyter notebook after I posted my reply above. A few thoughts:

• You already correct for the uneven illumination, so my post above is quite useless.
• I’d downscale only by a factor of 2.
• Your patch size of 32x32 pixels is quite small. I’d strongly suggest going for 128x128 or even larger and making the batch_size correspondingly smaller (4, 8, or 16).

Uwe

Hi Uwe,

Thanks, I am trying this now!
Considering that the images are downsampled 2x, they are now about 512x512 px. How many patches per image can I extract?
If I use patches that are 256x256, am I not limited to 4 or so patches per image? What’s the rule, or rule of thumb?
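
For reference, the arithmetic behind that question can be sketched as below. The helper names are made up for illustration. The sketch only counts non-overlapping tiles versus possible top-left corners for overlapping crops; it does not give a CARE rule for how many patches to use.

```python
def count_tiles(image_shape, patch_size):
    """Number of non-overlapping square patches that fit into a 2D image."""
    return (image_shape[0] // patch_size) * (image_shape[1] // patch_size)

def count_crop_positions(image_shape, patch_size):
    """Number of distinct top-left corners for (possibly overlapping) crops."""
    ny = max(image_shape[0] - patch_size + 1, 0)
    nx = max(image_shape[1] - patch_size + 1, 0)
    return ny * nx

for patch_size in (128, 256):
    print(patch_size,
          count_tiles((512, 512), patch_size),
          count_crop_positions((512, 512), patch_size))
# prints:
# 128 16 148225
# 256 4 66049
```

So 4 is only the limit for patches that do not overlap at all. If patches are sampled at random positions, many more crops are possible, but the more they overlap, the less new information each one adds.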

Testing with these settings

```
'axes': 'YXC',
'n_channel_in': 1,
'n_channel_out': 1,
'probabilistic': True,
'unet_residual': True,
'unet_n_depth': 2,
'unet_kern_size': 5,
'unet_n_first': 16,
'unet_last_activation': 'linear',
'unet_input_shape': (None, None, 1),
'train_loss': 'laplace',
'train_epochs': 100,
'train_steps_per_epoch': 400,
'train_learning_rate': 0.0004,
'train_batch_size': 8,
'train_tensorboard': True,
'train_checkpoint': 'weights_best.h5',
'train_reduce_lr': {'factor': 0.5, 'patience': 10, 'min_delta': 0}}
```

With a patch size of 128x128, the model overfits the data. What could I tweak?