CNN prediction on Large images



Hi, I trained a CNN for image classification on training images of shape (41, 41, 7), but the images I need to run prediction on have shape (2048, 2048, 100). I split each image into patches and record the (x, y) coordinate of every patch, since I need it to overlay the results at the end. Prediction runs on the GPU, but only for a single patch at a time. Is there a way to speed this up by running multiple patches at once, keeping track of each patch's startX and startY index, and then pooling the results so that each patch prediction goes to its proper place?

In other words, for people using CNNs for image classification: how do you apply the CNN prediction to real microscopy images that are 10 to 20 times bigger?

I am using Python with TensorFlow and Keras.

@frauzufall @fjug @ctrueden



I think some information is missing to answer the question properly:
Are you really doing image classification, i.e. predicting a single value (corresponding to some class label) for a given patch?
Or are you doing image segmentation, i.e. predicting a value for each pixel in a given patch?

If you are doing segmentation, you are (hopefully) using a fully convolutional network. In that case you can speed up prediction by increasing the input patch shape to the largest shape that still fits into GPU memory. Note that prediction needs less memory than training, so your patches can be much larger than (41, 41, 7).
Also, it might be a good idea to use some overlap or halo between patches to reduce artifacts at the patch boundaries.
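
As a rough illustration of tiling with a halo, here is a minimal NumPy sketch. It assumes a channels-last image of shape (H, W, C) and a hypothetical `predict_fn` that maps a (h, w, C) tile to a (h, w) prediction of the same spatial size (with Keras this would be a thin wrapper around the model's prediction call). The names `predict_tiled`, `tile` and `halo` are made up for this example.

```python
import numpy as np

def predict_tiled(image, predict_fn, tile=512, halo=16):
    """Predict tile by tile and stitch only the inner part of each tile.

    image:      array of shape (H, W, C), channels last (assumption).
    predict_fn: hypothetical callable, (h, w, C) -> (h, w).
    tile:       edge length of the region written per step.
    halo:       extra context read around each tile, discarded afterwards.
    """
    height, width = image.shape[:2]
    output = np.zeros((height, width), dtype=np.float32)
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            y1, x1 = min(y0 + tile, height), min(x0 + tile, width)
            # Grow the read window by the halo, clipped to the image bounds.
            ys, xs = max(y0 - halo, 0), max(x0 - halo, 0)
            ye, xe = min(y1 + halo, height), min(x1 + halo, width)
            pred = predict_fn(image[ys:ye, xs:xe])
            # Write back only the inner region, dropping the halo.
            output[y0:y1, x0:x1] = pred[y0 - ys:y1 - ys, x0 - xs:x1 - xs]
    return output
```

Ideally the halo is at least as wide as the radius of the network's receptive field, so that the pixels that are kept never see a tile border.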

If you are doing classification, you could stack multiple patches into one batch and present the network with an (N, 41, 41, 7) input, where N is the number of patches in the batch. You will need to define a mapping from the index within the batch to the patch position to keep the spatial information.
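
A minimal sketch of this idea, assuming a channels-last image and a hypothetical `predict_fn` that takes an (N, 41, 41, C) array and returns one row of scores per patch. The helper names `extract_patches` and `classify_patches` are made up for this example.

```python
import numpy as np

def extract_patches(image, patch=41, stride=41):
    """Cut an (H, W, C) image into patches and remember where each came from.

    Returns patches of shape (N, patch, patch, C) and coords of shape (N, 2),
    where coords[i] = (start_y, start_x) of patches[i].
    """
    height, width = image.shape[:2]
    coords = [(y, x)
              for y in range(0, height - patch + 1, stride)
              for x in range(0, width - patch + 1, stride)]
    patches = np.stack([image[y:y + patch, x:x + patch] for y, x in coords])
    return patches, np.array(coords)

def classify_patches(image, predict_fn, patch=41, stride=41, batch_size=256):
    """Run predict_fn on chunks of patches; row i of scores matches coords[i]."""
    patches, coords = extract_patches(image, patch, stride)
    scores = [predict_fn(patches[start:start + batch_size])
              for start in range(0, len(patches), batch_size)]
    return coords, np.concatenate(scores)
```

With Keras, `model.predict(patches, batch_size=256)` already splits the array into batches internally, so `predict_fn` could simply be `model.predict`. Because the order of the rows is preserved, `coords[i]` gives the position at which to draw the result for `scores[i]`.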

Note that the patch classification approach may be unsuitable in many cases, because the prediction can depend on the patch size and on how the objects of interest are covered by the patches. Imagine a single object of interest sitting at the shared corner of 4 patches.
This really depends on the application though.


Thank you very much for this wonderful answer. Yes, I am doing image classification on patches of the image, and I am aware that an FCN is for segmentation while a CNN is for classification. I think the second part of your answer, stacking multiple patches and presenting the network with an (N, 41, 41, 7) input, is what I was looking for.

To give you an idea of the sort of problem I am working on: imagine you have RGB images of cats and dogs, say 41 by 41 pixels, and you train a CNN to classify them. Now I present the network with a 500 by 500 pixel image containing 10 dogs and 10 cats. I slide a 41 by 41 pixel window over it in (x, y) and try to find the regions in the image where there are cats and dogs. This gives multiple rectangles in the regions where the network finds cats or dogs, but I can then apply non-maximum suppression to narrow down the rectangles I get.
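
For reference, the suppression step could look roughly like the following NumPy sketch. It is a generic greedy non-maximum suppression, not necessarily the exact variant used here; the function name and the `iou_threshold` default are assumptions, and boxes are taken to be (x1, y1, x2, y2) with non-zero area.

```python
import numpy as np

def non_max_suppression(boxes, scores, iou_threshold=0.3):
    """Greedy NMS. Returns indices of the kept boxes, best score first."""
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    scores = np.asarray(scores, dtype=np.float64)
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Overlap of the best box with every remaining box.
        inter_w = np.clip(np.minimum(x2[best], x2[rest])
                          - np.maximum(x1[best], x1[rest]), 0, None)
        inter_h = np.clip(np.minimum(y2[best], y2[rest])
                          - np.maximum(y1[best], y1[rest]), 0, None)
        inter = inter_w * inter_h
        iou = inter / (areas[best] + areas[rest] - inter)
        # Drop everything that overlaps the best box too much.
        order = rest[iou <= iou_threshold]
    return keep
```

Each sliding window at (startX, startY) becomes the box (startX, startY, startX + 41, startY + 41), scored with the class probability from the network. TensorFlow also ships its own `tf.image.non_max_suppression`, which may be more convenient inside a TensorFlow pipeline.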

So yeah I will implement the second part of your answer. Thanks a lot again.


Yes, that worked, with the mapping from batch index to patch position. Thanks for speeding it up. Cheers.