Dear all,
I am a pathologist interested in digital pathology and deep learning. I have basic knowledge of Python (including NumPy, pandas, scikit, …) and am familiar with some machine learning theory. While trying to set up a CNN pipeline for whole-slide images, I ran into several implementation problems, and as someone with a primarily medical background I would be very grateful for your help. My aim is to:
- Scan histopathological slides at 40X magnification, saved as .svs.
- Manually annotate these slides at low resolution so that only tumor tissue is included (and not the surrounding normal tissue), e.g. using QuPath.
- Use this annotation to perform all subsequent preprocessing steps (e.g. color normalization) in Python, but only on the annotated tumor area. Here, I would like to create a pipeline that supports different magnification levels (e.g. 20X, 40X) and different tile sizes (e.g. 512x512, 256x256).
- Run the algorithms on these tiles in Python.
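To make the tiling step more concrete, here is a minimal sketch of what I have in mind, assuming the annotation is available as a low-resolution binary mask in which one mask pixel covers a fixed square of full-resolution pixels. The function name `tiles_in_mask` and all of its parameters are hypothetical names I made up for illustration. It only computes tile positions in full-resolution (level-0) coordinates; actually reading the pixels would be a separate step, e.g. with something like OpenSlide's `read_region`, which as far as I understand expects level-0 coordinates.

```python
def tiles_in_mask(mask, mask_downsample, tile_size,
                  base_mag=40, target_mag=20, min_tumor_fraction=0.5):
    """Return level-0 (x, y) origins of tiles that lie mostly inside the mask.

    mask: 2D list of 0/1 values (low-resolution annotation); one mask pixel is
          assumed to cover mask_downsample x mask_downsample level-0 pixels.
    tile_size: tile side length in pixels at target_mag.
    """
    # Side length of one tile, expressed in level-0 (base_mag) pixels.
    tile_span = int(tile_size * base_mag / target_mag)
    height0 = len(mask) * mask_downsample
    width0 = len(mask[0]) * mask_downsample

    origins = []
    for y0 in range(0, height0 - tile_span + 1, tile_span):
        for x0 in range(0, width0 - tile_span + 1, tile_span):
            # Range of mask pixels touched by this tile (end index rounded up).
            r0 = y0 // mask_downsample
            r1 = -(-(y0 + tile_span) // mask_downsample)
            c0 = x0 // mask_downsample
            c1 = -(-(x0 + tile_span) // mask_downsample)
            cells = [mask[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            # Keep the tile if enough of the covered mask pixels are tumor.
            if sum(cells) / len(cells) >= min_tumor_fraction:
                origins.append((x0, y0))
    return origins


# Toy example: a 4x4 mask where each mask pixel covers 64x64 level-0 pixels.
example_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
tiles_20x = tiles_in_mask(example_mask, mask_downsample=64, tile_size=64,
                          base_mag=40, target_mag=20)
tiles_40x = tiles_in_mask(example_mask, mask_downsample=64, tile_size=64,
                          base_mag=40, target_mag=40)
```

In this sketch the same low-resolution mask is reused for both magnifications; only the tile footprint in level-0 pixels changes. Whether that is a sound approach in practice is part of what I am asking below.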
My particular questions are:
- How would you implement the manual annotation? Is it possible to use one annotation mask generated at low resolution for all subsequent steps? Or is it necessary to generate a PNG image at the specified resolution and perform the annotation separately at each magnification?
- Which image file formats are recommended for working in Python? Is downscaling necessary?
The answers to these questions may be obvious to a professional in this field, so please bear with me, as my background is primarily medical.
Thanks a lot!