How to extract the number of image sets from h5 batch file

Hi folks,

I’m trying to augment my script for submitting batch jobs to our cluster, and I’d like to add the ability for the number of image sets to be read from the batch file. This functionality was present in a batch runner script that was part of the distribution in CP1.0, back when batch files were written in .mat file format. Below is the relevant portion of that code from the BatchRunner.py script from CP1.0 v5811:

[code]loadcmd = """load %s/Batch_data""" % (datadir,)
printcmd = """fprintf('\n\nNumberOfImageSets=%d\n\n',handles.Current.NumberOfImageSets);"""
p = Popen(matlab, shell=True, stdin=PIPE, stdout=PIPE)
print >> p.stdin, loadcmd
print >> p.stdin, printcmd
print >> p.stdin, "exit"

output = p.stdout.readlines()
num_sets = int(output[0].split('=')[1])

# Loop over batches, check status file, print out commands for those that need it
for start in range(2, num_sets + 1, batch_size):
    end = start + batch_size - 1
    # <send the batch job from 'start' to 'end' to the queue>
[/code]
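
For reference, the batching loop above can be sketched in current Python 3 as a small generator. This is only an illustration: the name `batch_ranges` and its `first` parameter are made up here, and clamping the last batch to `num_sets` is an assumption that the old loop did not make.

```python
def batch_ranges(num_sets, batch_size, first=1):
    """Yield inclusive (start, end) image-set ranges covering first..num_sets."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(first, num_sets + 1, batch_size):
        # Clamp the final batch so it never runs past the last image set
        # (assumption: the queue should not be asked for sets that do not exist).
        yield start, min(start + batch_size - 1, num_sets)


if __name__ == "__main__":
    # The old script started at image set 2; pass first=2 to mimic that.
    for start, end in batch_ranges(num_sets=96, batch_size=25, first=2):
        print("submit image sets %d-%d" % (start, end))
```

Keeping the start as a parameter avoids hard-coding the assumption that the first image set has already been processed.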

I can read the batch files from my python generated job submission script with the h5py module:

[code]>>> import h5py
>>> bf = h5py.File('Batch_data.h5', 'r')
>>> bf.keys()
['Measurements', 'Version']
>>> bf['Measurements'].keys()
['2011-08-22-17-08-18']
>>> bf['Measurements/2011-08-22-17-08-18']
<HDF5 group "/Measurements/2011-08-22-17-08-18" (2 members)>
>>> bf['Measurements/2011-08-22-17-08-18'].keys()
['Experiment', 'Image']
>>> bf['Measurements/2011-08-22-17-08-18/Image'].keys()
['FileName_GFP', 'FileName_RFP', 'Frame_GFP', 'Frame_RFP', 'ImageNumber', 'Metadata_ImageLocation', 'Metadata_Series', 'Metadata_T', 'Metadata_WellID', 'Metadata_Z', 'PathName_GFP', 'PathName_RFP', 'Series_GFP', 'Series_RFP', 'URL_GFP', 'URL_RFP']
<…find the number of image sets somewhere in the file system within batch_file…>
[/code]

So unlike the previous .mat batch files, there doesn’t seem to be any simple attribute (or group or dataset) within the batch file that indicates how many image sets are to be processed. What I’d like to know is if there is a way to search the hierarchy and find the number of image sets within the batch file. I’m not sure if the hierarchy produced in my sample file is general or specific. Could you give me a hint? I tried to attach the h5dump text representation of the batch file, but txt attachments are disallowed by the forum. I can send it by PM if you like.

Thanks,

Lee.

I’ve generated a number of test batches, and it seems the shape attribute of the dataset object located at /Measurements/<date_generated>/Image/FileName_<image_name>/data gives the number of image sets in the batch:

[code]>>> import h5py
>>> bf = h5py.File('Batch_data.h5', 'r')
>>> bf['Measurements/2011-08-25-12-08-54/Image/FileName_GFP/data']
<HDF5 dataset "data": shape (96,), type "|O8">
>>> bf['Measurements/2011-08-25-12-08-54/Image/FileName_GFP/data'].shape[0]
96
[/code]

If you’re using a Python script to manage your batch jobs, then you can use the h5py package to inspect the HDF5 batch file. Hope this is useful to other folks out there.
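
Wrapped up as a helper, the lookup might look like the sketch below (written for Python 3). It assumes the layout seen in my test files, i.e. /Measurements/<date_generated>/Image/FileName_<image_name>/data, and that if there are several run groups the one whose name sorts last is the one wanted. The function names are my own, and I have not checked whether this layout is guaranteed.

```python
def count_image_sets(root, feature_prefix="FileName_"):
    """Return the number of image sets recorded under /Measurements.

    `root` is any mapping laid out like the batch file: an open h5py.File
    works, and so does a plain nested dict (handy for testing).
    """
    measurements = root["Measurements"]
    # Assumption: when several run groups exist, pick the name that sorts last.
    run = measurements[max(measurements.keys())]
    image = run["Image"]
    # Assumption: every per-image feature holds one entry per image set,
    # so any FileName_* feature will do.
    names = sorted(k for k in image.keys() if k.startswith(feature_prefix))
    if not names:
        raise KeyError("no %s* feature under the Image group" % feature_prefix)
    return image[names[0]]["data"].shape[0]


def count_image_sets_in_file(path):
    """Open an HDF5 batch file read-only and count its image sets."""
    import h5py  # third-party; imported here so the helper above stays dependency-free
    with h5py.File(path, "r") as batch_file:
        return count_image_sets(batch_file)
```

A submission script would then call `count_image_sets_in_file('Batch_data.h5')` and split the result into batches.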

Lee.

I’m sure our lead programmer, Lee, could offer a better solution, but another way to get the image numbers might be the following (based on NewBatch.py in the batchprofiler folder):

[code]import cellprofiler.pipeline as cpp
import cellprofiler.measurements as cpmeas

batch_file = "<insert path/filename of batch file here>"
pipeline = cpp.Pipeline()
pipeline.load(batch_file)
m = cpmeas.load_measurements(batch_file)
image_numbers = m.get_image_numbers()
number_of_images = len(image_numbers)
[/code]

Regards,
-Mark

Hi Mark,

[quote=“mbray”]I’m sure our lead programmer, Lee, could offer a better solution, but another way to get the image numbers might be the following (based on NewBatch.py in the batchprofiler folder):

[code]import cellprofiler.pipeline as cpp
import cellprofiler.measurements as cpmeas

batch_file = "<insert path/filename of batch file here>"
pipeline = cpp.Pipeline()
pipeline.load(batch_file)
m = cpmeas.load_measurements(batch_file)
image_numbers = m.get_image_numbers()
number_of_images = len(image_numbers)
[/code]

Regards,
-Mark[/quote]

Thanks for the suggestion. However, when I try something like that, I get an error:

[code]>>> pipeline.load('/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU=HarvardCollaboration=freezerstock/CPClusterTest/Batch1/KF_pipeline_test_output/Batch_data.h5')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "cellprofiler/pipeline.py", line 554, in load
    m = cpmeas.load_measurements(filename)
  File "cellprofiler/measurements.py", line 814, in load_measurements
    m.load(filename)
  File "cellprofiler/measurements.py", line 242, in load
    handles = loadmat(measurements_file_name, struct_as_record=True)
  File "/usr/cp2/lib/python2.6/site-packages/scipy/io/matlab/mio.py", line 139, in loadmat
    MR = mat_reader_factory(file_name, appendmat, **kwargs)
  File "/usr/cp2/lib/python2.6/site-packages/scipy/io/matlab/mio.py", line 102, in mat_reader_factory
    mjv, mnv = get_matfile_version(byte_stream)
  File "/usr/cp2/lib/python2.6/site-packages/scipy/io/matlab/miobase.py", line 212, in get_matfile_version
    % ret)
ValueError: Unknown mat file type, version 0, 0
[/code]

Which is odd, since I’m sure that this is a valid h5 batch file. I can read it with h5py, and batch jobs submitted with this file run fine. I’ll look into this further.

Thanks,

Lee.

It’s trying to fall back to reading the file as a .mat file after failing to read it as an .h5 file. Is it possible to upload the .h5 file?
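
In the meantime, one quick sanity check that needs neither CellProfiler nor h5py is to confirm that the file on disk starts with the 8-byte HDF5 signature. The sketch below is only a diagnostic aid, not how CellProfiler picks a reader. The name `looks_like_hdf5` is made up, and it only checks offset 0, although the HDF5 format also allows the signature at later offsets (512, 1024, ...) when a user block is present.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file begins with the 8-byte HDF5 signature."""
    with open(path, "rb") as f:
        # Read exactly as many bytes as the signature is long and compare.
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

If this returns False for the batch file, the problem is the file itself (or the path) rather than the reader.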

Hi folks,

Looks like .h5 extensions are not allowed by the phpBB software. I’ve renamed the attached file to Batch_data.cp (originally Batch_data.h5) for the time being, but please bear in mind that it is an hdf5 file.

Thanks,

Lee.
Batch_data.cp (617 KB)

I’m able to get the image numbers using the latest version of measurements.py. I’m also able to read the image set file names and the pipeline from the file. The only thing I can think of is some problem in your development environment. Can you try it again with the latest version of measurements.py? That should give us a more informative error than last time. Here’s how I opened and tested your file:

>>> import cellprofiler.measurements as cpmeas
>>> m = cpmeas.load_measurements("c:/temp/bad/img-1512/Batch_Data.h5")
>>> m.get_image_numbers()
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
       35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51,
       52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64])
>>> m.get_feature_names("Image")
['FileName_GFP', 'FileName_RFP', 'Frame_GFP', 'Frame_RFP', 'Metadata_ImageLocation', 'Metadata_Series', 'Metadata_T', 'Metadata_WellID', 'Metadata_Z', 'PathName_GFP', 'PathName_RFP', 'Series_GFP', 'Series_RFP', 'URL_GFP', 'URL_RFP']
>>> print "\n".join([m.get_measurement("Image", "URL_GFP", image_set_number=n) for n in m.get_image_numbers()])
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001001000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001007000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001013000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
file:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset/001019000.flex
>>> print m.get_experiment_measurement("Pipeline_Pipeline")
CellProfiler Pipeline: http://www.cellprofiler.org
Version:2
SVNRevision:11418

LoadImages:[module_num:1|svn_version:\'11411\'|variable_revision_number:11|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    File type to be loaded:tif,tiff,flex,zvi movies
    File selection method:Text-Regular expressions
    Number of images in each group?:2
    Type the text that the excluded images have in common:Do not use
    Analyze all subfolders within the selected folder?:All
    Input image file location:Elsewhere...\x7C/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset
    Check image sets for missing or duplicate files?:
    Group images by metadata?:No
    Exclude certain files?:No
    Specify metadata fields to group by:
    Select subfolders to analyze:
    Image count:1
    Text that these images have in common (case-sensitive):.flex
    Position of this image in each group:1
    Extract metadata from where?:Both
    Regular expression that finds metadata in the file name:^(?P<WellID>.*).flex
    Type the regular expression that finds metadata in the subfolder path:(?P<ImageLocation>.*)
    Channel count:2
    Group the movie frames?:Yes
    Grouping method:Interleaved
    Number of channels per group:2
    Load the input as images or objects?:Images
    Name this loaded image:GFP
    Name this loaded object:Nuclei
    Retain outlines of loaded objects?:No
    Name the outline image:NucleiOutlines
    Channel number:1
    Rescale intensities?:Yes
    Load the input as images or objects?:Images
    Name this loaded image:RFP
    Name this loaded object:Nuclei
    Retain outlines of loaded objects?:No
    Name the outline image:NucleiOutlines
    Channel number:2
    Rescale intensities?:Yes

RescaleIntensity:[module_num:2|svn_version:\'6746\'|variable_revision_number:2|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Select the input image:GFP
    Name the output image:RescaledGFP
    Select rescaling method:Stretch each image to use the full intensity range
    How do you want to calculate the minimum intensity?:Custom
    How do you want to calculate the maximum intensity?:Custom
    Enter the lower limit for the intensity range for the input image:0
    Enter the upper limit for the intensity range for the input image:1
    Enter the intensity range for the input image:0.000000,1.000000
    Enter the desired intensity range for the final, rescaled image:0.000000,1.000000
    Select method for rescaling pixels below the lower limit:Mask pixels
    Enter custom value for pixels below lower limit:0
    Select method for rescaling pixels above the upper limit:Mask pixels
    Enter custom value for pixels below upper limit:0
    Select image to match in maximum intensity:None
    Enter the divisor:1
    Select the measurement to use as a divisor:None

RescaleIntensity:[module_num:3|svn_version:\'6746\'|variable_revision_number:2|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Select the input image:RFP
    Name the output image:RescaledRFP
    Select rescaling method:Stretch each image to use the full intensity range
    How do you want to calculate the minimum intensity?:Custom
    How do you want to calculate the maximum intensity?:Custom
    Enter the lower limit for the intensity range for the input image:0
    Enter the upper limit for the intensity range for the input image:1
    Enter the intensity range for the input image:0.000000,1.000000
    Enter the desired intensity range for the final, rescaled image:0.000000,1.000000
    Select method for rescaling pixels below the lower limit:Mask pixels
    Enter custom value for pixels below lower limit:0
    Select method for rescaling pixels above the upper limit:Mask pixels
    Enter custom value for pixels below upper limit:0
    Select image to match in maximum intensity:None
    Enter the divisor:1
    Select the measurement to use as a divisor:None

IdentifyPrimaryObjects:[module_num:4|svn_version:\'11331\'|variable_revision_number:9|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Select the input image:RescaledRFP
    Name the primary objects to be identified:Nuclei
    Typical diameter of objects, in pixel units (Min,Max):15,40
    Discard objects outside the diameter range?:Yes
    Try to merge too small objects with nearby larger objects?:No
    Discard objects touching the border of the image?:Yes
    Select the thresholding method:Otsu Global
    Threshold correction factor:0.8
    Lower and upper bounds on threshold:0.09,1.0
    Approximate fraction of image covered by objects?:0.01
    Method to distinguish clumped objects:Intensity
    Method to draw dividing lines between clumped objects:Intensity
    Size of smoothing filter:20
    Suppress local maxima that are closer than this minimum allowed distance:20
    Speed up by using lower-resolution image to find local maxima?:Yes
    Name the outline image:NucleiOutlines
    Fill holes in identified objects?:Yes
    Automatically calculate size of smoothing filter?:No
    Automatically calculate minimum allowed distance between local maxima?:No
    Manual threshold:0.0
    Select binary image:None
    Retain outlines of the identified objects?:Yes
    Automatically calculate the threshold using the Otsu method?:Yes
    Enter Laplacian of Gaussian threshold:0.5
    Two-class or three-class thresholding?:Two classes
    Minimize the weighted variance or the entropy?:Weighted variance
    Assign pixels in the middle intensity class to the foreground or the background?:Foreground
    Automatically calculate the size of objects for the Laplacian of Gaussian filter?:Yes
    Enter LoG filter diameter:5
    Handling of objects if excessive number of objects identified:Continue
    Maximum number of objects:500
    Select the measurement to threshold with:None
    Method to calculate adaptive window size:Image size
    Size of adaptive window:10

IdentifySecondaryObjects:[module_num:5|svn_version:\'11267\'|variable_revision_number:8|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Select the input objects:Nuclei
    Name the objects to be identified:Cells
    Select the method to identify the secondary objects:Propagation
    Select the input image:RescaledRFP
    Select the thresholding method:Otsu Global
    Threshold correction factor:0.8
    Lower and upper bounds on threshold:0.03,1.0
    Approximate fraction of image covered by objects?:0.01
    Number of pixels by which to expand the primary objects:10
    Regularization factor:0.05
    Name the outline image:CellOutlines
    Manual threshold:0.0
    Select binary image:None
    Retain outlines of the identified secondary objects?:Yes
    Two-class or three-class thresholding?:Two classes
    Minimize the weighted variance or the entropy?:Weighted variance
    Assign pixels in the middle intensity class to the foreground or the background?:Foreground
    Discard secondary objects that touch the edge of the image?:No
    Discard the associated primary objects?:No
    Name the new primary objects:FilteredNuclei
    Retain outlines of the new primary objects?:No
    Name the new primary object outlines:FilteredNucleiOutlines
    Select the measurement to threshold with:None
    Fill holes in identified objects?:Yes
    Method to calculate adaptive window size:Image size
    Size of adaptive window:10

ExpandOrShrinkObjects:[module_num:6|svn_version:\'11025\'|variable_revision_number:1|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Select the input objects:Nuclei
    Name the output objects:NucleiExpanded
    Select the operation:Expand objects by a specified number of pixels
    Number of pixels by which to expand or shrink:10
    Fill holes in objects so that all objects shrink to a single point?:No
    Retain the outlines of the identified objects for use later in the pipeline (for example, in SaveImages)?:Yes
    Name the outline image:NucleiExpandedOutlines

Crop:[module_num:7|svn_version:\'11408\'|variable_revision_number:2|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Select the input image:GFP
    Name the output image:GFP_CropByNucleus
    Select the cropping shape:Objects
    Select the cropping method:Coordinates
    Apply which cycle\'s cropping pattern?:Every
    Left and right rectangle positions:0,end
    Top and bottom rectangle positions:0,end
    Coordinates of ellipse center:500,500
    Ellipse radius, X direction:400
    Ellipse radius, Y direction:200
    Use Plate Fix?:No
    Remove empty rows and columns?:No
    Select the masking image:None
    Select the image with a cropping mask:None
    Select the objects:NucleiExpanded

IdentifyPrimaryObjects:[module_num:8|svn_version:\'11331\'|variable_revision_number:9|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Select the input image:GFP_CropByNucleus
    Name the primary objects to be identified:Nucleoli
    Typical diameter of objects, in pixel units (Min,Max):5,30
    Discard objects outside the diameter range?:Yes
    Try to merge too small objects with nearby larger objects?:No
    Discard objects touching the border of the image?:Yes
    Select the thresholding method:Otsu Global
    Threshold correction factor:1
    Lower and upper bounds on threshold:0.006,1.0
    Approximate fraction of image covered by objects?:0.01
    Method to distinguish clumped objects:Intensity
    Method to draw dividing lines between clumped objects:Intensity
    Size of smoothing filter:10
    Suppress local maxima that are closer than this minimum allowed distance:5
    Speed up by using lower-resolution image to find local maxima?:No
    Name the outline image:NucleoliOutlines
    Fill holes in identified objects?:No
    Automatically calculate size of smoothing filter?:No
    Automatically calculate minimum allowed distance between local maxima?:No
    Manual threshold:0.0
    Select binary image:Otsu Global
    Retain outlines of the identified objects?:Yes
    Automatically calculate the threshold using the Otsu method?:Yes
    Enter Laplacian of Gaussian threshold:.5
    Two-class or three-class thresholding?:Two classes
    Minimize the weighted variance or the entropy?:Weighted variance
    Assign pixels in the middle intensity class to the foreground or the background?:Foreground
    Automatically calculate the size of objects for the Laplacian of Gaussian filter?:Yes
    Enter LoG filter diameter:5
    Handling of objects if excessive number of objects identified:Continue
    Maximum number of objects:500
    Select the measurement to threshold with:None
    Method to calculate adaptive window size:Image size
    Size of adaptive window:10

RelateObjects:[module_num:9|svn_version:\'11386\'|variable_revision_number:2|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Select the input child objects:Nuclei
    Select the input parent objects:Nucleoli
    Calculate distances?:None
    Calculate per-parent means for all child measurements?:No
    Calculate distances to other parents?:No
    Parent name:None

MeasureObjectSizeShape:[module_num:10|svn_version:\'11400\'|variable_revision_number:1|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Select objects to measure:Nuclei
    Select objects to measure:Cells
    Select objects to measure:Nucleoli
    Calculate the Zernike features?:Yes

MeasureObjectIntensity:[module_num:11|svn_version:\'11135\'|variable_revision_number:3|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Hidden:2
    Select an image to measure:GFP
    Select an image to measure:RFP
    Select objects to measure:Nuclei
    Select objects to measure:Cells
    Select objects to measure:Nucleoli

MeasureTexture:[module_num:12|svn_version:\'11401\'|variable_revision_number:3|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Hidden:2
    Hidden:3
    Hidden:8
    Select an image to measure:GFP
    Select an image to measure:RFP
    Select objects to measure:Nuclei
    Select objects to measure:Cells
    Select objects to measure:Nucleoli
    Texture scale to measure:1
    Angles to measure:Horizontal
    Texture scale to measure:2
    Angles to measure:Horizontal
    Texture scale to measure:3
    Angles to measure:Horizontal
    Texture scale to measure:4
    Angles to measure:Horizontal
    Texture scale to measure:5
    Angles to measure:Horizontal
    Texture scale to measure:6
    Angles to measure:Horizontal
    Texture scale to measure:7
    Angles to measure:Horizontal
    Texture scale to measure:8
    Angles to measure:Horizontal
    Measure Gabor features?:Yes
    Number of angles to compute for Gabor:4

OverlayOutlines:[module_num:13|svn_version:\'11025\'|variable_revision_number:2|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Display outlines on a blank image?:No
    Select image on which to display outlines:RescaledRFP
    Name the output image:Cell&NucleiOutlinesOnRescaledRFP
    Select outline display mode:Color
    Select method to determine brightness of outlines:Max of image
    Width of outlines:1
    Select outlines to display:CellOutlines
    Select outline color:White
    Select outlines to display:NucleiOutlines
    Select outline color:Green

OverlayOutlines:[module_num:14|svn_version:\'11025\'|variable_revision_number:2|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Display outlines on a blank image?:No
    Select image on which to display outlines:RescaledGFP
    Name the output image:NucleoliOutlinesOnRescaledGFP
    Select outline display mode:Color
    Select method to determine brightness of outlines:Max of image
    Width of outlines:1
    Select outlines to display:NucleoliOutlines
    Select outline color:Red

SaveImages:[module_num:15|svn_version:\'11340\'|variable_revision_number:7|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Select the type of image to save:Image
    Select the image to save:Cell&NucleiOutlinesOnRescaledRFP
    Select the objects to save:None
    Select the module display window to save:None
    Select method for constructing file names:Sequential numbers
    Select image name for file prefix:None
    Enter single file name:\\\\g<WellID>_CellsNuclei_
    Do you want to add a suffix to the image file name?:No
    Text to append to the image name:
    Select file format to use:tiff
    Output file location:Elsewhere...\x7C/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/Batch1/KF_pipeline_test_output
    Image bit depth:8
    Overwrite existing files without warning?:No
    Select how often to save:Every cycle
    Rescale the images? :No
    Save as grayscale or color image?:Grayscale
    Select colormap:gray
    Store file and path information to the saved image?:Yes
    Create subfolders in the output folder?:No

SaveImages:[module_num:16|svn_version:\'11340\'|variable_revision_number:7|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Select the type of image to save:Image
    Select the image to save:NucleoliOutlinesOnRescaledGFP
    Select the objects to save:None
    Select the module display window to save:None
    Select method for constructing file names:Sequential numbers
    Select image name for file prefix:RFP
    Enter single file name:\\\\g<WellID>_Nucleoli_
    Do you want to add a suffix to the image file name?:Yes
    Text to append to the image name:_GreenFiltNucRings
    Select file format to use:tiff
    Output file location:Elsewhere...\x7C/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/Batch1/KF_pipeline_test_output
    Image bit depth:8
    Overwrite existing files without warning?:No
    Select how often to save:Every cycle
    Rescale the images? :No
    Save as grayscale or color image?:Grayscale
    Select colormap:gray
    Store file and path information to the saved image?:Yes
    Create subfolders in the output folder?:No

ExportToDatabase:[module_num:17|svn_version:\'11418\'|variable_revision_number:22|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Database type:MySQL / CSV
    Database name:Nop10DB
    Add a prefix to table names?:Yes
    Table prefix:Expt_NOP10
    SQL file prefix:SQL_
    Output file location:Elsewhere...\x7C/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/Batch1/KF_pipeline_test_output
    Create a CellProfiler Analyst properties file?:Yes
    Database host:
    Username:
    Password:
    Name the SQLite database file:DefaultDB.db
    Calculate the per-image mean values of object measurements?:Yes
    Calculate the per-image median values of object measurements?:Yes
    Calculate the per-image standard deviation values of object measurements?:Yes
    Calculate the per-well mean values of object measurements?:No
    Calculate the per-well median values of object measurements?:No
    Calculate the per-well standard deviation values of object measurements?:No
    Export measurements for all objects to the database?:All
    Select the objects:
    Maximum # of characters in a column name:64
    Create one table per object or a single object table?:Single object table
    Enter an image url prepend if you plan to access your files via http:
    Write image thumbnails directly to the database?:No
    Select the images you want to save thumbnails of:
    Auto-scale thumbnail pixel intensities?:Yes
    Select the plate type:None
    Select the plate metadata:None
    Select the well metadata:None
    Include information for all images, using default values?:Yes
    Properties image group count:1
    Properties group field count:1
    Properties filter field count:0
    Workspace measurement count:1
    Experiment name:Expt
    Which objects should be used for locations?:Nucleoli
    Select an image to include:None
    Use the image name for the display?:Yes
    Image name:None
    Channel color:gray
    Do you want to add group fields?:No
    Enter the name of the group:
    Enter the per-image columns which define the group, separated by commas:ImageNumber, Image_Metadata_Plate, Image_Metadata_Well
    Do you want to add filter fields?:No
    Automatically create a filter for each plate?:No
    Create a CellProfiler Analyst workspace file?:No
    Select the measurement display tool:ScatterPlot
    Type of measurement to plot on the x-axis:Image
    Enter the object name:Image
    Select the x-axis measurement:
    Select the x-axis index:ImageNumber
    Type of measurement to plot on the y-axis:Image
    Enter the object name:Image
    Select the y-axis measurement:
    Select the y-axis index:ImageNumber

CreateBatchFiles:[module_num:18|svn_version:\'11372\'|variable_revision_number:5|show_window:True|notes:\x5B\x5D|batch_state:array(\x5B\x5D, dtype=uint8)]
    Store batch files in default output folder?:Yes
    Output folder path:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/Batch1/KF_pipeline_test_output
    Are the cluster computers running Windows?:No
    Hidden\x3A in batch mode:Yes
    Hidden\x3A in distributed mode:No
    Hidden\x3A default input folder at time of save:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset
    Hidden\x3A SVN revision number:11418
    Local root path:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset
    Cluster root path:/home/morphology/Morphology/Images/Screens_Other/10_NOP10GH_HTA2mCN_can1RPL39prtdTU/CPClusterTest/CP2_Test/FlexFiles_Meas01_subset

Hi Lee,

Thanks, this works now for me with measurements.py version 11435.

Lee.

Hi again folks,

After trying this within my job submission script, there are a few problems that I had not anticipated. Importing all of measurements.py is a lot of work, as it has many large dependencies. I just want to interrogate the h5 batch file to find the number of image sets in my batch so that I may distribute the load over the cluster properly. For example, numpy is imported (twice, actually) which seems excessive. When I run something like

in a terminal over an ssh connection with X forwarding, the process attempts to open an X window (probably as a result of importing certain CP modules), and it takes quite a while to complete. I notice that get_image_numbers just queries an hdf5_dict object:

image_numbers = np.array( self.hdf5_dict.get_indices(IMAGE, IMAGE_NUMBER), int)
but even the hdf5_dict class is tightly coupled to numpy. I think it may be easier and faster for my purposes to get the results directly from the h5py.File object, even if this approach is more brittle and less elegant.
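
For the load-distribution step itself, nothing beyond the standard library is needed once the count is known. A minimal sketch, assuming image sets are numbered from 1 and that each job takes an inclusive first/last pair; `batch_ranges` is a hypothetical helper name, not part of CellProfiler:

```python
def batch_ranges(num_sets, batch_size, first=1):
    """Return inclusive (start, end) pairs covering image sets first..num_sets."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        # Clamp the end so the final batch does not run past the last image set.
        (start, min(start + batch_size - 1, num_sets))
        for start in range(first, num_sets + 1, batch_size)
    ]
```

With 64 image sets and a batch size of 10 this gives seven ranges, the last one being (61, 64).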

Thanks for the support all the same. I really appreciate all your help!

Lee.

I think you’re right about using h5py directly instead of trying to use CellProfiler’s code. It’s very likely that importing Measurements and possibly even hdf5_dict would impose too many dependencies on your application. Here’s some code to do that (if you use the latest measurements format):

>>> import h5py
>>> f = h5py.File(r"data.h5", "r")
>>> measurements_group = f["Measurements"]
>>> # Group names are timestamps, so the last one after sorting is the most recent
>>> last_measurements_group = sorted(measurements_group.keys())[-1]
>>> last_measurements_group
'2011-08-22-17-08-18'
>>> ds = measurements_group[last_measurements_group]["Image"]["ImageNumber"]["data"]
>>> ds[:]  # ds.value on h5py releases before 3.0
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
       35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51,
       52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64], dtype=int64)

Caveat emptor: the format of the hdf5 file could change but we think that it is set. Eventually, we may store more than one set of measurements in a file. For now, they are named by date (‘2011-08-22-17-08-18’ for this one), but the user will be able to name them explicitly at some point.
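
For use inside a submission script, the same lookup can be wrapped in a small function. This is only a sketch under the layout shown in the session above (`Measurements/<timestamp>/Image/ImageNumber/data`); the names `latest_measurements_name` and `count_image_sets` are made up for illustration, and picking the newest group by sorting only works while the groups are named by zero-padded timestamps:

```python
def latest_measurements_name(names):
    """Pick the most recent group name; zero-padded timestamps sort chronologically."""
    names = sorted(names)
    if not names:
        raise ValueError("no measurement groups found")
    return names[-1]


def count_image_sets(path):
    """Return the number of image sets recorded in an h5 batch file."""
    import h5py  # imported here so the helper above stays dependency-free

    with h5py.File(path, "r") as f:
        measurements = f["Measurements"]
        latest = latest_measurements_name(measurements.keys())
        # Assumed layout: one entry per image set in Image/ImageNumber/data.
        ds = measurements[latest]["Image"]["ImageNumber"]["data"]
        # The dataset shape is enough; the values themselves are never loaded.
        return ds.shape[0]
```

Reading only the shape avoids pulling the image numbers into memory, which is all a job-splitting script should need.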

–Lee

Thanks Lee, this should do it I think.
