Metadata and image names

Hi,

Good Afternoon.

I am having problems with the Metadata module in my pipeline. The regular expression that I used to extract metadata from the file name was incorrect. My image names are purely numeric ( 002002- 1-001001001.tif), so I don't know how to write an expression that matches them.

My research group works with high-throughput screening, so we have to analyze a large volume of images. The images were imported directly from the Columbus system with numeric names.

My goal is to create a properties file to transfer the analyses to CPA. Please find the pipeline and a sample csv file here.

Pipeline_tcruzi_exp29_p1_exem.cpproj (120.3 KB)

Looking forward to interesting solutions.

Thanks a lot,

Isabela.

Hi @Isabela_Possebom,

We’re happy to help!

Your regular expression didn’t match because the separator between numbers was an underscore _ in the regular expression but a hyphen - in your filename. You also had more categories than the three that are present in your filenames. Here’s an expression that matches:

^(?P<Plate>[0-9]*)-(?P<Channel>[0-9]*)-(?P<Row>[0-9]*).tif

It allows any number of repeats (designated by the *) of the digits 0-9. I named your three different options Plate, Channel, and Row, but you can adjust that as needed.
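If you want to check an expression like this outside CellProfiler, one option is to try it in Python, whose `re` module accepts the same `(?P<name>...)` named-group syntax. This is only a sketch under a few assumptions: the file name is the sample from the first post written without the space after the first hyphen, the dot before `tif` is escaped so it matches only a literal dot, and the underscore pattern is a hypothetical stand-in for the earlier attempt, not the exact expression from the pipeline.

```python
import re

# The expression above, with the dot before "tif" escaped (an optional tightening).
hyphen_pattern = re.compile(
    r"^(?P<Plate>[0-9]*)-(?P<Channel>[0-9]*)-(?P<Row>[0-9]*)\.tif"
)

# Assumed file name: the sample from the first post, without the space.
filename = "002002-1-001001001.tif"

match = hyphen_pattern.match(filename)
metadata = match.groupdict() if match else None
print(metadata)  # {'Plate': '002002', 'Channel': '1', 'Row': '001001001'}

# Hypothetical stand-in for an expression that uses "_" as the separator:
# it cannot match a file name whose separator is "-".
underscore_pattern = re.compile(
    r"^(?P<Plate>[0-9]*)_(?P<Channel>[0-9]*)_(?P<Row>[0-9]*)\.tif"
)
underscore_match = underscore_pattern.match(filename)
print(underscore_match)  # None
```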

Hope this helps!
Pearl

Hi @pearl-ryder,

Thank you so much. I changed the regular expression, but I ran into other problems.

I have been trying to open the Plate Viewer in CPA with my properties file, but an error occurs and this message appears:

An error occurred in the program:
TypeError: ‘NoneType’ object has no attribute ‘__getitem__’

Traceback (most recent call last):
File “CellProfiler-Analyst.py”, line 267, in launch_plate_map_browser
File “cpa\plateviewer.pyc”, line 176, in __init__
File “cpa\plateviewer.pyc”, line 372, in OnSelectMeasurement
File “cpa\plateviewer.pyc”, line 275, in UpdatePlateMaps
File “cpa\plateviewer.pyc”, line 711, in FormatPlateMapData
File “cpa\datamodel.pyc”, line 310, in get_well_position_from_name
File “cpa\datamodel.pyc”, line 290, in populate_plate_maps

In Pearl’s suggested string, you no longer have a piece of well metadata extracted; that’s fine if you’re not using the Plate Viewer tool, but definitely a problem if you do want to, since you no longer know for each image which well it came from.

If you want to fix your expression, the help for “Regular expression to extract from file name” is pretty detailed and I definitely recommend reading it carefully, but to summarize, for each thing you want to extract you need to set up 3 things in the regex:

  1. What that parameter should be called
  2. Which characters are allowed to be used
  3. How many characters are allowed to be used.

So if, say, 002002 was your plate name, I could capture it with a regex like (?P<Plate>[0-2]{6}), which says that 1) I have a thing I want to capture called Plate, 2) it has the digits 0-2 in it, and 3) it has 6 digits.

  • I could also change the second part from [0-2] to [012] (any character 0, 1, or 2) or [0-9] (any digit 0-9) or [0-9A-Za-z] (any digit or any capital letter or any lower case letter) or even just . (which means “any character including spaces and punctuation”)
  • I could change the third part from {6} to {5,6} (5 or 6 digits allowed), {4,8} (4, 5, 6, 7, or 8 digits allowed), or even just * (which means “as many characters as you want”, including none)
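To make the three parts concrete, here is a small sketch in Python (its `re` module uses the same named-group syntax). It assumes 002002 is the plate name, as in the example above, and tries each variation of the character set and the repeat count against it. Note that a range of repeats is written with a comma, as in `{4,8}`.

```python
import re

plate_name = "002002"

# Every pattern names the group "Plate"; only the allowed characters
# (second part) or the allowed number of characters (third part) change.
patterns = [
    r"(?P<Plate>[0-2]{6})",        # digits 0-2, exactly 6 of them
    r"(?P<Plate>[012]{6})",        # the same set, written out
    r"(?P<Plate>[0-9]{6})",        # any digit
    r"(?P<Plate>[0-9A-Za-z]{6})",  # any digit or letter
    r"(?P<Plate>.{6})",            # any character at all
    r"(?P<Plate>[0-9]{5,6})",      # 5 or 6 digits
    r"(?P<Plate>[0-9]{4,8})",      # between 4 and 8 digits
    r"(?P<Plate>[0-9]*)",          # any number of digits
]

# fullmatch requires the whole string to fit the pattern.
results = {p: re.fullmatch(p, plate_name).group("Plate") for p in patterns}
for pattern, captured in results.items():
    print(pattern, "->", captured)

# A name containing a character outside the allowed set is rejected:
# "3" is not in [0-2].
rejected = re.fullmatch(r"(?P<Plate>[0-2]{6})", "002003")
print(rejected)  # None
```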

If 002002 were instead your well name (Row 2, column 2), I could capture it as (?P<WellRow>[0-9]{3})(?P<WellColumn>[0-9]{3}) ; as the help describes, CellProfiler treats WellRow and WellColumn as “special”, so if you extract both of those in your regular expression you’ll get a magical extra piece of metadata, Well, which for that extraction string it will automatically translate to B02. Lots of people want to extract the well from file names that may or may not be all numbers, so we’ve added that to make it easier!
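As a rough illustration of that translation (not CellProfiler's actual code), the sketch below extracts WellRow and WellColumn with Python's `re` module and builds the well name by hand. The helper `well_name` is hypothetical; it assumes rows are numbered from 1 and map to the letters A, B, C, and so on, and that the column is zero-padded to two digits.

```python
import re
import string

well_pattern = re.compile(r"(?P<WellRow>[0-9]{3})(?P<WellColumn>[0-9]{3})")


def well_name(row: int, column: int) -> str:
    """Hypothetical helper: row 1 -> 'A', row 2 -> 'B', ...; column padded to 2 digits."""
    if not 1 <= row <= 26:
        raise ValueError(f"row {row} cannot be mapped to a single letter")
    return f"{string.ascii_uppercase[row - 1]}{column:02d}"


match = well_pattern.match("002002")
row = int(match.group("WellRow"))        # "002" -> 2
column = int(match.group("WellColumn"))  # "002" -> 2
well = well_name(row, column)
print(well)  # B02
```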


Good morning @bcimini,

Thank you so much for the answer, I understood. I changed the regular expression and added the two variables WellColumn and WellRow.

But when I opened the Plate Viewer in CPA with the new properties file, another error occurred and this message appeared:

An error occurred in the program:
IndexError: index 16 is out of bounds for axis 0 with size 16

Traceback (most recent call last):
File “CellProfiler-Analyst.py”, line 267, in launch_plate_map_browser
File “cpa\plateviewer.pyc”, line 175, in __init__
File “cpa\plateviewer.pyc”, line 203, in AddPlateMap
File “cpa\platemappanel.pyc”, line 72, in __init__

Thanks,

Isabela.

Can you post your updated pipeline, and ideally the properties file and database too? There are a few different places where things might be going wrong, so having those files will help us narrow it down.

I figured out how to solve the problem. Thank you for now.

Very glad to hear it! If you don’t mind, it would be great to know what the issue was so that anyone who comes across the same error message in the future has an idea of where to start.

Good luck with your analysis!