Problems with regular expression

Hi,

First, apologies if this is covered elsewhere. I have not been able to find a fix for my issue in the other threads.

I have acquired a set of images in a 384 well format, 3 images per well, 1 channel (DAPI) on a Nikon microscope.

The files are exported from nd2 files to tiffs in the following format:
Count00000_WellC03_ChannelHQ2 DAPI_Seq0000xy1.tif
Count00000_WellC03_ChannelHQ2 DAPI_Seq0000xy2.tif
Count00000_WellC03_ChannelHQ2 DAPI_Seq0000xy3.tif
Count00000_WellC04_ChannelHQ2 DAPI_Seq0000xy1.tif
Count00000_WellC04_ChannelHQ2 DAPI_Seq0000xy2.tif
Count00000_WellC04_ChannelHQ2 DAPI_Seq0000xy3.tif

CellProfiler cannot guess the regular expression. Following other posts, I have been using regex101.com to test expressions, but I still cannot get one to work. From the file name I would like to extract the well, the channel and the position. The channel will always be ChannelHQ2 followed by DAPI, GFP, RFP, etc.

.*(?P#Well#Well[A-P][0-9]{2})(?P#Channel#ChannelHQ2 DAPI)(?P#Site#Seq0000xy[1-3])
*please note that <> have been changed to ## around the names Well/Channel/Site because of how the forum formats the string

Any help would be greatly appreciated,
Jacob

Hi,
The underscores “_” are missing in your regex.
Try this:
.*(?P<Well>Well[A-P][0-9]{2})_(?P<Channel>ChannelHQ2 .*)_(?P<Site>Seq0000xy[1-3])
Or something like this:
.*Well(?P<Well>[A-P][0-9]{2})_ChannelHQ2 (?P<Channel>.*)_Seq0000(?P<Site>xy[1-3])
Good luck
Fab
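
For anyone who wants to check these outside CellProfiler: its metadata regular expressions use Python syntax, so the two suggestions above can be tried with the standard re module. This is only a minimal sketch using one of the file names from the first post; the variable names are made up for illustration, and the third pattern is the original expression with <> restored and no underscores, as it appeared in the post.

```python
import re

# One of the file names from the first post.
name = "Count00000_WellC03_ChannelHQ2 DAPI_Seq0000xy1.tif"

# The two expressions suggested above (variable names are illustrative only).
FIRST = r".*(?P<Well>Well[A-P][0-9]{2})_(?P<Channel>ChannelHQ2 .*)_(?P<Site>Seq0000xy[1-3])"
SECOND = r".*Well(?P<Well>[A-P][0-9]{2})_ChannelHQ2 (?P<Channel>.*)_Seq0000(?P<Site>xy[1-3])"

# The expression as posted, with <> restored but without underscores.
NO_UNDERSCORES = r".*(?P<Well>Well[A-P][0-9]{2})(?P<Channel>ChannelHQ2 DAPI)(?P<Site>Seq0000xy[1-3])"

# groupdict() returns the named groups as a dictionary.
first_fields = re.search(FIRST, name).groupdict()
second_fields = re.search(SECOND, name).groupdict()

# Without the "_" separators nothing can follow "WellC03", so there is no match.
no_underscore_match = re.search(NO_UNDERSCORES, name)

print(first_fields)
print(second_fields)
print(no_underscore_match)
```

The first expression keeps the "Well", "ChannelHQ2 " and "Seq0000" prefixes inside the captured values, while the second captures only the variable part (C03, DAPI, xy1). Which one is more convenient depends on how the metadata is used later in the pipeline.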

Hi fabienk,

Thanks for the help. In my expression I did have the underscores, but they were also formatted out in the post and I didn’t notice (how did you insert yours without it changing the format?).

The first one seems to have done the trick, although apart from the ChannelHQ2 .* I am not sure what I had done differently. I must have had a bracket in backwards or something.

Thanks again,
Jacob
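
On the ChannelHQ2 .* difference: a channel group written as ChannelHQ2 DAPI only matches DAPI files, whereas ChannelHQ2 .* accepts whatever follows the prefix. A rough sketch in Python, assuming the underscores are in place in both versions. The GFP file name here is hypothetical, made up by following the naming scheme in the first post.

```python
import re

# Channel hard-coded to DAPI, underscores included.
HARD_CODED = r".*(?P<Well>Well[A-P][0-9]{2})_(?P<Channel>ChannelHQ2 DAPI)_(?P<Site>Seq0000xy[1-3])"

# Channel left open with ".*", as in the first suggested expression.
WILDCARD = r".*(?P<Well>Well[A-P][0-9]{2})_(?P<Channel>ChannelHQ2 .*)_(?P<Site>Seq0000xy[1-3])"

# Hypothetical GFP file name (not from the original data set).
gfp_name = "Count00000_WellC03_ChannelHQ2 GFP_Seq0000xy2.tif"

hard_coded_match = re.search(HARD_CODED, gfp_name)
wildcard_match = re.search(WILDCARD, gfp_name)

print(hard_coded_match)                 # no match for a GFP file
print(wildcard_match.group("Channel"))  # channel text captured by the wildcard
```

So if the plate only ever contains DAPI images, either form should work once the underscores are there; the wildcard only matters when GFP or RFP files are added.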