"Ghost" files taking storage space in OMERO

Dear OME team,

Something curious is happening to one of our users at EPFL:
He wanted to upload a huge file (New.czi) to our OMERO server. The first import crashed before the upload finished, but on his second attempt the file was successfully imported to OMERO. New.czi's size is 275GB, and according to OMERO.web and OMERO.insight, this user does not own many more files on his account (only an additional project which is about 200MB).

But when I check the user statistics in OMERO.web, it says that the total disk usage of this user is 554.09GB:

I’m afraid that maybe the first import attempt of New.czi worked, and that the data is stored somewhere, but we can’t manage to access this file: The orphaned images folder from this user is empty, and he does not own images in a different group than his default one. He does not have admin rights, so he is not able to upload images for different users than himself.

We tried the following command to search for the ghost New.czi file:

omero admin cleanse --dry-run /data/omero

We obtained many files in the output list, but none of them belong to the owner of the New.czi file.

Would you have suggestions to help us retrieve and delete this “ghost” file? (And even better, to prevent this scenario from happening again?)

Many thanks in advance,
Claire

Hi Claire,

Can you try running:

omero fs sets --without-images

and show us the output?

If these items do show up in the above, then that could pretty easily be added to admin cleanse to make cleaning up easier. Having the clients clean up filesets on a failed import would prevent the issue, but that would take more time to implement.

~Josh.

Hi Josh,

Thank you very much for your answer.

Here is the output of omero fs sets --without-images:

Created session for root@localhost:4064. Idle timeout: 10 min. Current group: system
#  | Id    | Prefix                                | Images | Files | Transfer
----+-------+---------------------------------------+--------+-------+----------
0  | 11834 | zanolett_409/2021-01/26/05-23-31.470/ | 0      | 1     |
1  | 11312 | zanolett_409/2021-01/25/09-35-18.534/ | 0      | 1     |
2  | 11170 | ilambert_602/2021-01/22/14-20-46.399/ | 0      | 361   |
3  | 11027 | eoerdoeg_412/2021-01/22/00-18-15.679/ | 0      | 1     |
4  | 10422 | cstoffel_354/2021-01/18/14-20-40.634/ | 0      | 2     |
5  | 10421 | cstoffel_354/2021-01/18/14-20-14.560/ | 0      | 2     |
6  | 10420 | cstoffel_354/2021-01/18/14-15-48.247/ | 0      | 2     |
7  | 10362 | cstoffel_354/2021-01/15/16-04-47.548/ | 0      | 2     |
8  | 10361 | cstoffel_354/2021-01/15/16-04-38.202/ | 0      | 2     |
9  | 10360 | cstoffel_354/2021-01/15/16-03-36.322/ | 0      | 2     |
10 | 10359 | cstoffel_354/2021-01/15/15-19-19.716/ | 0      | 2     |
11 | 10303 | cstoffel_354/2021-01/14/15-46-48.754/ | 0      | 2     |
12 | 10300 | cstoffel_354/2021-01/13/16-23-50.507/ | 0      | 2     |
13 | 10299 | cstoffel_354/2021-01/13/16-00-33.663/ | 0      | 2     |
14 | 10216 | cstoffel_354/2021-01/12/17-05-46.554/ | 0      | 2     |
15 | 10215 | cstoffel_354/2021-01/12/16-48-11.151/ | 0      | 2     |
16 | 10214 | cstoffel_354/2021-01/12/13-55-27.555/ | 0      | 2     |
17 | 10209 | cstoffel_354/2021-01/12/13-21-30.786/ | 0      | 2     |
18 | 10208 | cstoffel_354/2021-01/12/13-08-42.033/ | 0      | 2     |
19 | 10207 | cstoffel_354/2021-01/12/13-05-50.659/ | 0      | 2     |
20 | 10206 | cstoffel_354/2021-01/12/12-59-55.493/ | 0      | 2     |
21 | 10184 | cstoffel_354/2021-01/07/16-28-44.369/ | 0      | 2     |
22 | 9994  | cstoffel_354/2021-01/06/16-54-56.124/ | 0      | 289   |
23 | 9993  | cstoffel_354/2021-01/06/16-45-39.730/ | 0      | 289   |
24 | 9992  | cstoffel_354/2021-01/06/16-43-28.218/ | 0      | 289   |
(25 rows, starting at 0 of approx. 9175)
(venv3) omero@omero:~$

Unfortunately, I can only see the 25 latest entries, and the failed import of the New.czi file was done on December 13th. However, the output does seem to include my failed imports of bigdataviewer files (see this post).

How can I proceed to delete the files from this omero fs sets --without-images output?

Thank you very much,
Claire

You can dump the results like this, I believe:
omero fs sets --without-images --limit 10000 --style=csv > creepy_ghosts.csv

From this documentation it seems that the limit is set to 25 by default

Then you can copy that CSV file somewhere and explore it in more detail (or post it here)
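
To narrow the dump down to one user, something like the following might help. It is only a sketch: the function name `ghosts_for_user` is made up here, and it assumes the CSV columns come out as `#,Id,Prefix,Images,Files,Transfer` with a header on the first line, which is worth checking against your own file first.

```shell
# Sketch: list fileset Ids whose Prefix starts with a given user directory.
# Assumption: CSV columns are "#,Id,Prefix,Images,Files,Transfer",
# with the header on line 1 and no padding around the values.
ghosts_for_user() {
    csv_file=$1      # e.g. creepy_ghosts.csv
    user_prefix=$2   # e.g. cstoffel_354 (first path component of Prefix)
    awk -F',' -v user="$user_prefix" '
        # Skip the header; keep rows whose Prefix begins with "<user>/"
        FNR > 1 && index($3, user "/") == 1 { print $2 " " $3 }
    ' "$csv_file"
}
```

Calling `ghosts_for_user creepy_ghosts.csv cstoffel_354` would then print one `Id Prefix` pair per line, and the date in the prefix helps to spot the failed import you are looking for.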

@oburri is of course correct. The critical bit for getting rid of them is the Id column. You can use:

omero delete --dry-run --report Fileset:10422

to see what would be deleted from your latest failed import. Or if you are confident of what is being removed, drop the --dry-run part. :wink:

~Josh

Thank you @oburri , by increasing the limit I was able to retrieve the New.czi “ghost” file in the output of
omero fs sets --without-images --limit 10000 --style=csv

We then used omero delete --dry-run --report Fileset:9107 to check the corresponding fileset, deleted it by dropping --dry-run, and it worked, judging by the storage space now used by the user:

It would indeed be great if the listed files could be added to admin cleanse! Is there currently a way to delete every fileset listed in the omero fs sets --without-images output at once?

Claire

@stoffelc,

I’ve filed an issue: RFE: add `fs sets --without-images` to `admin cleanse` · Issue #277 · ome/omero-py · GitHub

The delete command will take multiple filesets, e.g.:

$ echo omero delete $(awk -F',' 'FNR>1 {print "Fileset:"$2}' /tmp/ghosts.csv)
omero delete Fileset:123 Fileset:456 ...
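
With several thousand rows, a single command line can get unwieldy, so splitting it into batches may be easier to review. A rough sketch, not a tested recipe: `print_delete_batches` and the batch size of 50 are made up for illustration, it assumes the Id is the second CSV column, and it assumes --dry-run and --report can be combined with multiple targets in the same way as with a single one. It only prints the commands; nothing is deleted.

```shell
# Sketch: print (not run) batched dry-run delete commands.
# Assumption: the fileset Id is the second CSV column and line 1 is a header.
print_delete_batches() {
    csv_file=$1
    batch_size=${2:-50}   # hypothetical default; tune to taste
    # Turn each Id into a "Fileset:<Id>" target, skipping rows with no Id
    awk -F',' 'FNR > 1 && $2 != "" { print "Fileset:" $2 }' "$csv_file" |
        # xargs repeats the leading arguments for every batch of targets
        xargs -n "$batch_size" echo omero delete --dry-run --report
}
```

Once the printed commands look right, you could run them yourself, dropping --dry-run only when you are confident of what is being removed.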

~Josh