Disappearing labels: an issue with training strategy or simply more training required?

Hi all,

I am experiencing issues with labels disappearing frequently while tracking mice from a top view with single-animal DLC 2.1.9 (tested with p_cutoff values of 0.8 and 0.6). In some frames only certain labels disappear, whereas in others the whole animal is unlabeled. This seems strange to me, as I have now reached a 1.86 px test error, which is more than enough for my needs. Additionally, when the animal is fully labeled, the network's predictions are accurate; stranger still, labels sometimes disappear between very similar frames - one is fully and correctly labeled, the other completely unlabeled.
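
For what it's worth, here is a minimal sketch of how I can check whether the "missing" body parts are actually predicted but simply fall below p_cutoff in the analysis output (the file name is just a placeholder for one of my output files):

```python
import pandas as pd

# Each analyzed video gets an .h5 file with a (scorer, bodyparts, coords) column MultiIndex
df = pd.read_hdf("videos/cage01DLC_resnet101_miceJan1shuffle1_200000.h5")

scorer = df.columns.get_level_values("scorer")[0]
likelihoods = df[scorer].xs("likelihood", axis=1, level="coords")

# Fraction of frames in which each body part falls below the plotting threshold
p_cutoff = 0.6
print((likelihoods < p_cutoff).mean().sort_values(ascending=False))
```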

Here is my training strategy so far (very briefly):

  1. We are tracking single mice in black-and-white videos recorded at 500x375 resolution and 2 fps. The mice are filmed in their cages, so the environment is fairly cluttered (straw, running wheel, food pellets, etc.). Some videos are also darker than others, reflecting the difference between night and day recordings, but the camera angle is always the same.
  2. I began by retraining ResNet-101, choosing 20 videos from different cages (10 light, 10 dark) and manually extracting 10 frames from each, for 200 initial training frames. I trained the first iteration on 3 shuffles (95/5 train-test split) for 200k iterations, using a batch size of 8 and the Adam optimizer. Additionally, I set intermediate_supervision: true and left intermediate_supervision_layer at its default of 12.
  3. For data augmentation I used imgaug with the following settings in the pose_cfg.yaml file (in all trainings), leaving the rest at their defaults:
     rotation: 180
     rotratio: 0.4
     scale_jitter_lo: 0.5
     scale_jitter_up: 1.25
     elastic_transform: true
     covering: true
     motion_blur: true
  4. For subsequent retrainings, I always picked the best network snapshot from one shuffle (the one with the lowest test error) once the training loss had plateaued, analyzed 10 new videos, and manually chose 10 frames from each, keeping the labeled poses as varied as possible. I then retrained from that snapshot on the expanded dataset (all training frames from previous iterations plus the 100 new frames per iteration) for 40-50k iterations (until the loss plateaued again) and repeated the process (a rough code sketch of one such round is below this list).
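
As referenced above, a rough sketch of one refinement round with the DLC API (the project and video paths are placeholders; before training I edit init_weights in the new shuffle's train/pose_cfg.yaml by hand so it points at the chosen snapshot, e.g. .../train/snapshot-200000, without a file extension):

```python
import deeplabcut

config = "/home/me/mouse-project/config.yaml"                         # placeholder path
new_videos = ["/data/videos/cage11.avi", "/data/videos/cage12.avi"]   # ... 10 new videos

deeplabcut.add_new_videos(config, new_videos, copy_videos=False)
deeplabcut.extract_frames(config, mode="manual")    # hand-pick ~10 frames per video
deeplabcut.label_frames(config)

# New training dataset = all previously labeled frames + the 100 new ones
deeplabcut.create_training_dataset(config, num_shuffles=1)

# (init_weights in the new shuffle's train/pose_cfg.yaml is pointed at the previous
#  best snapshot here, so training continues from it rather than from ImageNet weights)
deeplabcut.train_network(config, shuffle=1, maxiters=50000,
                         displayiters=500, saveiters=10000)
deeplabcut.evaluate_network(config, Shuffles=[1], plotting=False)
```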

I reached the current stage (1.86 px test error) after 3 training iterations and 500 training frames. Do the disappearing labels mean that the network is simply not robust enough yet and that I should continue the above process (despite the low test error), or is there something off with my training strategy?
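
To get a better feel for whether this is a robustness problem, I suppose one could also count, per analyzed video, how often the whole animal drops below the cutoff versus only some body parts - a minimal sketch (the glob pattern is a placeholder for my output folder):

```python
import glob
import pandas as pd

p_cutoff = 0.6
for h5 in sorted(glob.glob("videos/*DLC_resnet101*shuffle1_*.h5")):   # placeholder pattern
    df = pd.read_hdf(h5)
    scorer = df.columns.get_level_values("scorer")[0]
    below = df[scorer].xs("likelihood", axis=1, level="coords") < p_cutoff
    all_missing = below.all(axis=1).mean()                     # whole animal unlabeled
    partial = (below.any(axis=1) & ~below.all(axis=1)).mean()  # only some parts missing
    print(f"{h5}: whole animal missing in {all_missing:.1%} of frames, "
          f"partial dropouts in {partial:.1%}")
```

(If the dropouts turn out to cluster in the darker videos, that would at least tell me where the extra training frames should come from.)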

I would be really grateful for any insights!