DLC settings for maximum accuracy


As already discussed last December with @MWMathis and @AlexanderMathis, I am using DLC for human pose estimation (golf swing). The aim is to achieve maximum accuracy in detecting the person's joints, so I am looking for the best settings in config.yaml and pose_cfg.yaml. Currently I use the following:


  • TrainingFraction: 0.8 as in the cheetah project
  • batch_size: 1


  • global_scale: 1
  • init_weights: mpii-single-resnet-101
  • intermediate_supervision: true
  • intermediate_supervision_layer: 12 (might a different layer number give more accurate results?)
  • mirror: true

Does cluster_color=False/True in deeplabcut.extract_frames() have an impact on accuracy? Maybe the clustering gets more information with cluster_color=True?

Is there something else to consider when trying to achieve maximum accuracy?

config_MaxAcc.txt (1.6 KB)
pose_cfg_MaxAcc.txt (1.8 KB)

The network parameters that will change performance the most are pos_dist_thresh (default is 17), global_scale (default is 0.8), the ResNet depth (50 or 101), and crop = True with cropratio = 0.4. The cropping is a brand new augmentation step that works very well (see panel b here); it will be better documented when the paper comes out (in press now!). You could increase cropratio if you want, and keep crop as True (which is the default). If your images are large, be sure to change the max input size to account for scaling: the scale jitter goes from 0.5 to 1.25 by default, so if your images are 1000 pixels, your max input size needs to be at least 1250.
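To make the scaling arithmetic concrete, here is a minimal sketch in plain Python. It assumes the relevant pose_cfg.yaml keys are named scale_jitter_up and max_input_size, and it follows the rule of thumb above (longest image side times the upper scale jitter), ignoring any additional effect of global_scale. The helper names are hypothetical and not part of DeepLabCut.

```python
import math

# Upper bound of the default scale jitter quoted above (0.5 to 1.25).
DEFAULT_SCALE_JITTER_UP = 1.25


def required_max_input_size(longest_side_px, scale_jitter_up=DEFAULT_SCALE_JITTER_UP):
    """Smallest max input size that still fits an image after upscaling."""
    return math.ceil(longest_side_px * scale_jitter_up)


def with_safe_max_input_size(pose_cfg, longest_side_px):
    """Return a copy of a pose_cfg-like dict whose max_input_size is large enough.

    Key names are assumptions about pose_cfg.yaml; an existing larger value is kept.
    """
    updated = dict(pose_cfg)
    jitter_up = updated.get("scale_jitter_up", DEFAULT_SCALE_JITTER_UP)
    needed = required_max_input_size(longest_side_px, jitter_up)
    updated["max_input_size"] = max(updated.get("max_input_size", 0), needed)
    return updated


# Hypothetical starting values for images whose longest side is 1000 pixels.
pose_cfg = {"scale_jitter_lo": 0.5, "scale_jitter_up": 1.25, "max_input_size": 1000}
print(with_safe_max_input_size(pose_cfg, 1000)["max_input_size"])  # 1250
```

The resulting value would then be written into pose_cfg.yaml by hand before training.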

But the biggest factor will be really good, error-free labels from a diversity of settings (like the cheetah example)! :slight_smile:

Re: your question: for extracting frames, cluster_color=True also uses color to cluster, so if you have diverse colors, set it to True to cluster the frames using this feature before extraction.
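As a rough sketch of how that could look in a script: the helper names below are hypothetical, the config path is a placeholder, and the keyword arguments are the ones I understand deeplabcut.extract_frames() to accept (mode, algo, cluster_color, userfeedback), so please check them against your installed version. The import is done lazily so the snippet can be read and run without DeepLabCut installed.

```python
def kmeans_extraction_kwargs(diverse_colors):
    """Keyword arguments for k-means frame extraction.

    cluster_color is switched on only when the videos contain diverse colors,
    as suggested in the reply above.
    """
    return {
        "mode": "automatic",
        "algo": "kmeans",
        "cluster_color": bool(diverse_colors),
        "userfeedback": False,  # assumption: skip the per-video confirmation prompt
    }


def extract_with_kmeans(config_path, diverse_colors):
    """Call DeepLabCut's frame extraction with the settings built above."""
    import deeplabcut  # imported lazily; only needed when extraction actually runs

    deeplabcut.extract_frames(config_path, **kmeans_extraction_kwargs(diverse_colors))


# Inspect the arguments without touching any video files.
print(kmeans_extraction_kwargs(diverse_colors=True))
# extract_with_kmeans("/path/to/project/config.yaml", diverse_colors=True)  # placeholder path
```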


Thanks so much for answering @MWMathis and looking forward to the details of the brand new augmentation steps :slight_smile:

The info is currently in the code, just not in the readme docs: https://github.com/AlexEMG/DeepLabCut/blob/3b10ea5bbb4cdba6ee6c0cc2481f4058f71fe5da/deeplabcut/pose_cfg.yaml#L35 Here you go! The quantification is above (and as you saw with the mice, it’s slightly better even on less challenging things than cheetahs, but it really helps the challenging applications).


When using extract_frames() with the kmeans option, DLC prints “Extracting and downsampling…”. DLC only downsamples the image for the kmeans selection and keeps the full resolution of the extracted frame, right?

Correct, but of course you can compare the pixel size of the extracted frame with that of your video ;). They are the same, unless you use cropping.
