Training in low-resolution > inference high-resolution (bad pose estimations!?)

Hi everyone,

I’ve trained my network on low-resolution (640x360) videos. When I use it for pose estimation on files of the same resolution, it does well.
However, when I try to infer poses in 1280x720 videos, the results are a mess…
Does anyone have an idea how to resolve this issue?

deeplabcut 2.1.3 (Linux, Ubuntu 16.04)
crop = True
tried with and without dynamic cropping (similar results)



Dear Rodrigo,

That makes sense! Ideally, you train the network on frames of a similar size to the ones you want to analyze.

If you need broad generalization, you can change the scale parameters used during augmentation, e.g. scale_jitter_up: 2.5 (see Box 2, "Parameters of interest in the network configuration file, pose_cfg.yaml", in


Thanks a lot!
I’ll then try playing with scale_jitter_up and see.
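
For anyone wondering why a value like scale_jitter_up: 2.5 could help here, below is a rough back-of-the-envelope sketch, not DeepLabCut code. It assumes that training frames are rescaled by global_scale times a factor drawn between scale_jitter_lo and scale_jitter_up, and that videos are analyzed at their native size. The values 0.8, 0.5 and 1.25 are only what I believe a typical pose_cfg.yaml contains, so please check your own file. The helper functions are hypothetical names made up for this illustration.

```python
def training_scale_range(global_scale, scale_jitter_lo, scale_jitter_up):
    """Range of scales seen in training, under the assumed model
    (global_scale multiplied by a jitter factor between lo and up)."""
    return global_scale * scale_jitter_lo, global_scale * scale_jitter_up


def relative_scale(train_size, infer_size):
    """Per-axis size ratio (width, height) of inference frames to training frames."""
    return infer_size[0] / train_size[0], infer_size[1] / train_size[1]


def is_covered(scale, scale_range, tol=1e-9):
    """True if `scale` lies inside the range (small tolerance for float rounding)."""
    lo, up = scale_range
    return lo - tol <= scale <= up + tol


# Resolutions from this thread: trained on 640x360, analyzing 1280x720.
ratio_w, ratio_h = relative_scale((640, 360), (1280, 720))

# Assumed values; read the real ones from your own pose_cfg.yaml.
assumed_default = training_scale_range(0.8, 0.5, 1.25)
with_suggestion = training_scale_range(0.8, 0.5, 2.5)

print("relative scale of the new videos:", ratio_w, ratio_h)
print("covered with assumed defaults:", is_covered(ratio_w, assumed_default))
print("covered with scale_jitter_up = 2.5:", is_covered(ratio_w, with_suggestion))
```

Under these assumptions the 1280x720 videos show the animal at twice the size the network saw during training, which falls outside the assumed default range and only just inside the range with scale_jitter_up: 2.5. The network has to be retrained after changing the parameter for it to have any effect.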


