I’m trying to understand exactly how the image augmentation parameters interact with the maximum and minimum input sizes when training a DLC model.
Specifically, what is the order in which the cropping, scaling, and max size check are conducted? I have been assuming that it goes:
- Scale frame using “global_scale”
- Crop around points using “minsize” and “(bottom/top)height” & “(left/right)width”
- Scale the cropped frame using “scale_jitter_(lo/up)” params
- Check to make sure the scaled frame size does not exceed “max_input_size” squared
However, I have been running into an issue where training freezes partway through with GPU usage maxed out. This makes me wonder whether a too-large frame has slipped through. Does the size check perhaps happen later in the sequence?
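
To make the order I’m assuming concrete, here is a rough Python sketch. It is not DLC’s actual code: `AugConfig`, `assumed_pipeline`, and the `crop_height`/`crop_width` fields are names I made up for illustration, the default values are placeholders, and it only tracks frame dimensions rather than pixels.

```python
import random
from dataclasses import dataclass


@dataclass
class AugConfig:
    # Hypothetical container; values are placeholders, not DLC defaults.
    global_scale: float = 0.8
    crop_height: int = 400      # stand-in for topheight + bottomheight
    crop_width: int = 400       # stand-in for leftwidth + rightwidth
    scale_jitter_lo: float = 0.5
    scale_jitter_hi: float = 1.25
    max_input_size: int = 1500


def assumed_pipeline(height, width, cfg, rng):
    """Return the final (height, width), or None if the frame would be skipped.

    This encodes the order I am ASSUMING, not a confirmed description of DLC.
    """
    # 1. Scale the whole frame by global_scale.
    h = round(height * cfg.global_scale)
    w = round(width * cfg.global_scale)

    # 2. Crop around the points; cropping can only shrink the frame.
    h = min(h, cfg.crop_height)
    w = min(w, cfg.crop_width)

    # 3. Apply a random jitter scale drawn from [lo, hi].
    s = rng.uniform(cfg.scale_jitter_lo, cfg.scale_jitter_hi)
    h = round(h * s)
    w = round(w * s)

    # 4. Reject the frame if its area exceeds max_input_size squared.
    if h * w > cfg.max_input_size ** 2:
        return None
    return h, w


if __name__ == "__main__":
    rng = random.Random(0)
    print(assumed_pipeline(1080, 1920, AugConfig(), rng))
```

If the real order matches this sketch, the size check is the last step before the frame reaches the network, so an oversized frame should never get through. If the check instead runs before the jitter scaling, a frame could pass the check and then be enlarged afterwards.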