I have successfully trained a network on 5 videos that are all fairly similar in background, focus, etc., and got the train pixel error down to 3 pixels and the test pixel error down to 6. Each video had ~150 labeled frames and ~4,000 frames total. However, when I added a new video, still with the same background, focus, and basic behavior, the labels weren't very accurate. I went on to extract 50 outlier frames and label them, but when I got the results back, my p-cutoff had increased from 0.1 to 0.4, which seems like a red flag to me.
Do people have experience with training on many videos? Is there a point at which you have been able to just add new videos without retraining and still get accurate results? What could be the reason for the p-cutoff increasing so much?
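
To make the p-cutoff question concrete, here is a minimal sketch of how a likelihood threshold can change a reported pixel error. It assumes the p-cutoff acts as a threshold below which predictions are left out of the error calculation; I am not certain this matches exactly what the toolbox does internally. The function name `mean_pixel_error` and all numbers are hypothetical, made up for illustration.

```python
import math


def mean_pixel_error(predicted, labeled, likelihoods, p_cutoff=0.0):
    """Mean Euclidean error over predictions with likelihood >= p_cutoff.

    Returns (mean_error, fraction_kept). mean_error is None when no
    prediction passes the cutoff. Hypothetical helper, not a library API.
    """
    errors = [
        math.dist(p, t)
        for p, t, lik in zip(predicted, labeled, likelihoods)
        if lik >= p_cutoff
    ]
    if not errors:
        return None, 0.0
    return sum(errors) / len(errors), len(errors) / len(likelihoods)


# Made-up (x, y) predictions, hand labels, and network likelihoods.
predicted = [(10.0, 10.0), (20.0, 24.0), (53.0, 44.0)]
labeled = [(10.0, 13.0), (20.0, 20.0), (50.0, 40.0)]
likelihoods = [0.95, 0.30, 0.05]

for cutoff in (0.1, 0.4):
    err, kept = mean_pixel_error(predicted, labeled, likelihoods, cutoff)
    print(f"p-cutoff={cutoff}: error={err} px, kept={kept:.0%} of points")
```

Under this assumption, a higher cutoff keeps fewer, more confident points, so the error computed with the cutoff can look better even while more predictions on the new video are being discarded. Comparing the fraction of points kept before and after adding the video may say more than the error alone.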