Hi @MWMathis & @AlexanderMathis,
I created a large dataset for my golf project with careful, error-free labels, as you suggested above. My brother also helped annotate it, so I have two sets of annotations of the same images, as described in your Neuroscience paper. I computed the human variability (RMSE of my annotations vs. my brother's) and it is around 4 pixels, since it is not easy to define exactly where joints are located under loose clothing. I then trained one model with my annotations only and another with both my brother's and my annotations. To compare them, I trained multiple shuffles with different parameter settings, and the test error of the model trained on my annotations alone was always lower. So while it makes sense to use multiple annotators to reduce human variability, in practice it decreased the model's performance. As I want to work in a scientifically sound way, my question to you: would it be sufficient to train the model with my annotations only (as I am the "expert" in golf and in detecting joints under loose clothing ;)) and evaluate on a test set annotated by ~5 different people (to capture human variability)?
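For reference, this is a minimal sketch of how I compute the inter-annotator RMSE (the helper name and the matched `(x, y)`-per-row layout are my own convention, not DLC's):

```python
import numpy as np

def annotator_rmse(coords_a, coords_b):
    """RMSE between two annotators' (x, y) keypoint coordinates.

    coords_a, coords_b: arrays of shape (n_labels, 2), matched row by row
    (same image and same bodypart in each row).
    """
    coords_a = np.asarray(coords_a, dtype=float)
    coords_b = np.asarray(coords_b, dtype=float)
    # Euclidean distance per keypoint, then root-mean-square over all labels
    dists = np.linalg.norm(coords_a - coords_b, axis=1)
    return float(np.sqrt(np.mean(dists ** 2)))

# toy example: two annotators disagreeing by a few pixels per joint
a = [[10, 10], [50, 40]]
b = [[13, 14], [50, 43]]
print(round(annotator_rmse(a, b), 2))  # → 4.12
```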
Additionally, I split the train and test sets by person, so that the same person never appears in both. Of course, the test error would decrease if I included the same person in both the train and test sets. What is the scientifically correct way of splitting the dataset in your eyes?
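The person-level split I use looks roughly like this (a sketch with pure Python; the function name and the person-ID list are my own, and one could equally use scikit-learn's `GroupShuffleSplit` with the person as the group):

```python
import random

def split_by_person(person_ids, test_fraction=0.25, seed=0):
    """Split frame indices into train/test so no person appears in both.

    person_ids: list mapping each frame index to a person identifier.
    Returns (train_idx, test_idx) as lists of frame indices.
    """
    rng = random.Random(seed)
    people = sorted(set(person_ids))
    rng.shuffle(people)
    # hold out a fraction of the *people*, not of the frames
    n_test = max(1, round(test_fraction * len(people)))
    test_people = set(people[:n_test])
    train_idx = [i for i, p in enumerate(person_ids) if p not in test_people]
    test_idx = [i for i, p in enumerate(person_ids) if p in test_people]
    return train_idx, test_idx

# usage: 6 frames of 3 golfers; each golfer lands wholly in train or test
ids = ["anna", "anna", "ben", "ben", "carl", "carl"]
train, test = split_by_person(ids, test_fraction=1 / 3)
overlap = set(ids[i] for i in train) & set(ids[i] for i in test)
print(sorted(overlap))  # → []
```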
By the way, the pretrained ResNet-152 performs much worse than the ResNet-101… I wonder why, but I will stick with 101.
I am looking forward to showing you the work I have done with DLC.
Thank you so much!