Internal Error while attempting to train network (input shape problem)

Hello,

I am using DLC 2.2b7, have a newly labelled dataset, and am ready to train. I have also been using the GUI for the first time, which has made things much simpler, but I have hit an error that I have never encountered before when using DLC for pose estimation:

InternalError (see above for traceback): cuDNN launch failure : input shape([1,3,257,469]) filter shape([3,3,3,32])
[[{{node MobilenetV2/Conv/Conv2D}} = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 2, 2], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](gradients/MobilenetV2/Conv/Conv2D_grad/Conv2DBackpropFilter-0-TransposeNHWCToNCHW-LayoutOptimizer, MobilenetV2/Conv/weights/read)]]
[[{{node add/_605}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1843_add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
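
As a reading aid for the message above, here is a minimal sketch that pulls the two shapes out of the error text and names their dimensions. It assumes the input follows the NCHW layout given by data_format in the trace, and that the filter follows TensorFlow's usual [height, width, in_channels, out_channels] order. The helper name parse_cudnn_shapes is made up for this example and is not part of DLC or TensorFlow.

```python
import math
import re

ERROR_LINE = (
    "InternalError (see above for traceback): cuDNN launch failure : "
    "input shape([1,3,257,469]) filter shape([3,3,3,32])"
)


def parse_cudnn_shapes(message):
    """Return (input_shape, filter_shape) as lists of ints (hypothetical helper)."""
    pattern = r"input shape\(\[([\d,\s]+)\]\) filter shape\(\[([\d,\s]+)\]\)"
    match = re.search(pattern, message)
    if match is None:
        raise ValueError("no cuDNN shape information found in message")
    return tuple([int(n) for n in group.split(",")] for group in match.groups())


input_shape, filter_shape = parse_cudnn_shapes(ERROR_LINE)
batch, channels, height, width = input_shape  # NCHW, per data_format in the trace
k_h, k_w, in_ch, out_ch = filter_shape        # assumed height, width, in, out order

# With padding="SAME" and a spatial stride of 2 (both taken from the trace),
# each spatial dimension is divided by the stride and rounded up.
out_height = math.ceil(height / 2)
out_width = math.ceil(width / 2)

print(f"input: batch={batch}, channels={channels}, {height}x{width} pixels")
print(f"filter: {k_h}x{k_w}, {in_ch} -> {out_ch} channels")
print(f"first conv output: {out_height}x{out_width}")
```

Read this way, the failing call was the first MobileNetV2 convolution applied to a single 3-channel image of 257 by 469 pixels.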

From others with similar error messages, I have seen that the config path name can cause this error. However, my path contains no special characters and, as far as I understand, is not too long: “C:/Users/jquin/Desktop/q_ladder_DLC-Quinn-2020-09-01”.
Can someone help me understand this issue and find a possible solution? @jeylau @MWMathis
Thank you!

hi @Quinn - glad you like the GUI! Did you run cropimagesandlabels before creating a training set? Sometimes this error pops up when the images are too big or the GPU memory is not enough for the settings we now use (hence the cropimagesandlabels step).

To crop and label before creating a training set, do I just need to set the cropping parameters to true in the config file? I tried this and still ran into the same error when I generated the training dataset. This is what I have in the config file:

Cropping Parameters (for analysis and outlier frame detection)

cropping: true
croppedtraining: true

Ah, now I see that I have to crop the frames back at the frame extraction step, which I suppose means I have to relabel the new cropped frames. Is that correct?

I chose to crop the frames and redid all of the labels, but the same error message persists. Is there somewhere else that I need to indicate cropping during the training itself?

no no, there is a function that you just run before you create the training dataset: https://github.com/DeepLabCut/DeepLabCut/blob/a3ff917fc04ae0199b051b1d0cc2f75c3700191a/docs/UseOverviewGuide.md#create-training-dataset

^ the cropping parameters in the config file are only for video analysis after you have a trained network

in general, searching within GitHub turns up a lot within the repo as well! https://github.com/DeepLabCut/DeepLabCut/search?q=cuDNN+launch+failure&type=Issues

^ perhaps these help. It is likely related to memory size (thus you should crop) or to incorrect TensorFlow/GPU/driver versions
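
For reference, the order of steps described above (crop first, then create the training dataset) might look like the sketch below. It is only an illustration: the wrapper name crop_then_create_training_set is made up, the crop size and number of crops are example values, and the config.yaml file name appended to the project path from this thread is an assumption.

```python
def crop_then_create_training_set(config_path, crop_size=(400, 400), num_crops=10):
    """Crop the labeled frames, then build the training dataset (hypothetical wrapper)."""
    # Imported here so the sketch can be defined without DLC installed.
    import deeplabcut

    # Crop the labeled images and their labels before building the dataset.
    deeplabcut.cropimagesandlabels(
        config_path,
        numcrops=num_crops,   # example value
        size=crop_size,       # example value
        userfeedback=False,   # assumed to skip the per-folder confirmation prompt
    )

    # Only after cropping: create the training dataset.
    deeplabcut.create_training_dataset(config_path)


# Example call (project path from this thread, config.yaml file name assumed):
# crop_then_create_training_set(
#     "C:/Users/jquin/Desktop/q_ladder_DLC-Quinn-2020-09-01/config.yaml"
# )
```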

Thank you for the input! I have used the cropping function to get through most of the data that I would like to train on. However, for some reason with videos ‘2’ and ‘20’ it is throwing the following error:

FileNotFoundError: [WinError 3] The system cannot find the path specified: 'C:/Users/jquin/Desktop/q_ladder_DLC-Quinn-20_cropped20_cropped-09-01\labeled-data\20_cropped'

This is strange to me because it is in the same parent folder as all of the other labelled datasets. Any thoughts on how to fix this issue? I would like to not have to arbitrarily exclude these data!
Thank you again!
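
The mangled path in the error may offer a clue: the project folder name contains "2020", and the failing video names are '2' and '20'. One possible cause, which is an assumption and not confirmed against the DeepLabCut source, is that the cropped folder name is built by replacing the video name everywhere in the full path. The sketch below reproduces the path from the error that way and shows a safer alternative that renames only the last path component.

```python
import ntpath  # Windows path rules, usable on any OS

project = "C:/Users/jquin/Desktop/q_ladder_DLC-Quinn-2020-09-01"
video_name = "20"
folder = project + "\\labeled-data\\" + video_name

# Naive approach (assumed cause): replace the video name everywhere in the path.
# The "2020" in the project folder name is rewritten too.
naive = folder.replace(video_name, video_name + "_cropped")
print(naive)

# Safer approach: split off the last path component and rename only that.
parent, last = ntpath.split(folder)
safe = ntpath.join(parent, last + "_cropped")
print(safe)
```

If this is what is happening, renaming the two videos so that their names do not also occur in the project path would be a workaround to try.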

@MWMathis
I think I am a bit confused about whether I will be able to use cropimagesandlabels to address the original problem. My images are pretty large, so there is not much that I can crop out without losing the labelled portion of the image. Is there another solution to this issue? Or have I misunderstood the purpose of the cropping function? When I tried training on the cropped data that was available, I noticed there were no labels in the h5 files within the cropped image folders, and when trying to create the training dataset I got the error: ValueError: cannot set WRITEABLE flag to True of this array
I assume this is because the h5 files had no label values for the body parts, since they were cut off in the cropping process. Can you suggest another way forward, or is there perhaps something that needs to be updated to fix this?

Thanks again!

yes, the cropimagesandlabels function we wrote will crop the data for training, smartly. It will not throw out your data.