Convert DLC pose_net inference to PyTorch

Hi all, I’m trying to convert the DLC model from TensorFlow to PyTorch (it seems it can run much faster in PyTorch). The only part I’m missing is the final step, which extracts the pose coordinates from the output of the deconvolution layer.

In code, it is this set of TensorFlow operations:

probs = tf.sigmoid(deconvolution_tensor)
probs = tf.transpose(probs, [1, 2, 0, 3])
l_shape = tf.shape(probs)
probs = tf.reshape(probs, (l_shape[0] * l_shape[1], -1))
maxloc = tf.argmax(probs, axis=0)
loc = tf.unravel_index(maxloc, (tf.cast(l_shape[0], tf.int64), tf.cast(l_shape[1], tf.int64)))
maxloc = tf.reshape(maxloc, (1, -1))
joints = tf.reshape(tf.range(0, tf.cast(l_shape[2] * l_shape[3], dtype=tf.int64)), (1, -1))
indices = tf.transpose(tf.concat([maxloc, joints], axis=0))

I managed to get something, but at some point the shapes of my intermediate results differ from the TensorFlow ones, so the final result is not correct.

Any suggestions on how to do this?

P.S. The first (not very elegant) idea that works is to use TensorFlow for this part only, converting the PyTorch output to NumPy before feeding it into the TensorFlow graph. Of course, this is not the best option for improving inference speed.

Thank you.

Hi @ceradini, we do have a PyTorch version in the works (deeplabcut-pytorch · PyPI), but it does not run faster ;). If you are interested in helping develop it, we would be really happy about that! Do get in touch: admin@deeplabcut.org

Hi @MWMathis, I’m writing you an email to explain in more detail, but I tested the same ResNet model with PyTorch and TensorFlow on a Jetson TX2 (which is why performance matters so much to me), and PyTorch ran the same model in roughly half the time TensorFlow needed. Given that DLC uses a ResNet backbone (unless you use MobileNet, of course), I have reason to think the speed-up will carry over.

That’s why I’m interested in implementing DLC in PyTorch. The most difficult part is translating the operations above into their PyTorch counterparts.

In case anyone needs a solution, this is my code for the prediction-extraction part using PyTorch and NumPy instead of TensorFlow (assuming a batch size of 1; the general case is very similar):

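import numpy as np
import torch

# result: raw output of the deconvolution layer (torch tensor, batch size 1)
probs = torch.sigmoid(result)
# move axis 1 (joints) to the end; the batch axis stays first
probs = np.transpose(probs.detach().cpu().numpy(), [0, 3, 2, 1])
probs = np.squeeze(probs, axis=0)  # drop the batch axis
l_shape = np.shape(probs)
# flatten the two spatial axes so argmax runs over every heatmap cell
probs = np.reshape(probs, (l_shape[0] * l_shape[1], -1))
maxloc = np.argmax(probs, axis=0)
# convert the flat indices back to 2-D grid positions
loc = np.unravel_index(maxloc, (l_shape[0], l_shape[1]))
maxloc = np.reshape(maxloc, (1, -1))
joints = np.reshape(np.arange(0, l_shape[2]), (1, -1))
# one (flat index, joint index) row per joint
indices = np.transpose(np.concatenate([maxloc, joints], axis=0))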
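
For batch sizes larger than 1, the same argmax extraction can probably stay entirely in PyTorch, which would also avoid the copy to NumPy on the CPU. The following is only a minimal sketch, not DLC code: the function name `extract_argmax_locations` is made up for illustration, and it assumes the deconvolution output is laid out as (batch, joints, height, width). It returns row and column indices directly rather than building the `indices` array from the snippets above.

```python
import torch


def extract_argmax_locations(logits: torch.Tensor):
    """Hypothetical helper. Assumes logits has shape (batch, joints, H, W)."""
    batch, joints, height, width = logits.shape
    probs = torch.sigmoid(logits)
    # Flatten each heatmap so the max runs over all H * W cells at once.
    flat = probs.reshape(batch, joints, height * width)
    conf, maxloc = flat.max(dim=-1)  # both have shape (batch, joints)
    # Undo the flattening: flat index = row * width + col.
    rows = torch.div(maxloc, width, rounding_mode="floor")
    cols = maxloc % width
    return rows, cols, conf


# Small usage example with one artificial peak.
logits = torch.full((2, 3, 4, 5), -5.0)
logits[0, 1, 2, 3] = 4.0
rows, cols, conf = extract_argmax_locations(logits)
print(rows[0, 1].item(), cols[0, 1].item())  # 2 3
```

If the model emits a different axis order, the tensor would need a `permute` before the call so that the two spatial axes come last.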