FYI: DeepLabCut works on a Radeon GPU and an RTX3080 (using tensorflow-directml)


I successfully ran DeepLabCut on a Radeon RX6900XT GPU (and on an RTX3080, with 2.2b8)!

The way to do this is to use the tensorflow-directml package developed by Microsoft, which uses DX12 to run TensorFlow.

tensorflow-directml is currently equivalent to TF 1.15.


It seems slower than native CUDA TensorFlow, but faster than the CPU!


  1. The tensorflow-directml package runs on Windows 10 (or in a WSL2 Linux environment, but I haven’t tested that).
  2. Create a Python 3.6.8 virtual environment and activate it (e.g. python -m virtualenv dlc-directml).
  3. pip install tensorflow-directml
  4. pip install wxpython==4.0.7
  5. pip install deeplabcut==2.2b8
  6. Run the DeepLabCut GUI!
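
Before launching the GUI, it can be worth checking that TensorFlow actually sees the GPU inside the new environment. The following is only a minimal sketch, not something from the steps above: the helper name list_tf_devices is made up for illustration, and I am assuming that DirectML-backed devices are reported with the device type "DML".

```python
def list_tf_devices():
    """Return (name, device_type) pairs for the devices TensorFlow can see.

    Returns an empty list if TensorFlow cannot be imported in this environment.
    """
    try:
        # device_lib ships with TensorFlow 1.x, which tensorflow-directml is based on.
        from tensorflow.python.client import device_lib
    except ImportError:
        return []
    return [(d.name, d.device_type) for d in device_lib.list_local_devices()]


if __name__ == "__main__":
    devices = list_tf_devices()
    if not devices:
        print("TensorFlow could not be imported in this environment.")
    for name, device_type in devices:
        print(device_type, name)
    # Assumption: a DirectML-backed GPU shows up with device type "DML".
    if any(device_type == "DML" for _, device_type in devices):
        print("A DirectML device is visible to TensorFlow.")
```

If only a CPU entry is printed, training will most likely fall back to the CPU, so it is worth re-checking the environment before starting a long run.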

Rough benchmarks:
All runs used the same dataset (single animal, 1280x720 zebrafish movie, 20 labeled frames with 4 bodyparts), but the TF versions, operating systems and CPUs differ between runs.

CPU (Windows 10, Ryzen 5600X): 2.38 s/iter
K40M (Linux, native CUDA, i5-6400): 0.337 s/iter
RTX3080 (Windows 10, directml, 5600X): 0.138 s/iter
RX6900XT (Windows 10, directml, 5600X): 0.265 s/iter

Both the RTX3080 and the RX6900XT are faster than the K40M. AMAZING!!
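
To put the timings above in proportion, here is a small sketch that turns the s/iter values into speedup factors relative to a chosen baseline. The dictionary and the function name speedup_vs are just illustrative, and since the runs differ in TF version, OS and CPU, the factors are only rough.

```python
# Seconds per training iteration, copied from the benchmark list above.
SEC_PER_ITER = {
    "CPU": 2.38,
    "K40M": 0.337,
    "RTX3080": 0.138,
    "RX6900XT": 0.265,
}


def speedup_vs(baseline, timings=SEC_PER_ITER):
    """Return how many times faster each device is than the baseline device."""
    base = timings[baseline]
    return {name: base / seconds for name, seconds in timings.items()}


if __name__ == "__main__":
    for name, factor in speedup_vs("CPU").items():
        print("{}: {:.1f}x the CPU speed".format(name, factor))
```

With these numbers the RTX3080 comes out at roughly 17x the CPU, the RX6900XT at roughly 9x, and the K40M at roughly 7x.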

Inference also works on the Radeon, but it seems slow.

tensorflow-directml is still only a pre-release, but it is a very interesting project.
If this package becomes stable, we won’t have to worry about GPU compatibility or CUDA/cuDNN versions…

This isn’t a reason to go out and buy a Radeon, but it is useful for those who already have one and want to try out DLC.
I hope this post will be helpful to someone!


thank you! Very useful :slight_smile:
