Setting up on Debian with RTX 2080

We are struggling to get DeepLabCut (latest version from pip) running on a GPU server: Debian (buster), Nvidia RTX 2080 Super, driver 440.82, CUDA toolkit 10.0 (installed on the OS), Python 3.6, TF 1.13.2, CUDA 10.0 (conda), cuDNN 7.6.5 (build matches the CUDA version). As soon as training starts, we get a cuDNN launch failure that reports the input and filter shapes. We have tried different versions of TF and cuDNN, but we get the same error every time. Downgrading to CUDA 9.2 did not help either. MATLAB 2019b has no problem using the GPU, and I have also successfully run a short matrix multiplication script on the GPU with TF. Any suggestions?

Here are the installed packages:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
absl-py                   0.9.0                    pypi_0    pypi
astor                     0.8.1                    pypi_0    pypi
backcall                  0.1.0                    py36_0  
ca-certificates           2020.1.1                      0  
cairo                     1.14.12              h8948797_3    anaconda
certifi                   2020.4.5.1               py36_0  
chardet                   3.0.4                    pypi_0    pypi
click                     7.1.2                    pypi_0    pypi
cudatoolkit               10.0.130                      0  
cudnn                     7.6.5                cuda10.0_0  
cycler                    0.10.0                   pypi_0    pypi
decorator                 4.4.2                      py_0  
deeplabcut                2.1.8.2                  pypi_0    pypi
easydict                  1.9                      pypi_0    pypi
expat                     2.2.6                he6710b0_0    anaconda
fontconfig                2.13.0               h9420a91_0    anaconda
freetype                  2.9.1                h8a8886c_1    anaconda
fribidi                   1.0.5                h7b6447c_0    anaconda
gast                      0.3.3                    pypi_0    pypi
gettext                   0.19.8.1             h9b4dc7a_1    anaconda
glib                      2.56.2               hd408876_0    anaconda
graphite2                 1.3.13               h23475e2_0    anaconda
grpcio                    1.29.0                   pypi_0    pypi
gst-plugins-base          1.14.0               hbbd80ab_1    anaconda
gstreamer                 1.14.0               hb453b48_1    anaconda
h5py                      2.10.0                   pypi_0    pypi
harfbuzz                  1.8.8                hffaf4a1_0    anaconda
icu                       58.2                 he6710b0_3    anaconda
idna                      2.9                      pypi_0    pypi
imageio                   2.8.0                    pypi_0    pypi
imageio-ffmpeg            0.4.2                    pypi_0    pypi
imgaug                    0.4.0                    pypi_0    pypi
importlib-metadata        1.6.1                    pypi_0    pypi
intel-openmp              2020.0.133               pypi_0    pypi
ipython                   7.13.0           py36h5ca1d4c_0  
ipython_genutils          0.2.0                    py36_0  
jedi                      0.17.0                   py36_0  
joblib                    0.15.1                   pypi_0    pypi
jpeg                      9b                   habf39ab_1    anaconda
keras-applications        1.0.8                    pypi_0    pypi
keras-preprocessing       1.1.2                    pypi_0    pypi
kiwisolver                1.2.0                    pypi_0    pypi
ld_impl_linux-64          2.33.1               h53a641e_7  
libedit                   3.1.20181209         hc058e9b_0  
libffi                    3.3                  he6710b0_1  
libgcc-ng                 9.1.0                hdf63c60_0  
libglu                    9.0.0                hf484d3e_1    anaconda
libpng                    1.6.37               hbc83047_0    anaconda
libstdcxx-ng              9.1.0                hdf63c60_0  
libuuid                   1.0.3                h1bed415_2    anaconda
libxcb                    1.13                 h1bed415_1    anaconda
libxml2                   2.9.10               he19cac6_1    anaconda
markdown                  3.2.2                    pypi_0    pypi
matplotlib                3.0.3                    pypi_0    pypi
mock                      4.0.2                    pypi_0    pypi
moviepy                   1.0.1                    pypi_0    pypi
msgpack                   1.0.0                    pypi_0    pypi
msgpack-numpy             0.4.6.post0              pypi_0    pypi
ncurses                   6.2                  he6710b0_1  
networkx                  2.4                      pypi_0    pypi
numexpr                   2.7.1                    pypi_0    pypi
numpy                     1.16.4                   pypi_0    pypi
opencv-python             3.4.9.33                 pypi_0    pypi
openssl                   1.1.1g               h7b6447c_0  
pandas                    1.0.4                    pypi_0    pypi
pango                     1.42.4               h049681c_0    anaconda
parso                     0.7.0                      py_0  
patsy                     0.5.1                    pypi_0    pypi
pcre                      8.43                 he6710b0_0    anaconda
pexpect                   4.8.0                    py36_0  
pickleshare               0.7.5                    py36_0  
pillow                    7.1.2                    pypi_0    pypi
pip                       20.0.2                   py36_3  
pixman                    0.38.0               h7b6447c_0    anaconda
proglog                   0.1.9                    pypi_0    pypi
prompt-toolkit            3.0.5                      py_0  
prompt_toolkit            3.0.5                         0  
protobuf                  3.12.2                   pypi_0    pypi
psutil                    5.7.0                    pypi_0    pypi
ptyprocess                0.6.0                    py36_0  
pygments                  2.6.1                      py_0  
pyparsing                 2.4.7                    pypi_0    pypi
python                    3.6.10               h7579374_2  
python-dateutil           2.8.1                    pypi_0    pypi
pytz                      2020.1                   pypi_0    pypi
pywavelets                1.1.1                    pypi_0    pypi
pyyaml                    5.3.1                    pypi_0    pypi
pyzmq                     19.0.1                   pypi_0    pypi
readline                  8.0                  h7b6447c_0  
requests                  2.23.0                   pypi_0    pypi
ruamel-yaml               0.16.10                  pypi_0    pypi
ruamel-yaml-clib          0.2.0                    pypi_0    pypi
scikit-image              0.17.2                   pypi_0    pypi
scikit-learn              0.23.1                   pypi_0    pypi
scipy                     1.4.1                    pypi_0    pypi
setuptools                47.1.1                   py36_0  
shapely                   1.7.0                    pypi_0    pypi
six                       1.15.0                     py_0  
sqlite                    3.31.1               h62c20be_1  
statsmodels               0.11.1                   pypi_0    pypi
tables                    3.6.1                    pypi_0    pypi
tabulate                  0.8.7                    pypi_0    pypi
tensorboard               1.13.1                   pypi_0    pypi
tensorflow-estimator      1.13.0                   pypi_0    pypi
tensorflow-gpu            1.13.2                   pypi_0    pypi
tensorpack                0.10.1                   pypi_0    pypi
termcolor                 1.1.0                    pypi_0    pypi
threadpoolctl             2.1.0                    pypi_0    pypi
tifffile                  2020.6.3                 pypi_0    pypi
tk                        8.6.8                hbc83047_0  
tqdm                      4.46.1                   pypi_0    pypi
traitlets                 4.3.3                    py36_0  
urllib3                   1.25.9                   pypi_0    pypi
wcwidth                   0.1.9                      py_0  
werkzeug                  1.0.1                    pypi_0    pypi
wheel                     0.34.2                   py36_0  
wxpython                  4.0.4            py36hc99224d_0    anaconda
xz                        5.2.5                h7b6447c_0  
zipp                      3.1.0                    pypi_0    pypi
zlib                      1.2.11               h7b6447c_3 
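
For anyone comparing their own environment against this list, here is a minimal sketch of how the GPU-related rows could be checked programmatically. It only uses the standard library. The helper names (`parse_conda_list`, `cudnn_matches_toolkit`) are made up for illustration, and it assumes the cudnn build tag has the form `cuda<major>.<minor>_<n>`, as it does in the row above.

```python
import re

# Excerpt of the `conda list` output above (only the rows relevant to the GPU stack).
CONDA_LIST = """\
# Name                    Version                   Build  Channel
cudatoolkit               10.0.130                      0  
cudnn                     7.6.5                cuda10.0_0  
tensorflow-gpu            1.13.2                   pypi_0    pypi
"""


def parse_conda_list(text):
    """Return {package name: (version, build)} parsed from `conda list` output."""
    packages = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the '#' header, and rows without name/version/build.
        if len(fields) < 3 or fields[0].startswith("#"):
            continue
        name, version, build = fields[:3]
        packages[name] = (version, build)
    return packages


def cudnn_matches_toolkit(packages):
    """Check that the cudnn build tag names the installed toolkit's major.minor version."""
    toolkit_version, _ = packages["cudatoolkit"]
    _, cudnn_build = packages["cudnn"]
    toolkit_major_minor = ".".join(toolkit_version.split(".")[:2])
    match = re.match(r"cuda(\d+\.\d+)", cudnn_build)
    return match is not None and match.group(1) == toolkit_major_minor


packages = parse_conda_list(CONDA_LIST)
print("tensorflow-gpu:", packages["tensorflow-gpu"][0])
print("cudnn build matches cudatoolkit:", cudnn_matches_toolkit(packages))
```

For the rows above the check passes, which is consistent with the version pairing not being the obvious culprit, although it says nothing about the system-wide CUDA install or the driver.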

And here’s the error:

Python 3.6.10 |Anaconda, Inc.| (default, May  8 2020, 02:54:21) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.13.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import deeplabcut                                                       

In [2]: deeplabcut.launch_dlc()                                                 

(ipython:8247): Gtk-WARNING **: 20:38:45.124: Error loading theme icon 'dialog-information' for stock: Unable to load image-loading module: /usr/lib/x86_64-linux-gnu/gdk-pixbuf-2.0/2.10.0/loaders/libpixbufloader-svg.so: /usr/lib/x86_64-linux-gnu/librsvg-2.so.2: undefined symbol: cairo_tag_end
/home/randy/Downloads/test5-randy-2020-06-13/training-datasets/iteration-0/UnaugmentedDataSet_test5Jun13  already exists!
/home/randy/Downloads/test5-randy-2020-06-13/dlc-models/iteration-0/test5Jun13-trainset95shuffle1  already exists!
/home/randy/Downloads/test5-randy-2020-06-13/dlc-models/iteration-0/test5Jun13-trainset95shuffle1/train  already exists!
/home/randy/Downloads/test5-randy-2020-06-13/dlc-models/iteration-0/test5Jun13-trainset95shuffle1/test  already exists!
The training dataset is successfully created. Use the function 'train_network' to start training. Happy training!
Config:
{'all_joints': [[0], [1], [2], [3]],
 'all_joints_names': ['bodypart1', 'bodypart2', 'bodypart3', 'objectA'],
 'batch_size': 1,
 'bottomheight': 400,
 'crop': True,
 'crop_pad': 0,
 'cropratio': 0.4,
 'dataset': 'training-datasets/iteration-0/UnaugmentedDataSet_test5Jun13/test5_randy95shuffle1.mat',
 'dataset_type': 'default',
 'deterministic': False,
 'display_iters': 1000,
 'fg_fraction': 0.25,
 'global_scale': 0.8,
 'init_weights': '/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/models/pretrained/resnet_v1_50.ckpt',
 'intermediate_supervision': False,
 'intermediate_supervision_layer': 12,
 'leftwidth': 400,
 'location_refinement': True,
 'locref_huber_loss': True,
 'locref_loss_weight': 0.05,
 'locref_stdev': 7.2801,
 'log_dir': 'log',
 'max_input_size': 1500,
 'mean_pixel': [123.68, 116.779, 103.939],
 'metadataset': 'training-datasets/iteration-0/UnaugmentedDataSet_test5Jun13/Documentation_data-test5_95shuffle1.pickle',
 'min_input_size': 64,
 'minsize': 100,
 'mirror': False,
 'multi_step': [[0.005, 10000],
                [0.02, 430000],
                [0.002, 730000],
                [0.001, 1030000]],
 'net_type': 'resnet_50',
 'num_joints': 4,
 'optimizer': 'sgd',
 'pos_dist_thresh': 17,
 'project_path': '/home/randy/Downloads/test5-randy-2020-06-13',
 'regularize': False,
 'rightwidth': 400,
 'save_iters': 50000,
 'scale_jitter_lo': 0.5,
 'scale_jitter_up': 1.25,
 'scoremap_dir': 'test',
 'shuffle': True,
 'snapshot_prefix': '/home/randy/Downloads/test5-randy-2020-06-13/dlc-models/iteration-0/test5Jun13-trainset95shuffle1/train/snapshot',
 'stride': 8.0,
 'topheight': 400,
 'weigh_negatives': False,
 'weigh_only_present_joints': False,
 'weigh_part_predictions': False,
 'weight_decay': 0.0001}
Switching batchsize to 1, as default/tensorpack/deterministic loaders do not support batches >1. Use imgaug loader.
Starting with standard pose-dataset loader.
Initializing ResNet
WARNING:tensorflow:From /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/ops/losses/losses_impl.py:209: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Loading ImageNet-pretrained resnet_50
WARNING:tensorflow:From /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/models/pretrained/resnet_v1_50.ckpt
Max_iters overwritten as 1030000
Display_iters overwritten as 1000
Save_iters overwritten as 50000
Training parameter:
{'stride': 8.0, 'weigh_part_predictions': False, 'weigh_negatives': False, 'fg_fraction': 0.25, 'weigh_only_present_joints': False, 'mean_pixel': [123.68, 116.779, 103.939], 'shuffle': True, 'snapshot_prefix': '/home/randy/Downloads/test5-randy-2020-06-13/dlc-models/iteration-0/test5Jun13-trainset95shuffle1/train/snapshot', 'log_dir': 'log', 'global_scale': 0.8, 'location_refinement': True, 'locref_stdev': 7.2801, 'locref_loss_weight': 0.05, 'locref_huber_loss': True, 'optimizer': 'sgd', 'intermediate_supervision': False, 'intermediate_supervision_layer': 12, 'regularize': False, 'weight_decay': 0.0001, 'mirror': False, 'crop_pad': 0, 'scoremap_dir': 'test', 'batch_size': 1, 'dataset_type': 'default', 'deterministic': False, 'crop': True, 'cropratio': 0.4, 'minsize': 100, 'leftwidth': 400, 'rightwidth': 400, 'topheight': 400, 'bottomheight': 400, 'all_joints': [[0], [1], [2], [3]], 'all_joints_names': ['bodypart1', 'bodypart2', 'bodypart3', 'objectA'], 'dataset': 'training-datasets/iteration-0/UnaugmentedDataSet_test5Jun13/test5_randy95shuffle1.mat', 'display_iters': 1000, 'init_weights': '/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/models/pretrained/resnet_v1_50.ckpt', 'max_input_size': 1500, 'metadataset': 'training-datasets/iteration-0/UnaugmentedDataSet_test5Jun13/Documentation_data-test5_95shuffle1.pickle', 'min_input_size': 64, 'multi_step': [[0.005, 10000], [0.02, 430000], [0.002, 730000], [0.001, 1030000]], 'net_type': 'resnet_50', 'num_joints': 4, 'pos_dist_thresh': 17, 'project_path': '/home/randy/Downloads/test5-randy-2020-06-13', 'save_iters': 50000, 'scale_jitter_lo': 0.5, 'scale_jitter_up': 1.25, 'output_stride': 16, 'deconvolutionstride': 2}
Starting training....
2020-06-15 20:39:14.221424: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
---------------------------------------------------------------------------
InternalError                             Traceback (most recent call last)
/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1333     try:
-> 1334       return fn(*args)
   1335     except errors.OpError as e:

/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
   1318       return self._call_tf_sessionrun(
-> 1319           options, feed_dict, fetch_list, target_list, run_metadata)
   1320 

/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/client/session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
   1406         self._session, options, feed_dict, fetch_list, target_list,
-> 1407         run_metadata)
   1408 

InternalError: cuDNN launch failure : input shape([1,3,746,992]) filter shape([7,7,3,64])
	 [[{{node resnet_v1_50/conv1/Conv2D}}]]
	 [[{{node sigmoid_cross_entropy_loss/value}}]]

During handling of the above exception, another exception occurred:

InternalError                             Traceback (most recent call last)
/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/gui/train_network.py in train_network(self, event)
    266                                  displayiters=displayiters,
    267                                  saveiters=saveiters,
--> 268                                  maxiters=maxiters)
    269 
    270     def cancel_train_network(self,event):

/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/training.py in train_network(config, shuffle, trainingsetindex, max_snapshots_to_keep, displayiters, saveiters, maxiters, allow_growth, gputouse, autotune, keepdeconvweights)
    132         train(str(poseconfigfile),displayiters,saveiters,maxiters,max_to_keep=max_snapshots_to_keep,keepdeconvweights=keepdeconvweights,allow_growth=allow_growth) #pass on path and file name for pose_cfg.yaml!
    133     except BaseException as e:
--> 134         raise e
    135     finally:
    136         os.chdir(str(start_path))

/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/training.py in train_network(config, shuffle, trainingsetindex, max_snapshots_to_keep, displayiters, saveiters, maxiters, allow_growth, gputouse, autotune, keepdeconvweights)
    130             os.environ['CUDA_VISIBLE_DEVICES'] = str(gputouse)
    131     try:
--> 132         train(str(poseconfigfile),displayiters,saveiters,maxiters,max_to_keep=max_snapshots_to_keep,keepdeconvweights=keepdeconvweights,allow_growth=allow_growth) #pass on path and file name for pose_cfg.yaml!
    133     except BaseException as e:
    134         raise e

/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/train.py in train(config_yaml, displayiters, saveiters, maxiters, max_to_keep, keepdeconvweights, allow_growth)
    188         current_lr = lr_gen.get_lr(it)
    189         [_, loss_val, summary] = sess.run([train_op, total_loss, merged_summaries],
--> 190                                           feed_dict={learning_rate: current_lr})
    191         cum_loss += loss_val
    192         train_writer.add_summary(summary, it)

/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    927     try:
    928       result = self._run(None, fetches, feed_dict, options_ptr,
--> 929                          run_metadata_ptr)
    930       if run_metadata:
    931         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1150     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1151       results = self._do_run(handle, final_targets, final_fetches,
-> 1152                              feed_dict_tensor, options, run_metadata)
   1153     else:
   1154       results = []

/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1326     if handle is None:
   1327       return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1328                            run_metadata)
   1329     else:
   1330       return self._do_call(_prun_fn, handle, feeds, fetches)

/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1346           pass
   1347       message = error_interpolation.interpolate(message, self._graph)
-> 1348       raise type(e)(node_def, op, message)
   1349 
   1350   def _extend_graph(self):

InternalError: cuDNN launch failure : input shape([1,3,746,992]) filter shape([7,7,3,64])
	 [[node resnet_v1_50/conv1/Conv2D (defined at /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/nnet/pose_net.py:52) ]]
	 [[node sigmoid_cross_entropy_loss/value (defined at /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/nnet/pose_net.py:162) ]]

Caused by op 'resnet_v1_50/conv1/Conv2D', defined at:
  File "/usr/local/bin/anaconda3/envs/DLC/bin/ipython", line 11, in <module>
    sys.exit(start_ipython())
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/IPython/__init__.py", line 126, in start_ipython
    return launch_new_instance(argv=argv, **kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/traitlets/config/application.py", line 664, in launch_instance
    app.start()
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/IPython/terminal/ipapp.py", line 356, in start
    self.shell.mainloop()
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/IPython/terminal/interactiveshell.py", line 558, in mainloop
    self.interact()
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/IPython/terminal/interactiveshell.py", line 549, in interact
    self.run_cell(code, store_history=True)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2858, in run_cell
    raw_cell, store_history, silent, shell_futures)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2886, in _run_cell
    return runner(coro)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/IPython/core/async_helpers.py", line 68, in _pseudo_sync_runner
    coro.send(None)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3063, in run_cell_async
    interactivity=interactivity, compiler=compiler, result=result)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3254, in run_ast_nodes
    if (await self.run_code(code, result,  async_=asy)):
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-1ced4355ffc5>", line 1, in <module>
    deeplabcut.launch_dlc()
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/gui/launch_script.py", line 45, in launch_dlc
    app.MainLoop()
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/wx/core.py", line 2166, in MainLoop
    rv = wx.PyApp.MainLoop(self)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/gui/train_network.py", line 268, in train_network
    maxiters=maxiters)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/training.py", line 132, in train_network
    train(str(poseconfigfile),displayiters,saveiters,maxiters,max_to_keep=max_snapshots_to_keep,keepdeconvweights=keepdeconvweights,allow_growth=allow_growth) #pass on path and file name for pose_cfg.yaml!
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/train.py", line 119, in train
    losses = pose_net(cfg).train(batch)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/nnet/pose_net.py", line 154, in train
    heads = self.get_net(batch[Batch.inputs])
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/nnet/pose_net.py", line 81, in get_net
    net, end_points = self.extract_features(inputs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/nnet/pose_net.py", line 52, in extract_features
    global_pool=False, output_stride=self.cfg.output_stride,is_training=False)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_v1.py", line 274, in resnet_v1_50
    scope=scope)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_v1.py", line 205, in resnet_v1
    net = resnet_utils.conv2d_same(net, 64, 7, stride=2, scope='conv1')
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_utils.py", line 146, in conv2d_same
    scope=scope)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
    return func(*args, **current_args)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1155, in convolution2d
    conv_dims=2)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
    return func(*args, **current_args)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1058, in convolution
    outputs = layer.apply(inputs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1227, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 530, in __call__
    outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 554, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/keras/layers/convolutional.py", line 194, in call
    outputs = self._convolution_op(inputs, self.kernel)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 966, in __call__
    return self.conv_op(inp, filter)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 591, in __call__
    return self.call(inp, filter)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 208, in __call__
    name=self.name)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 1026, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

InternalError (see above for traceback): cuDNN launch failure : input shape([1,3,746,992]) filter shape([7,7,3,64])
	 [[node resnet_v1_50/conv1/Conv2D (defined at /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/nnet/pose_net.py:52) ]]
	 [[node sigmoid_cross_entropy_loss/value (defined at /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/nnet/pose_net.py:162) ]]
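
Because the Python traceback repeats the same InternalError several times, it can help to pull out the first error logged by TensorFlow's native layer. The sketch below does this with the standard library only. The function name `first_native_error` is hypothetical, and the regular expression assumes the `<date> <time>: <severity> <file>:<line>] <message>` layout seen in the log line above.

```python
import re

# Lines copied from the log above; in practice this would be the full captured output.
LOG_TEXT = """\
Starting training....
2020-06-15 20:39:14.221424: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
InternalError: cuDNN launch failure : input shape([1,3,746,992]) filter shape([7,7,3,64])
"""

# Assumed layout of a native TensorFlow log line; severity is one of I, W, E, F.
TF_LOG_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: (?P<severity>[IWEF]) "
    r"(?P<source>\S+)\] (?P<message>.*)$"
)


def first_native_error(log_text):
    """Return (source, message) of the first E/F-level native log line, or None."""
    for line in log_text.splitlines():
        match = TF_LOG_LINE.match(line)
        if match and match.group("severity") in ("E", "F"):
            return match.group("source"), match.group("message")
    return None


print(first_native_error(LOG_TEXT))
```

On this log the first native error is `Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR`, logged before the launch failure. That suggests the input and filter shapes in the later message may be incidental rather than the cause.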

Hi @randybruno - can you successfully run the test scripts? This seems specific to the project you are attempting to use.

If you have not run the test scripts, I would recommend this as a first step for any system.

Here is a video on how to do this: https://www.youtube.com/watch?v=IOWtKn3l33s

Please clone the repo and run the first test script:

git clone https://github.com/DeepLabCut/DeepLabCut.git
cd DeepLabCut/examples
python testscript.py

then:
python testscript_multianimal.py
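
The two commands above can also be run in sequence from Python. This is only a sketch: `run_scripts` is a hypothetical helper, and the `DeepLabCut/examples` path assumes the repo was cloned into the current directory. It stops at the first script that exits with a non-zero code.

```python
import subprocess
import sys
from pathlib import Path


def run_scripts(script_names, examples_dir):
    """Run each script in order from `examples_dir`, stopping at the first failure.

    Returns a list of (script name, exit code) for the scripts that were run.
    """
    results = []
    for name in script_names:
        # Use the interpreter of the active environment so the same packages are picked up.
        completed = subprocess.run([sys.executable, name], cwd=str(examples_dir))
        results.append((name, completed.returncode))
        if completed.returncode != 0:
            break
    return results


if __name__ == "__main__":
    examples = Path("DeepLabCut") / "examples"  # assumed location of the cloned repo
    if examples.is_dir():
        scripts = ["testscript.py", "testscript_multianimal.py"]
        for name, code in run_scripts(scripts, examples):
            print(name, "OK" if code == 0 else "failed with exit code {}".format(code))
    else:
        print("Clone the repo first; {} was not found.".format(examples))
```

Running the scripts with `sys.executable` keeps them inside the same conda environment as the calling interpreter, which matters here because the GPU libraries come from that environment.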

Thanks @MWMathis! We were looking for something like that. Here’s the output of testscript.py on our GPU installation, which, apart from warnings about deprecated functions, again points to a cuDNN launch failure. Output from the CPU install follows.

Imported DLC!
CREATING PROJECT
Created "/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/videos"
Created "/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/labeled-data"
Created "/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/training-datasets"
Created "/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/dlc-models"
Copying the videos
/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/videos/reachingvideo1.avi
Generated "/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/config.yaml"

A new project with name TEST-Alex-2020-06-17 is created at /home/randy/src/DeepLabCut/examples and a configurable file (config.yaml) is stored there. Change the parameters in this file to adapt to your project's needs.
 Once you have changed the configuration file, use the function 'extract_frames' to select frames for labeling.
. [OPTIONAL] Use the function 'add_new_videos' to add new videos to your project (at any stage).
EXTRACTING FRAMES
Config file read successfully.
Extracting frames based on kmeans ...
Kmeans-quantization based extracting of frames from 0.0  seconds to 8.53  seconds.
Extracting and downsampling... 256  frames from the video.
256it [00:01, 217.35it/s]
Kmeans clustering ... (this might take a while)
Frames were successfully extracted.

You can now label the frames using the function 'label_frames' (if you extracted enough frames for all videos).
CREATING-SOME LABELS FOR THE FRAMES
Plot labels...
Creating images with labels by Alex.
They are stored in the following folder: /home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/labeled-data/reachingvideo1_labeled.
If all the labels are ok, then use the function 'create_training_dataset' to create the training dataset!
CREATING TRAININGSET
The training dataset is successfully created. Use the function 'train_network' to start training. Happy training!
CHANGING training parameters to end quickly!
TRAIN
Config:
{'all_joints': [[0], [1], [2], [3]],
 'all_joints_names': ['bodypart1', 'bodypart2', 'bodypart3', 'objectA'],
 'batch_size': 1,
 'bottomheight': 400,
 'crop': True,
 'crop_pad': 0,
 'cropratio': 0.4,
 'dataset': 'training-datasets/iteration-0/UnaugmentedDataSet_TESTJun17/TEST_Alex80shuffle1.mat',
 'dataset_type': 'default',
 'deterministic': False,
 'display_iters': 2,
 'fg_fraction': 0.25,
 'global_scale': 0.8,
 'init_weights': '/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/models/pretrained/resnet_v1_50.ckpt',
 'intermediate_supervision': False,
 'intermediate_supervision_layer': 12,
 'leftwidth': 400,
 'location_refinement': True,
 'locref_huber_loss': True,
 'locref_loss_weight': 0.05,
 'locref_stdev': 7.2801,
 'log_dir': 'log',
 'max_input_size': 1500,
 'mean_pixel': [123.68, 116.779, 103.939],
 'metadataset': 'training-datasets/iteration-0/UnaugmentedDataSet_TESTJun17/Documentation_data-TEST_80shuffle1.pickle',
 'min_input_size': 64,
 'minsize': 100,
 'mirror': False,
 'multi_step': [[0.001, 5]],
 'net_type': 'resnet_50',
 'num_joints': 4,
 'optimizer': 'sgd',
 'pos_dist_thresh': 17,
 'project_path': '/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17',
 'regularize': False,
 'rightwidth': 400,
 'save_iters': 5,
 'scale_jitter_lo': 0.5,
 'scale_jitter_up': 1.25,
 'scoremap_dir': 'test',
 'shuffle': True,
 'snapshot_prefix': '/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/dlc-models/iteration-0/TESTJun17-trainset80shuffle1/train/snapshot',
 'stride': 8.0,
 'topheight': 400,
 'weigh_negatives': False,
 'weigh_only_present_joints': False,
 'weigh_part_predictions': False,
 'weight_decay': 0.0001}
Switching batchsize to 1, as default/tensorpack/deterministic loaders do not support batches >1. Use imgaug loader.
Starting with standard pose-dataset loader.
Initializing ResNet
WARNING:tensorflow:From /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/ops/losses/losses_impl.py:209: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Loading ImageNet-pretrained resnet_50
WARNING:tensorflow:From /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Training parameter:
{'stride': 8.0, 'weigh_part_predictions': False, 'weigh_negatives': False, 'fg_fraction': 0.25, 'weigh_only_present_joints': False, 'mean_pixel': [123.68, 116.779, 103.939], 'shuffle': True, 'snapshot_prefix': '/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/dlc-models/iteration-0/TESTJun17-trainset80shuffle1/train/snapshot', 'log_dir': 'log', 'global_scale': 0.8, 'location_refinement': True, 'locref_stdev': 7.2801, 'locref_loss_weight': 0.05, 'locref_huber_loss': True, 'optimizer': 'sgd', 'intermediate_supervision': False, 'intermediate_supervision_layer': 12, 'regularize': False, 'weight_decay': 0.0001, 'mirror': False, 'crop_pad': 0, 'scoremap_dir': 'test', 'batch_size': 1, 'dataset_type': 'default', 'deterministic': False, 'crop': True, 'cropratio': 0.4, 'minsize': 100, 'leftwidth': 400, 'rightwidth': 400, 'topheight': 400, 'bottomheight': 400, 'all_joints': [[0], [1], [2], [3]], 'all_joints_names': ['bodypart1', 'bodypart2', 'bodypart3', 'objectA'], 'dataset': 'training-datasets/iteration-0/UnaugmentedDataSet_TESTJun17/TEST_Alex80shuffle1.mat', 'display_iters': 2, 'init_weights': '/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/models/pretrained/resnet_v1_50.ckpt', 'max_input_size': 1500, 'metadataset': 'training-datasets/iteration-0/UnaugmentedDataSet_TESTJun17/Documentation_data-TEST_80shuffle1.pickle', 'min_input_size': 64, 'multi_step': [[0.001, 5]], 'net_type': 'resnet_50', 'num_joints': 4, 'pos_dist_thresh': 17, 'project_path': '/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17', 'save_iters': 5, 'scale_jitter_lo': 0.5, 'scale_jitter_up': 1.25, 'output_stride': 16, 'deconvolutionstride': 2}
Starting training....
2020-06-17 14:52:03.424084: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-06-17 14:52:03.425794: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Traceback (most recent call last):
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[{{node resnet_v1_50/conv1/Conv2D}}]]
	 [[{{node sigmoid_cross_entropy_loss/value}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "testscript.py", line 136, in <module>
    deeplabcut.train_network(path_config_file)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/training.py", line 134, in train_network
    raise e
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/training.py", line 132, in train_network
    train(str(poseconfigfile),displayiters,saveiters,maxiters,max_to_keep=max_snapshots_to_keep,keepdeconvweights=keepdeconvweights,allow_growth=allow_growth) #pass on path and file name for pose_cfg.yaml!
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/train.py", line 190, in train
    feed_dict={learning_rate: current_lr})
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node resnet_v1_50/conv1/Conv2D (defined at /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/nnet/pose_net.py:52) ]]
	 [[node sigmoid_cross_entropy_loss/value (defined at /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/nnet/pose_net.py:162) ]]

Caused by op 'resnet_v1_50/conv1/Conv2D', defined at:
  File "testscript.py", line 136, in <module>
    deeplabcut.train_network(path_config_file)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/training.py", line 132, in train_network
    train(str(poseconfigfile),displayiters,saveiters,maxiters,max_to_keep=max_snapshots_to_keep,keepdeconvweights=keepdeconvweights,allow_growth=allow_growth) #pass on path and file name for pose_cfg.yaml!
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/train.py", line 119, in train
    losses = pose_net(cfg).train(batch)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/nnet/pose_net.py", line 154, in train
    heads = self.get_net(batch[Batch.inputs])
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/nnet/pose_net.py", line 81, in get_net
    net, end_points = self.extract_features(inputs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/nnet/pose_net.py", line 52, in extract_features
    global_pool=False, output_stride=self.cfg.output_stride,is_training=False)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_v1.py", line 274, in resnet_v1_50
    scope=scope)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_v1.py", line 205, in resnet_v1
    net = resnet_utils.conv2d_same(net, 64, 7, stride=2, scope='conv1')
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_utils.py", line 146, in conv2d_same
    scope=scope)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
    return func(*args, **current_args)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1155, in convolution2d
    conv_dims=2)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
    return func(*args, **current_args)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1058, in convolution
    outputs = layer.apply(inputs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1227, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 530, in __call__
    outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 554, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/keras/layers/convolutional.py", line 194, in call
    outputs = self._convolution_op(inputs, self.kernel)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 966, in __call__
    return self.conv_op(inp, filter)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 591, in __call__
    return self.call(inp, filter)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 208, in __call__
    name=self.name)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 1026, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

UnknownError (see above for traceback): Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node resnet_v1_50/conv1/Conv2D (defined at /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/nnet/pose_net.py:52) ]]
	 [[node sigmoid_cross_entropy_loss/value (defined at /usr/local/bin/anaconda3/envs/DLC/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/nnet/pose_net.py:162) ]]
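
One thing that may be worth trying: `Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR` is often reported with TF 1.x when TensorFlow tries to reserve GPU memory up front. Whether that is the cause on this machine is an assumption, not something the log confirms. The traceback above shows `train()` being called with an `allow_growth` argument, so the sketch below passes `allow_growth=True` to `deeplabcut.train_network`. The names `retrain_with_allow_growth` and `train_fn` are hypothetical and exist only for this illustration.

```python
def retrain_with_allow_growth(path_config_file, train_fn=None):
    """Call the training function with on-demand GPU memory allocation.

    train_fn is a hypothetical hook so the helper can be exercised without
    DeepLabCut installed; by default it resolves to deeplabcut.train_network.
    """
    if train_fn is None:
        # Imported lazily: only needed when actually training.
        import deeplabcut
        train_fn = deeplabcut.train_network
    # Assumption: train_network accepts allow_growth, as the traceback suggests.
    return train_fn(path_config_file, allow_growth=True)
```

In `testscript.py` this would replace the plain `deeplabcut.train_network(path_config_file)` call. If the error persists with memory growth enabled, the cause is more likely a driver, CUDA, or cuDNN mismatch than memory allocation.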

Here’s a run of the CPU version. We seem to hit a snag with TensorFlow and ffmpeg during cropping. Trimmed text is marked with ellipses.

...
Starting with standard pose-dataset loader.
Initializing ResNet
WARNING:tensorflow:From /usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/ops/losses/losses_impl.py:209: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Loading ImageNet-pretrained resnet_50
WARNING:tensorflow:From /usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Training parameter:
{'stride': 8.0, 'weigh_part_predictions': False, 'weigh_negatives': False, 'fg_fraction': 0.25, 'weigh_only_present_joints': False, 'mean_pixel': [123.68, 116.779, 103.939], 'shuffle': True, 'snapshot_prefix': '/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/dlc-models/iteration-0/TESTJun17-trainset80shuffle1/train/snapshot', 'log_dir': 'log', 'global_scale': 0.8, 'location_refinement': True, 'locref_stdev': 7.2801, 'locref_loss_weight': 0.05, 'locref_huber_loss': True, 'optimizer': 'sgd', 'intermediate_supervision': False, 'intermediate_supervision_layer': 12, 'regularize': False, 'weight_decay': 0.0001, 'mirror': False, 'crop_pad': 0, 'scoremap_dir': 'test', 'batch_size': 1, 'dataset_type': 'default', 'deterministic': False, 'crop': True, 'cropratio': 0.4, 'minsize': 100, 'leftwidth': 400, 'rightwidth': 400, 'topheight': 400, 'bottomheight': 400, 'all_joints': [[0], [1], [2], [3]], 'all_joints_names': ['bodypart1', 'bodypart2', 'bodypart3', 'objectA'], 'dataset': 'training-datasets/iteration-0/UnaugmentedDataSet_TESTJun17/TEST_Alex80shuffle1.mat', 'display_iters': 2, 'init_weights': '/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/models/pretrained/resnet_v1_50.ckpt', 'max_input_size': 1500, 'metadataset': 'training-datasets/iteration-0/UnaugmentedDataSet_TESTJun17/Documentation_data-TEST_80shuffle1.pickle', 'min_input_size': 64, 'multi_step': [[0.001, 5]], 'net_type': 'resnet_50', 'num_joints': 4, 'pos_dist_thresh': 17, 'project_path': '/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17', 'save_iters': 5, 'scale_jitter_lo': 0.5, 'scale_jitter_up': 1.25, 'output_stride': 16, 'deconvolutionstride': 2}
Starting training....
iteration: 2 loss: 1.2054 lr: 0.001
iteration: 4 loss: 0.6511 lr: 0.001
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.CancelledError: Enqueue operation was cancelled
	 [[{{node fifo_queue_enqueue}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/train.py", line 81, in load_and_enqueue
    sess.run(enqueue_op, feed_dict=food)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.CancelledError: Enqueue operation was cancelled
	 [[node fifo_queue_enqueue (defined at /usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/train.py:67) ]]

Caused by op 'fifo_queue_enqueue', defined at:
  File "testscript.py", line 136, in <module>
    deeplabcut.train_network(path_config_file)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/training.py", line 132, in train_network
    train(str(poseconfigfile),displayiters,saveiters,maxiters,max_to_keep=max_snapshots_to_keep,keepdeconvweights=keepdeconvweights,allow_growth=allow_growth) #pass on path and file name for pose_cfg.yaml!
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/train.py", line 118, in train
    batch, enqueue_op, placeholders = setup_preloading(batch_spec)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/train.py", line 67, in setup_preloading
    enqueue_op = q.enqueue(placeholders_list)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/ops/data_flow_ops.py", line 345, in enqueue
    self._queue_ref, vals, name=scope)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 4158, in queue_enqueue_v2
    timeout_ms=timeout_ms, name=name)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

CancelledError (see above for traceback): Enqueue operation was cancelled
	 [[node fifo_queue_enqueue (defined at /usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/train.py:67) ]]


The network is now trained and ready to evaluate. Use the function 'evaluate_network' to evaluate the network.
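
Note that the log reports training as finished right after the `CancelledError`, and the exception is raised in `Thread-2` inside `load_and_enqueue`. That pattern suggests the background data-loading thread was stopped when the session closed, rather than training itself failing, though this reading is an inference from the log. A minimal sketch for checking a saved log for the same pattern follows; `enqueue_cancel_is_benign` is a hypothetical helper and the check is only a heuristic.

```python
def enqueue_cancel_is_benign(log_text):
    """Return True if the enqueue cancellation is followed by a 'trained' message.

    Heuristic only: it looks for the last cancellation message and then for
    a later line saying that the network finished training.
    """
    cancel = log_text.rfind("Enqueue operation was cancelled")
    done = log_text.find("The network is now trained", max(cancel, 0))
    return cancel != -1 and done != -1


# Short excerpt shaped like the CPU run above.
sample = (
    "iteration: 4 loss: 0.6511 lr: 0.001\n"
    "Exception in thread Thread-2:\n"
    "tensorflow.python.framework.errors_impl.CancelledError: "
    "Enqueue operation was cancelled\n"
    "The network is now trained and ready to evaluate.\n"
)
print(enqueue_cancel_is_benign(sample))
```

A log that contains the cuDNN handle error but no cancellation message returns `False` from this check, so it does not mask the GPU failure shown earlier.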
...
The network is evaluated and the results are stored in the subdirectory 'evaluation_results'.
If it generalizes well, choose the best model for prediction and update the config file with the appropriate index for the 'snapshotindex'.
Use the function 'analyze_video' to make predictions on new videos.
Otherwise consider retraining the network (see DeepLabCut workflow Fig 2)
CUT SHORT VIDEO AND ANALYZE (with dynamic cropping!)
Slicing and saving to name /home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/videos/reachingvideo1short.avi
ffmpeg version 4.1.4-1~deb10u1 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --prefix=/usr --extra-version='1~deb10u1' --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 22.100 / 56. 22.100
  libavcodec     58. 35.100 / 58. 35.100
  libavformat    58. 20.100 / 58. 20.100
  libavdevice    58.  5.100 / 58.  5.100
  libavfilter     7. 40.101 /  7. 40.101
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  3.100 /  5.  3.100
  libswresample   3.  3.100 /  3.  3.100
  libpostproc    55.  3.100 / 55.  3.100
Input #0, avi, from '/home/randy/src/DeepLabCut/examples/Reaching-Mackenzie-2018-08-30/videos/reachingvideo1.avi':
  Duration: 00:00:08.53, start: 0.000000, bitrate: 12642 kb/s
    Stream #0:0: Video: mjpeg (MJPG / 0x47504A4D), yuvj420p(pc, bt470bg/unknown/unknown), 832x747 [SAR 1:1 DAR 832:747], 12682 kb/s, 30 fps, 30 tbr, 30 tbn, 30 tbc
    Metadata:
      title           : ImageJ AVI     
Stream mapping:
  Stream #0:0 -> #0:0 (mjpeg (native) -> mpeg4 (native))
Press [q] to stop, [?] for help
[swscaler @ 0x55ca5d3d8400] deprecated pixel format used, make sure you did set range correctly
Output #0, avi, to '/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/videos/reachingvideo1short.avi':
  Metadata:
    ISFT            : Lavf58.20.100
    Stream #0:0: Video: mpeg4 (FMP4 / 0x34504D46), yuv420p, 832x747 [SAR 1:1 DAR 832:747], q=2-31, 200 kb/s, 30 fps, 30 tbn, 30 tbc
    Metadata:
      title           : ImageJ AVI     
      encoder         : Lavc58.35.100 mpeg4
    Side data:
      cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: -1
frame=   12 fps=0.0 q=24.4 Lsize=     195kB time=00:00:00.40 bitrate=3994.6kbits/s speed=3.89x    
video:189kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 3.140218%
Config:
...
CREATE VIDEO
Starting %  /home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/videos ['/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/videos/reachingvideo1short.avi']
Loading  /home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/videos/reachingvideo1short.avi and data.
12
Duration of video [s]:  0.4 , recorded with  30.0 fps!
Overall # of frames:  12 with cropped frame dimensions:  832 747
Generating frames and creating video.
100%|███████████████████████████████████████████| 12/12 [00:05<00:00,  2.29it/s]
All labeled frames were created, now generating video...
ffmpeg version 4.1.4-1~deb10u1 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --prefix=/usr --extra-version='1~deb10u1' --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 22.100 / 56. 22.100
  libavcodec     58. 35.100 / 58. 35.100
  libavformat    58. 20.100 / 58. 20.100
  libavdevice    58.  5.100 / 58.  5.100
  libavfilter     7. 40.101 /  7. 40.101
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  3.100 /  5.  3.100
  libswresample   3.  3.100 /  3.  3.100
  libpostproc    55.  3.100 / 55.  3.100
Input #0, image2, from 'file%02d.png':
  Duration: 00:00:00.40, start: 0.000000, bitrate: N/A
    Stream #0:0: Video: png, rgba(pc), 832x747 [SAR 3937:3937 DAR 832:747], 30 fps, 30 tbr, 30 tbn, 30 tbc
Stream mapping:
  Stream #0:0 -> #0:0 (png (native) -> h264 (libx264))
Press [q] to stop, [?] for help
[libx264 @ 0x555a5abe99c0] using SAR=1/1
[libx264 @ 0x555a5abe99c0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
[libx264 @ 0x555a5abe99c0] profile High 4:4:4 Predictive, level 3.1, 4:4:4 8-bit
[libx264 @ 0x555a5abe99c0] 264 - core 155 r2917 0a84d98 - H.264/MPEG-4 AVC codec - Copyleft 2003-2018 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x1:0x111 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=0 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=4 threads=23 lookahead_threads=3 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to '../reachingvideo1shortDLC_resnet50_TESTJun17shuffle1_5_labeled.mp4':
  Metadata:
    encoder         : Lavf58.20.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv444p, 832x747 [SAR 1:1 DAR 832:747], q=-1--1, 30 fps, 15360 tbn, 30 tbc
    Metadata:
      encoder         : Lavc58.35.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
frame=   12 fps=0.0 q=-1.0 Lsize=      75kB time=00:00:00.30 bitrate=2053.4kbits/s speed=1.01x    
video:74kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.247519%
[libx264 @ 0x555a5abe99c0] frame I:1     Avg QP:21.60  size: 18577
[libx264 @ 0x555a5abe99c0] frame P:8     Avg QP:23.49  size:  5891
[libx264 @ 0x555a5abe99c0] frame B:3     Avg QP:26.00  size:  3226
[libx264 @ 0x555a5abe99c0] consecutive B-frames: 66.7%  0.0%  0.0% 33.3%
[libx264 @ 0x555a5abe99c0] mb I  I16..4: 66.9%  0.0% 33.1%
[libx264 @ 0x555a5abe99c0] mb P  I16..4:  8.4%  0.0%  4.1%  P16..4: 26.0%  6.2%  2.5%  0.0%  0.0%    skip:52.8%
[libx264 @ 0x555a5abe99c0] mb B  I16..4:  1.1%  0.0%  0.8%  B16..8: 43.7%  3.3%  0.6%  direct: 1.1%  skip:49.4%  L0:39.6% L1:58.2% BI: 2.2%
[libx264 @ 0x555a5abe99c0] coded y,u,v intra: 30.3% 0.8% 0.8% inter: 8.8% 0.1% 0.1%
[libx264 @ 0x555a5abe99c0] i16 v,h,dc,p: 54% 19% 14% 14%
[libx264 @ 0x555a5abe99c0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 31% 16% 24%  4%  6%  8%  5%  4%  3%
[libx264 @ 0x555a5abe99c0] Weighted P-Frames: Y:12.5% UV:12.5%
[libx264 @ 0x555a5abe99c0] ref P L0: 77.9% 10.4%  9.1%  2.7%
[libx264 @ 0x555a5abe99c0] ref B L0: 93.8%  6.2%
[libx264 @ 0x555a5abe99c0] ref B L1: 91.3%  8.7%
[libx264 @ 0x555a5abe99c0] kb/s:1507.64
Making plots
/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/videos/reachingvideo1short.avi
Starting %  /home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/videos /home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/videos/reachingvideo1short.avi
Loading  /home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/videos/reachingvideo1short.avi and data.
/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/videos  already exists!
Plots created! Please check the directory "plot-poses" within the video directory
EXTRACT OUTLIERS
Method  jump  found  11  putative outlier frames.
Do you want to proceed with extracting  5  of those?
If this list is very large, perhaps consider changing the paramters (start, stop, epsilon, comparisonbodyparts) or use a different method.
Loading video...
Duration of video [s]:  0.4 , recorded @  30.0 fps!
Overall # of frames:  12 with (cropped) frame dimensions: 
Kmeans-quantization based extracting of frames from 0.0  seconds to 0.4  seconds.
Extracting and downsampling... 11  frames from the video.
11it [00:00, 122.09it/s]
Kmeans clustering ... (this might take a while)
Let's select frames indices: [1, 9, 2, 7, 5]
New video was added to the project! Use the function 'extract_frames' to select frames for labeling.
The outlier frames are extracted. They are stored in the subdirectory labeled-data\reachingvideo1short.
Once you extracted frames for all videos, use 'refine_labels' to manually correct the labels.
Method  Fitting  found  0  putative outlier frames.
Do you want to proceed with extracting  5  of those?
Frames from video reachingvideo1short  already extracted (more will be added)!
Loading video...
Duration of video [s]:  0.4 , recorded @  30.0 fps!
Overall # of frames:  12 with (cropped) frame dimensions: 
Kmeans-quantization based extracting of frames from 0.0  seconds to 0.4  seconds.
Let's select frames indices: []
No frames were extracted.
RELABELING
MERGING
Merged data sets and updated refinement iteration to 1.
Now you can create a new training set for the expanded annotated images (use create_training_dataset).
CREATING TRAININGSET
The training dataset is successfully created. Use the function 'train_network' to start training. Happy training!
CHANGING training parameters to end quickly!
TRAIN
Config:
{'Task': None,
 'TrainingFraction': None,
 'all_joints': [[0], [1], [2], [3]],
 'all_joints_names': ['bodypart1', 'bodypart2', 'bodypart3', 'objectA'],
 'alphavalue': None,
 'batch_size': 1,
 'bodyparts': None,
 'bottomheight': 400,
 'colormap': None,
 'corner2move2': None,
 'crop': True,
 'crop_pad': 0,
 'cropping': None,
 'cropratio': 0.4,
 'dataset': 'training-datasets/iteration-1/UnaugmentedDataSet_TESTJun17/TEST_Alex80shuffle1.mat',
 'dataset_type': 'imgaug',
 'date': None,
 'deconvolutionstride': 2,
 'deterministic': False,
 'display_iters': 1,
 'dotsize': None,
 'fg_fraction': 0.25,
 'global_scale': 0.8,
 'init_weights': '/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/models/pretrained/resnet_v1_50.ckpt',
 'intermediate_supervision': False,
 'intermediate_supervision_layer': 12,
 'iteration': None,
 'leftwidth': 400,
 'location_refinement': True,
 'locref_huber_loss': True,
 'locref_loss_weight': 0.05,
 'locref_stdev': 7.2801,
 'log_dir': 'log',
 'max_input_size': 1500,
 'mean_pixel': [123.68, 116.779, 103.939],
 'metadataset': 'training-datasets/iteration-1/UnaugmentedDataSet_TESTJun17/Documentation_data-TEST_80shuffle1.pickle',
 'min_input_size': 64,
 'minsize': 100,
 'mirror': False,
 'move2corner': None,
 'multi_step': [[0.001, 5]],
 'net_type': 'resnet_50',
 'num_joints': 4,
 'num_outputs': 1,
 'numframes2pick': None,
 'optimizer': 'sgd',
 'output_stride': 16,
 'pcutoff': None,
 'pos_dist_thresh': 17,
 'project_path': '/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17',
 'regularize': False,
 'resnet': None,
 'rightwidth': 400,
 'save_iters': 5,
 'scale_jitter_lo': 0.5,
 'scale_jitter_up': 1.25,
 'scoremap_dir': 'test',
 'scorer': None,
 'shuffle': True,
 'skeleton': [],
 'skeleton_color': 'black',
 'snapshot_prefix': '/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/dlc-models/iteration-1/TESTJun17-trainset80shuffle1/train/snapshot',
 'snapshotindex': None,
 'start': None,
 'stop': None,
 'stride': 8.0,
 'topheight': 400,
 'video_sets': None,
 'weigh_negatives': False,
 'weigh_only_present_joints': False,
 'weigh_part_predictions': False,
 'weight_decay': 0.0001,
 'x1': None,
 'x2': None,
 'y1': None,
 'y2': None}
Starting with imgaug pose-dataset loader.
Batch Size is 1
Initializing ResNet
Loading ImageNet-pretrained resnet_50
Training parameter:
{'stride': 8.0, 'weigh_part_predictions': False, 'weigh_negatives': False, 'fg_fraction': 0.25, 'weigh_only_present_joints': False, 'mean_pixel': [123.68, 116.779, 103.939], 'shuffle': True, 'snapshot_prefix': '/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/dlc-models/iteration-1/TESTJun17-trainset80shuffle1/train/snapshot', 'log_dir': 'log', 'global_scale': 0.8, 'location_refinement': True, 'locref_stdev': 7.2801, 'locref_loss_weight': 0.05, 'locref_huber_loss': True, 'optimizer': 'sgd', 'intermediate_supervision': False, 'intermediate_supervision_layer': 12, 'regularize': False, 'weight_decay': 0.0001, 'mirror': False, 'crop_pad': 0, 'scoremap_dir': 'test', 'batch_size': 1, 'dataset_type': 'imgaug', 'deterministic': False, 'crop': True, 'cropratio': 0.4, 'minsize': 100, 'leftwidth': 400, 'rightwidth': 400, 'topheight': 400, 'bottomheight': 400, 'all_joints': [[0], [1], [2], [3]], 'all_joints_names': ['bodypart1', 'bodypart2', 'bodypart3', 'objectA'], 'dataset': 'training-datasets/iteration-1/UnaugmentedDataSet_TESTJun17/TEST_Alex80shuffle1.mat', 'display_iters': 1, 'init_weights': '/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/models/pretrained/resnet_v1_50.ckpt', 'max_input_size': 1500, 'metadataset': 'training-datasets/iteration-1/UnaugmentedDataSet_TESTJun17/Documentation_data-TEST_80shuffle1.pickle', 'min_input_size': 64, 'multi_step': [[0.001, 5]], 'net_type': 'resnet_50', 'num_joints': 4, 'pos_dist_thresh': 17, 'project_path': '/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17', 'save_iters': 5, 'scale_jitter_lo': 0.5, 'scale_jitter_up': 1.25, 'output_stride': 16, 'deconvolutionstride': 2, 'num_outputs': 1, 'Task': None, 'scorer': None, 'date': None, 'video_sets': None, 'bodyparts': None, 'start': None, 'stop': None, 'numframes2pick': None, 'skeleton': [], 'skeleton_color': 'black', 'pcutoff': None, 'dotsize': None, 'alphavalue': None, 'colormap': None, 'TrainingFraction': None, 
'iteration': None, 'resnet': None, 'snapshotindex': None, 'cropping': None, 'x1': None, 'x2': None, 'y1': None, 'y2': None, 'corner2move2': None, 'move2corner': None}
Starting training....
iteration: 1 loss: 1.7106 lr: 0.001
iteration: 2 loss: 0.7449 lr: 0.001
iteration: 3 loss: 0.6459 lr: 0.001
iteration: 4 loss: 0.5601 lr: 0.001
iteration: 5 loss: 0.5124 lr: 0.001
Exception in thread Thread-9:
Traceback (most recent call last):
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.CancelledError: Enqueue operation was cancelled
	 [[{{node fifo_queue_enqueue}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/train.py", line 81, in load_and_enqueue
    sess.run(enqueue_op, feed_dict=food)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.CancelledError: Enqueue operation was cancelled
	 [[node fifo_queue_enqueue (defined at /usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/train.py:67) ]]

Caused by op 'fifo_queue_enqueue', defined at:
  File "testscript.py", line 265, in <module>
    deeplabcut.train_network(path_config_file)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/training.py", line 132, in train_network
    train(str(poseconfigfile),displayiters,saveiters,maxiters,max_to_keep=max_snapshots_to_keep,keepdeconvweights=keepdeconvweights,allow_growth=allow_growth) #pass on path and file name for pose_cfg.yaml!
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/train.py", line 118, in train
    batch, enqueue_op, placeholders = setup_preloading(batch_spec)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/train.py", line 67, in setup_preloading
    enqueue_op = q.enqueue(placeholders_list)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/ops/data_flow_ops.py", line 345, in enqueue
    self._queue_ref, vals, name=scope)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 4158, in queue_enqueue_v2
    timeout_ms=timeout_ms, name=name)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

CancelledError (see above for traceback): Enqueue operation was cancelled
	 [[node fifo_queue_enqueue (defined at /usr/local/bin/anaconda3/envs/DLC-CPU2/lib/python3.6/site-packages/deeplabcut/pose_estimation_tensorflow/train.py:67) ]]


The network is now trained and ready to evaluate. Use the function 'evaluate_network' to evaluate the network.
Slicing and saving to name /home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/videos/reachingvideo1short2.avi
ffmpeg version 4.1.4-1~deb10u1 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --prefix=/usr --extra-version='1~deb10u1' --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 22.100 / 56. 22.100
  libavcodec     58. 35.100 / 58. 35.100
  libavformat    58. 20.100 / 58. 20.100
  libavdevice    58.  5.100 / 58.  5.100
  libavfilter     7. 40.101 /  7. 40.101
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  3.100 /  5.  3.100
  libswresample   3.  3.100 /  3.  3.100
  libpostproc    55.  3.100 / 55.  3.100
Input #0, avi, from '/home/randy/src/DeepLabCut/examples/Reaching-Mackenzie-2018-08-30/videos/reachingvideo1.avi':
  Duration: 00:00:08.53, start: 0.000000, bitrate: 12642 kb/s
    Stream #0:0: Video: mjpeg (MJPG / 0x47504A4D), yuvj420p(pc, bt470bg/unknown/unknown), 832x747 [SAR 1:1 DAR 832:747], 12682 kb/s, 30 fps, 30 tbr, 30 tbn, 30 tbc
    Metadata:
      title           : ImageJ AVI     
Stream mapping:
  Stream #0:0 -> #0:0 (mjpeg (native) -> mpeg4 (native))
Press [q] to stop, [?] for help
[swscaler @ 0x55741c75e2c0] deprecated pixel format used, make sure you did set range correctly
Output #0, avi, to '/home/randy/src/DeepLabCut/examples/TEST-Alex-2020-06-17/videos/reachingvideo1short2.avi':
  Metadata:
    ISFT            : Lavf58.20.100
    Stream #0:0: Video: mpeg4 (FMP4 / 0x34504D46), yuv420p, 832x747 [SAR 1:1 DAR 832:747], q=2-31, 200 kb/s, 30 fps, 30 tbn, 30 tbc
    Metadata:
      title           : ImageJ AVI     
      encoder         : Lavc58.35.100 mpeg4
    Side data:
      cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: -1
frame=   12 fps=0.0 q=24.4 Lsize=     195kB time=00:00:00.40 bitrate=3994.6kbits/s speed= 4.2x    
video:189kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 3.140218%
Inference with direct cropping
Traceback (most recent call last):
  File "testscript.py", line 299, in <module>
    cropping=[0, 50, 0, 50],
TypeError: analyze_videos() got an unexpected keyword argument 'cropping'

Installed packages for CPU run:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
absl-py                   0.9.0                    pypi_0    pypi
astor                     0.8.1                    pypi_0    pypi
backcall                  0.2.0                      py_0  
ca-certificates           2020.1.1                      0    anaconda
cairo                     1.14.12              h8948797_3    anaconda
certifi                   2020.4.5.2               py36_0    anaconda
chardet                   3.0.4                    pypi_0    pypi
click                     7.1.2                    pypi_0    pypi
cycler                    0.10.0                   pypi_0    pypi
decorator                 4.4.2                      py_0  
deeplabcut                2.1.8.2                  pypi_0    pypi
easydict                  1.9                      pypi_0    pypi
expat                     2.2.6                he6710b0_0    anaconda
fontconfig                2.13.0               h9420a91_0    anaconda
freetype                  2.9.1                h8a8886c_1    anaconda
fribidi                   1.0.5                h7b6447c_0    anaconda
gast                      0.3.3                    pypi_0    pypi
gettext                   0.19.8.1             h9b4dc7a_1    anaconda
glib                      2.56.2               hd408876_0    anaconda
graphite2                 1.3.13               h23475e2_0    anaconda
grpcio                    1.29.0                   pypi_0    pypi
gst-plugins-base          1.14.0               hbbd80ab_1    anaconda
gstreamer                 1.14.0               hb453b48_1    anaconda
h5py                      2.10.0                   pypi_0    pypi
harfbuzz                  1.8.8                hffaf4a1_0    anaconda
icu                       58.2                 he6710b0_3    anaconda
idna                      2.9                      pypi_0    pypi
imageio                   2.8.0                    pypi_0    pypi
imageio-ffmpeg            0.4.2                    pypi_0    pypi
imgaug                    0.4.0                    pypi_0    pypi
importlib-metadata        1.6.1                    pypi_0    pypi
intel-openmp              2020.0.133               pypi_0    pypi
ipython                   7.15.0                   py36_0  
ipython_genutils          0.2.0                    py36_0  
jedi                      0.17.0                   py36_0  
joblib                    0.15.1                   pypi_0    pypi
jpeg                      9b                   habf39ab_1    anaconda
keras-applications        1.0.8                    pypi_0    pypi
keras-preprocessing       1.1.2                    pypi_0    pypi
kiwisolver                1.2.0                    pypi_0    pypi
ld_impl_linux-64          2.33.1               h53a641e_7  
libedit                   3.1.20181209         hc058e9b_0  
libffi                    3.3                  he6710b0_1  
libgcc-ng                 9.1.0                hdf63c60_0  
libglu                    9.0.0                hf484d3e_1    anaconda
libpng                    1.6.37               hbc83047_0    anaconda
libstdcxx-ng              9.1.0                hdf63c60_0  
libuuid                   1.0.3                h1bed415_2    anaconda
libxcb                    1.13                 h1bed415_1    anaconda
libxml2                   2.9.10               he19cac6_1    anaconda
markdown                  3.2.2                    pypi_0    pypi
matplotlib                3.0.3                    pypi_0    pypi
mock                      4.0.2                    pypi_0    pypi
moviepy                   1.0.1                    pypi_0    pypi
msgpack                   1.0.0                    pypi_0    pypi
msgpack-numpy             0.4.6.post0              pypi_0    pypi
ncurses                   6.2                  he6710b0_1  
networkx                  2.4                      pypi_0    pypi
numexpr                   2.7.1                    pypi_0    pypi
numpy                     1.16.4                   pypi_0    pypi
opencv-python             3.4.9.33                 pypi_0    pypi
openssl                   1.1.1g               h7b6447c_0    anaconda
pandas                    1.0.4                    pypi_0    pypi
pango                     1.42.4               h049681c_0    anaconda
parso                     0.7.0                      py_0  
patsy                     0.5.1                    pypi_0    pypi
pcre                      8.43                 he6710b0_0    anaconda
pexpect                   4.8.0                    py36_0  
pickleshare               0.7.5                    py36_0  
pillow                    7.1.2                    pypi_0    pypi
pip                       20.1.1                   py36_1  
pixman                    0.38.0               h7b6447c_0    anaconda
proglog                   0.1.9                    pypi_0    pypi
prompt-toolkit            3.0.5                      py_0  
protobuf                  3.12.2                   pypi_0    pypi
psutil                    5.7.0                    pypi_0    pypi
ptyprocess                0.6.0                    py36_0  
pygments                  2.6.1                      py_0  
pyparsing                 2.4.7                    pypi_0    pypi
python                    3.6.10               h7579374_2  
python-dateutil           2.8.1                    pypi_0    pypi
pytz                      2020.1                   pypi_0    pypi
pywavelets                1.1.1                    pypi_0    pypi
pyyaml                    5.3.1                    pypi_0    pypi
pyzmq                     19.0.1                   pypi_0    pypi
readline                  8.0                  h7b6447c_0  
requests                  2.24.0                   pypi_0    pypi
ruamel-yaml               0.16.10                  pypi_0    pypi
ruamel-yaml-clib          0.2.0                    pypi_0    pypi
scikit-image              0.17.2                   pypi_0    pypi
scikit-learn              0.23.1                   pypi_0    pypi
scipy                     1.4.1                    pypi_0    pypi
setuptools                47.3.0                   py36_0  
shapely                   1.7.0                    pypi_0    pypi
six                       1.15.0                     py_0  
sqlite                    3.31.1               h62c20be_1  
statsmodels               0.11.1                   pypi_0    pypi
tables                    3.6.1                    pypi_0    pypi
tabulate                  0.8.7                    pypi_0    pypi
tensorboard               1.13.1                   pypi_0    pypi
tensorflow                1.13.2                   pypi_0    pypi
tensorflow-estimator      1.13.0                   pypi_0    pypi
tensorpack                0.10.1                   pypi_0    pypi
termcolor                 1.1.0                    pypi_0    pypi
threadpoolctl             2.1.0                    pypi_0    pypi
tifffile                  2020.6.3                 pypi_0    pypi
tk                        8.6.8                hbc83047_0  
tqdm                      4.46.1                   pypi_0    pypi
traitlets                 4.3.3                    py36_0  
urllib3                   1.25.9                   pypi_0    pypi
wcwidth                   0.2.4                      py_0  
werkzeug                  1.0.1                    pypi_0    pypi
wheel                     0.34.2                   py36_0  
wxpython                  4.0.4            py36hc99224d_0    anaconda
xz                        5.2.5                h7b6447c_0  
zipp                      3.1.0                    pypi_0    pypi
zlib                      1.2.11               h7b6447c_3 

Great. Alright, for both, ffmpeg should be installed (typically it’s installed as a dependency of other packages, but if not, please install ffmpeg).

The cuDNN error comes from the TensorFlow/CUDA stack, so it is really hard to debug (see the massive list of results if you search for: tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR).

I would check nvidia-smi and be sure you have packages that work together; here is my install guide for ubuntu, if helpful. https://github.com/DeepLabCut/Docker4DeepLabCut2.0/wiki/Installation-of-NVIDIA-driver-and-CUDA-10

I also have never used TF 1.13.2; I would perhaps try a conda install of tensorflow 1.13.1 as well. Is there a reason you don’t use Anaconda? I would definitely recommend it, especially for TF (or, of course, the best option is Docker, if you know how to use it).
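To see at a glance whether the GPU-related packages come from conda or pip, the `conda list` output posted above can be summarized with a few lines of Python. This is only a sketch: `parse_conda_list` and `gpu_stack_summary` are hypothetical helper names, the sample text is copied from the package lists in this thread, and the only rule applied is that `conda list` shows `pypi` in the Channel column for pip-installed packages.

```python
def parse_conda_list(text):
    """Map package name -> (version, build, channel) from `conda list` output."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header row
        fields = line.split()
        if len(fields) < 2:
            continue
        name, version = fields[0], fields[1]
        build = fields[2] if len(fields) > 2 else ""
        channel = fields[3] if len(fields) > 3 else ""
        packages[name] = (version, build, channel)
    return packages


def gpu_stack_summary(packages, names=("tensorflow", "tensorflow-gpu",
                                       "cudatoolkit", "cudnn", "numpy")):
    """Describe where each GPU-relevant package came from (pip or conda)."""
    lines = []
    for name in names:
        if name not in packages:
            lines.append(f"{name} not installed")
            continue
        version, _build, channel = packages[name]
        source = "pip" if channel == "pypi" else "conda"
        lines.append(f"{name} {version} ({source})")
    return lines


# Sample rows copied from the package lists posted in this thread.
SAMPLE = """\
# Name                    Version                   Build  Channel
cudatoolkit               10.0.130                      0
cudnn                     7.6.5                cuda10.0_0
numpy                     1.16.4                   pypi_0    pypi
tensorflow                1.13.2                   pypi_0    pypi
"""

if __name__ == "__main__":
    for row in gpu_stack_summary(parse_conda_list(SAMPLE)):
        print(row)
```

In practice you would paste the real `conda list` output in place of `SAMPLE`. The summary does not judge compatibility; it only makes a mixed pip/conda stack easy to spot.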

Thanks @MWMathis ! We are running anaconda3, which we were already using for other environments. Is there something else you meant by your anaconda question that I’m missing?

Your Ubuntu guide shows a download of CUDA 10.1, but the DLC github site states that CUDA 10.1 and 10.2 are incompatible with DLC. Are they safe to use now? 10.2 is the easiest for us to have installed at the OS level.

Hey there, I was just commenting that it looks like TF was installed via pypi vs. conda forge; my own experience has been a little touch and go on that. You might just try a fresh env then, if you already have CUDA 10 installed; 10.1 is okay for TF 1.13.1 and 1.14 (the only versions I have tried).

I would go for:

conda create -n DLC-TF python=3.7 tensorflow-gpu=1.13.1

(install this, then activate, then):

pip install deeplabcut==2.2b6

Are you using the GUIs, or running this headless?

(If GUI, then you of course need to install wxPython; be sure to get the correct wheel for Debian.)

OIC. We had seen you using pip to install tensorflow in days gone by (perhaps a paper?) and thought it best to replicate. We’ll switch to conda forge. CUDA 10.1 is fine. We’ll try–thanks!

Yes, we’re using the GUIs and wxpython. That part has not troubled us.

It is generally fine, but since you’re having some issues, I would just recommend this route to troubleshoot, good luck! :smiley:

OK, bringing up the OS installation of CUDA from 10.0 to 10.1 was the first time DLC succeeded in engaging the GPU - probably a fundamental card/driver/CUDA issue - BUT there were errors trying to launch cudnn. Tested various combinations of versions of the environment’s CUDA package, TF, and cudnn. TF 1.13.1 with cudnn 7.3.1 finally got the GPU running, and testscript.py seemed to run fully. This is the furthest we’ve gotten. We’re trying to run some real videos now. Many thanks! :smiley:
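For anyone retracing these steps, a quick check of what TensorFlow itself reports can save a full DLC run. The sketch below assumes a TF 1.x environment like the ones in this thread; `gpu_report` is a hypothetical helper name, and it relies only on `tf.test.is_built_with_cuda()` and `tf.test.is_gpu_available()`.

```python
def gpu_report():
    """Return a small dict describing what TensorFlow can see."""
    try:
        import tensorflow as tf
    except Exception as exc:  # missing or broken install
        return {"tensorflow": None, "error": repr(exc)}

    return {
        "tensorflow": tf.__version__,
        # True if this TF build was compiled with CUDA support.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # True if TF can actually open a GPU device right now.
        "gpu_available": tf.test.is_gpu_available(),
    }


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

This only confirms that the device is visible. The cuDNN handle error appears once a convolution actually runs, which is consistent with a plain matrix multiplication succeeding on the GPU while training failed, so a clean report here is necessary but not sufficient.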

There are still warnings about deprecated numpy functions. Is there a preferred version? Presently numpy 1.16.4, numpy-base 1.18.1.
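With a pip-installed numpy 1.16.4 and a conda numpy-base 1.18.1 side by side, it may help to confirm which copy Python actually imports. A minimal sketch, where `imported_version` is a hypothetical helper name:

```python
import importlib


def imported_version(module_name):
    """Return (version, file path) of the module Python imports, or None."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", "unknown")
    return version, location


if __name__ == "__main__":
    # The path shows which site-packages copy wins at import time.
    print("numpy:", imported_version("numpy"))
```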

This also caught my eye:

Starting with tensorpack pose-dataset loader.
[0617 23:29:35 @develop.py:109] WRN [Deprecated] GaussianBlur(max_size=) will be deprecated after 01 Sep. Use size_range= instead!
[0617 23:29:35 @parallel.py:339] [MultiProcessRunnerZMQ] Will fork a dataflow more than one times. This assumes the datapoints are i.i.d.
[0617 23:29:35 @argtools.py:138] WRN Starting a process with 'fork' method is not safe and may consume unnecessary extra CPU memory. Use 'forkserver' or 'spawn' method (available after Py3.4) instead if you run into any issues. See https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods on how to set them.
[0617 23:29:35 @argtools.py:138] WRN "import prctl" failed! Install python-prctl so that processes can be cleaned with guarantee.
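The third warning above points at Python's multiprocessing start methods. As a sketch of what it suggests, the start method can be chosen once at the top of a script, before any worker processes exist. `pick_start_method` is a hypothetical helper name, and whether the tensorpack loader in DLC behaves well under `forkserver` or `spawn` is an untested assumption; the warning itself only recommends the change "if you run into any issues".

```python
import multiprocessing as mp


def pick_start_method(preferred=("forkserver", "spawn")):
    """Return the first preferred start method this platform supports."""
    available = mp.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    # Fall back to whatever the platform default is.
    return mp.get_start_method()


if __name__ == "__main__":
    # Must run once, before any worker processes are created.
    mp.set_start_method(pick_start_method(), force=True)
    print("multiprocessing start method:", mp.get_start_method())
```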

And this CancelledError also happens more than once:

iteration: 10 loss: 0.2026 lr: 0.001
2020-06-17 23:29:59.831929: W tensorflow/core/kernels/queue_base.cc:277] _2_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
Exception in thread Thread-14:
Traceback (most recent call last):
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.CancelledError: Enqueue operation was cancelled
	 [[{{node fifo_queue_enqueue}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/deeplabcut/pose_estimation_tensorflow/train.py", line 91, in load_and_enqueue
    sess.run(enqueue_op, feed_dict=food)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.CancelledError: Enqueue operation was cancelled
	 [[node fifo_queue_enqueue (defined at /usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/deeplabcut/pose_estimation_tensorflow/train.py:77) ]]

Caused by op 'fifo_queue_enqueue', defined at:
  File "/usr/local/bin/anaconda3/envs/DLC/bin/ipython", line 8, in <module>
    sys.exit(start_ipython())
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/IPython/__init__.py", line 126, in start_ipython
    return launch_new_instance(argv=argv, **kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/traitlets/config/application.py", line 663, in launch_instance
    app.initialize(argv)
  File "<decorator-gen-113>", line 2, in initialize
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/traitlets/config/application.py", line 87, in catch_config_error
    return method(app, *args, **kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/IPython/terminal/ipapp.py", line 323, in initialize
    self.init_code()
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/IPython/core/shellapp.py", line 300, in init_code
    self._run_cmd_line_code()
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/IPython/core/shellapp.py", line 424, in _run_cmd_line_code
    self._exec_file(fname, shell_futures=True)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/IPython/core/shellapp.py", line 352, in _exec_file
    raise_exceptions=True)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 2722, in safe_execfile
    self.compile if shell_futures else None)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/IPython/utils/py3compat.py", line 168, in execfile
    exec(compiler(f.read(), fname, 'exec'), glob, loc)
  File "/home/randy/src/DeepLabCut/examples/testscript.py", line 352, in <module>
    deeplabcut.train_network(path_config_file, shuffle=2, allow_growth=True)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/deeplabcut/pose_estimation_tensorflow/training.py", line 189, in train_network
    allow_growth=allow_growth,
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/deeplabcut/pose_estimation_tensorflow/train.py", line 171, in train
    batch, enqueue_op, placeholders = setup_preloading(batch_spec)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/deeplabcut/pose_estimation_tensorflow/train.py", line 77, in setup_preloading
    enqueue_op = q.enqueue(placeholders_list)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/tensorflow/python/ops/data_flow_ops.py", line 345, in enqueue
    self._queue_ref, vals, name=scope)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 4158, in queue_enqueue_v2
    timeout_ms=timeout_ms, name=name)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

CancelledError (see above for traceback): Enqueue operation was cancelled
	 [[node fifo_queue_enqueue (defined at /usr/local/bin/anaconda3/envs/DLC/lib/python3.7/site-packages/deeplabcut/pose_estimation_tensorflow/train.py:77) ]]

The CancelledError is just what happens when the TF session is closed at the end of training, so not to worry.
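Judging from the traceback, the message comes from the background `load_and_enqueue` thread, which is still trying to push a batch into the queue when training finishes. The pattern can be illustrated with a pure-Python analogy. This is not DLC's or TensorFlow's code: the queue, the `closed` flag, and the `train` function here are stand-ins, and the cancellation is modeled as a simple flag check rather than a raised exception.

```python
import queue
import threading


def load_and_enqueue(q, closed, log):
    """Feed batches into a bounded queue until the consumer shuts down."""
    batch = 0
    while not closed.is_set():
        try:
            q.put(batch, timeout=0.01)  # blocks while the queue is full
            batch += 1
        except queue.Full:
            pass  # consumer is busy or gone; re-check the flag
    # The feeder was interrupted mid-stream, like the cancelled enqueue op.
    log.append("Enqueue operation was cancelled")


def train(max_iters=5):
    """Consume max_iters batches, then shut the feeder thread down."""
    q = queue.Queue(maxsize=2)
    closed = threading.Event()
    log = []
    worker = threading.Thread(target=load_and_enqueue, args=(q, closed, log))
    worker.start()
    consumed = [q.get() for _ in range(max_iters)]  # the "training loop"
    closed.set()  # analogous to closing the session
    worker.join()
    return consumed, log


if __name__ == "__main__":
    consumed, log = train()
    print("batches used:", consumed)
    print("feeder said:", log)
```

All requested iterations complete before the feeder reports the cancellation, which matches the logs above: the loss is printed for every iteration and only then does the error appear.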

Glad to hear it’s up and going now for you!

Thanks so much for all your help!

No problem! Glad you’re using DLC :slight_smile: Thank you - hope all is well at CU!