Error when adding new bodyparts and retraining network

Hi All,

I am still relatively new to DeepLabCut, so I apologise if I don't provide the right or sufficient information in this post; if that is the case, please let me know. Thank you in advance for your help! :slight_smile:

OS: Windows 10
DeepLabCut Version: 2.2b7
Anaconda env used: DLC-GPU (Spyder 4.1.4)
Python: 3.7.7
GPU: Nvidia Geforce GTX 1050

Problem: I have an existing project (one video, <100 frames, with the pre-trained ResNet-101 full_human model; it works with the pre-defined body parts). I added 3 new bodyparts in the config file (since I want to analyse gait from the sagittal plane, I turned the markers for the right-side body parts into a comment), then extracted frames, labelled them, checked the labels, created a training set, and changed the pose_cfg.yaml to resnet_101. But once I run the training I get an error, and as a consequence I can then not analyse videos.
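
For reference, after the edit the bodyparts list in the config ends up as the eleven points that also appear as all_joints_names in the training output below; a minimal PyYAML check (the file itself was edited by hand) prints them:

import yaml

# sanity check of the hand-edited project config
config_path = 'C:/Users/danie/Videos/Gait Analysis Project/Project/Full Model-Daniel Pauser-2020-09-15/config.yaml'
with open(config_path) as f:
    cfg = yaml.safe_load(f)

print(cfg['bodyparts'])
# -> ['hip21', 'hip2', 'knee2', 'ankle2', 'heel2', 'toe2',
#     'shoulder2', 'elbow2', 'wrist2', 'chin', 'forehead']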

So far I have tried several different changes to the bodyparts in the config file.

Code to create the model:

import deeplabcut

video_path_base_video = 'C:/Users/danie/Videos/Gait Analysis Project/Project/New Videos/New Video3/Video3.mp4'
ProjectFolderName = 'Full Model'
YourName = 'Daniel Pauser'
model2use = 'full_human'
videotype_creation = 'mp4'
# path_model_creation (the working directory for the new project) is defined earlier in my script
model1 = deeplabcut.create_pretrained_project(ProjectFolderName, YourName, [video_path_base_video], model=model2use, working_directory=path_model_creation, copy_videos=True, videotype=videotype_creation, analyzevideo=True)

Code to change labels:

initial_config_path = 'C:/Users/danie/Videos/Gait Analysis Project/Project/Full Model-Daniel Pauser-2020-09-15/config.yaml'

deeplabcut.extract_frames(initial_config_path)

deeplabcut.label_frames(initial_config_path)

deeplabcut.check_labels(initial_config_path)

deeplabcut.create_training_dataset(initial_config_path)

deeplabcut.train_network(initial_config_path, maxiters=1000, saveiters=1000)

I had also changed the maxiters and saveiters in the pose_cfg.yaml file, but I wanted to make sure I have a short run-time for this test (hence also passing them to train_network).

I have enclosed my config file:
config.yaml (1.4 KB)

Resulting output from training:

deeplabcut.train_network(initial_config_path, maxiters=1000, saveiters=1000)
Config:
{‘all_joints’: [[0], [1], [2], [3], [4], [5], [6], [7], [8], [9], [10]],
‘all_joints_names’: [‘hip21’,
‘hip2’,
‘knee2’,
‘ankle2’,
‘heel2’,
‘toe2’,
‘shoulder2’,
‘elbow2’,
‘wrist2’,
‘chin’,
‘forehead’],
‘batch_size’: 1,
‘bottomheight’: 400,
‘crop’: True,
‘crop_pad’: 0,
‘cropratio’: 0.4,
‘dataset’: 'training-datasets\iteration-0\UnaugmentedDataSet_Full ’
‘ModelSep15\Full Model_Daniel Pauser95shuffle1.mat’,
‘dataset_type’: ‘default’,
‘deterministic’: False,
‘display_iters’: 1000,
‘fg_fraction’: 0.25,
‘global_scale’: 0.8,
‘init_weights’: ‘C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\deeplabcut\pose_estimation_tensorflow\models\pretrained\resnet_v1_101.ckpt’,
‘intermediate_supervision’: False,
‘intermediate_supervision_layer’: 12,
‘leftwidth’: 400,
‘location_refinement’: True,
‘locref_huber_loss’: True,
‘locref_loss_weight’: 0.05,
‘locref_stdev’: 7.2801,
‘log_dir’: ‘log’,
‘max_input_size’: 1500,
‘mean_pixel’: [123.68, 116.779, 103.939],
‘metadataset’: 'training-datasets\iteration-0\UnaugmentedDataSet_Full ’
‘ModelSep15\Documentation_data-Full Model_95shuffle1.pickle’,
‘min_input_size’: 64,
‘minsize’: 100,
‘mirror’: False,
‘multi_step’: [[0.005, 10000],
[0.02, 430000],
[0.002, 730000],
[0.001, 1030000]],
‘net_type’: ‘resnet_101’,
‘num_joints’: 11,
‘num_outputs’: 1,
‘optimizer’: ‘sgd’,
‘pairwise_huber_loss’: False,
‘pairwise_predict’: False,
‘partaffinityfield_predict’: False,
‘pos_dist_thresh’: 17,
‘project_path’: 'C:/Users/danie/Videos/Gait Analysis Project/Project/Full ’
‘Model-Daniel Pauser-2020-09-15’,
‘regularize’: False,
‘rightwidth’: 400,
‘save_iters’: 50000,
‘scale_jitter_lo’: 0.5,
‘scale_jitter_up’: 1.25,
‘scoremap_dir’: ‘test’,
‘shuffle’: True,
‘snapshot_prefix’: 'C:\Users\danie\Videos\Gait Analysis ’
'Project\Project\Full Model-Daniel ’
'Pauser-2020-09-15\dlc-models\iteration-0\Full ’
‘ModelSep15-trainset95shuffle1\train\snapshot’,
‘stride’: 8.0,
‘topheight’: 400,
‘weigh_negatives’: False,
‘weigh_only_present_joints’: False,
‘weigh_part_predictions’: False,
‘weight_decay’: 0.0001}
Selecting single-animal trainer
Switching batchsize to 1, as default/tensorpack/deterministic loaders do not support batches >1. Use imgaug loader.
Starting with standard pose-dataset loader.
Initializing ResNet
Loading ImageNet-pretrained resnet_101

2020-09-15 10:22:28.354859: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2020-09-15 10:22:29.942898: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: GeForce GTX 1050 major: 6 minor: 1 memoryClockRate(GHz): 1.493
pciBusID: 0000:01:00.0
totalMemory: 4.00GiB freeMemory: 3.30GiB
2020-09-15 10:22:29.945471: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-09-15 10:22:32.646027: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-15 10:22:32.648006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2020-09-15 10:22:32.648648: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2020-09-15 10:22:32.652025: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3011 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
0%| | 0/72 [00:00<?, ?it/s]C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\deeplabcut\utils\make_labeled_video.py:172: FutureWarning: circle is deprecated in favor of disk.circle will be removed in version 0.19
df_y[ind, index], df_x[ind, index], dotsize, shape=(ny, nx)
100%|##########| 72/72 [00:00<00:00, 367.23it/s]2020-09-15 10:56:42.449408: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-09-15 10:56:42.450107: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-15 10:56:42.450748: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2020-09-15 10:56:42.451208: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2020-09-15 10:56:42.451780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3011 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-09-15 11:00:07.490325: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.15GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-09-15 11:07:40.255557: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-09-15 11:07:40.255978: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-15 11:07:40.256416: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2020-09-15 11:07:40.256690: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2020-09-15 11:07:40.257029: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3011 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-09-15 11:08:03.281970: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.25GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-09-15 11:08:03.294719: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.26GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-09-15 11:08:03.387523: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.25GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-09-15 11:08:03.391215: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.26GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-09-15 11:08:03.483609: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.25GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-09-15 11:08:03.487245: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.25GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-09-15 11:08:04.963798: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.13GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-09-15 11:10:08.060852: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-09-15 11:10:08.061277: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-15 11:10:08.061721: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2020-09-15 11:10:08.061997: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2020-09-15 11:10:08.062333: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3011 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-09-15 11:17:07.280459: W tensorflow/core/kernels/queue_base.cc:277] _2_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2020-09-15 11:19:54.674670: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-09-15 11:19:54.675094: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-15 11:19:54.675530: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2020-09-15 11:19:54.675801: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2020-09-15 11:19:54.676142: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3011 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-09-15 11:22:35.084329: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-09-15 11:22:35.084791: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-15 11:22:35.085260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2020-09-15 11:22:35.085534: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2020-09-15 11:22:35.085868: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3011 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
Max_iters overwritten as 1000
Save_iters overwritten as 1000
Training parameter:
{‘stride’: 8.0, ‘weigh_part_predictions’: False, ‘weigh_negatives’: False, ‘fg_fraction’: 0.25, ‘mean_pixel’: [123.68, 116.779, 103.939], ‘shuffle’: True, ‘snapshot_prefix’: ‘C:\Users\danie\Videos\Gait Analysis Project\Project\Full Model-Daniel Pauser-2020-09-15\dlc-models\iteration-0\Full ModelSep15-trainset95shuffle1\train\snapshot’, ‘log_dir’: ‘log’, ‘global_scale’: 0.8, ‘location_refinement’: True, ‘locref_stdev’: 7.2801, ‘locref_loss_weight’: 0.05, ‘locref_huber_loss’: True, ‘optimizer’: ‘sgd’, ‘intermediate_supervision’: False, ‘intermediate_supervision_layer’: 12, ‘regularize’: False, ‘weight_decay’: 0.0001, ‘mirror’: False, ‘crop_pad’: 0, ‘scoremap_dir’: ‘test’, ‘batch_size’: 1, ‘dataset_type’: ‘default’, ‘deterministic’: False, ‘weigh_only_present_joints’: False, ‘pairwise_huber_loss’: False, ‘partaffinityfield_predict’: False, ‘pairwise_predict’: False, ‘crop’: True, ‘cropratio’: 0.4, ‘minsize’: 100, ‘leftwidth’: 400, ‘rightwidth’: 400, ‘topheight’: 400, ‘bottomheight’: 400, ‘dataset’: ‘training-datasets\iteration-0\UnaugmentedDataSet_Full ModelSep15\Full Model_Daniel Pauser95shuffle1.mat’, ‘num_joints’: 11, ‘all_joints’: [[0], [1], [2], [3], [4], [5], [6], [7], [8], [9], [10]], ‘all_joints_names’: [‘hip21’, ‘hip2’, ‘knee2’, ‘ankle2’, ‘heel2’, ‘toe2’, ‘shoulder2’, ‘elbow2’, ‘wrist2’, ‘chin’, ‘forehead’], ‘net_type’: ‘resnet_101’, ‘init_weights’: ‘C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\deeplabcut\pose_estimation_tensorflow\models\pretrained\resnet_v1_101.ckpt’, ‘num_outputs’: 1, ‘display_iters’: 1000, ‘max_input_size’: 1500, ‘metadataset’: ‘training-datasets\iteration-0\UnaugmentedDataSet_Full ModelSep15\Documentation_data-Full Model_95shuffle1.pickle’, ‘min_input_size’: 64, ‘multi_step’: [[0.005, 10000], [0.02, 430000], [0.002, 730000], [0.001, 1030000]], ‘pos_dist_thresh’: 17, ‘project_path’: ‘C:/Users/danie/Videos/Gait Analysis Project/Project/Full Model-Daniel Pauser-2020-09-15’, ‘save_iters’: 50000, ‘scale_jitter_lo’: 0.5, ‘scale_jitter_up’: 1.25}
Starting training…
iteration: 1000 loss: 0.0235 lr: 0.005
The network is now trained and ready to evaluate. Use the function ‘evaluate_network’ to evaluate the network.
Exception in thread Thread-35:
Traceback (most recent call last):
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tensorflow\python\client\session.py”, line 1334, in _do_call
return fn(*args)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tensorflow\python\client\session.py”, line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tensorflow\python\client\session.py”, line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.CancelledError: Enqueue operation was cancelled
[[{{node fifo_queue_enqueue}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\threading.py”, line 926, in _bootstrap_inner
self.run()
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\threading.py”, line 870, in run
self._target(*self._args, **self._kwargs)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\deeplabcut\pose_estimation_tensorflow\train.py”, line 91, in load_and_enqueue
sess.run(enqueue_op, feed_dict=food)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tensorflow\python\client\session.py”, line 929, in run
run_metadata_ptr)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tensorflow\python\client\session.py”, line 1152, in _run
feed_dict_tensor, options, run_metadata)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tensorflow\python\client\session.py”, line 1328, in _do_run
run_metadata)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tensorflow\python\client\session.py”, line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.CancelledError: Enqueue operation was cancelled
[[node fifo_queue_enqueue (defined at C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\deeplabcut\pose_estimation_tensorflow\train.py:77) ]]

Caused by op ‘fifo_queue_enqueue’, defined at:
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\runpy.py”, line 193, in _run_module_as_main
"__main__", mod_spec)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\runpy.py”, line 85, in _run_code
exec(code, run_globals)
File "C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\spyder_kernels\console_main
.py", line 23, in
start.main()
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\spyder_kernels\console\start.py”, line 332, in main
kernel.start()
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\ipykernel\kernelapp.py”, line 612, in start
self.io_loop.start()
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tornado\platform\asyncio.py”, line 149, in start
self.asyncio_loop.run_forever()
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\asyncio\base_events.py”, line 541, in run_forever
self._run_once()
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\asyncio\base_events.py”, line 1786, in _run_once
handle._run()
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\asyncio\events.py”, line 88, in _run
self._context.run(self._callback, *self._args)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tornado\ioloop.py”, line 690, in <lambda>
lambda f: self._run_callback(functools.partial(callback, future))
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tornado\ioloop.py”, line 743, in _run_callback
ret = callback()
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tornado\gen.py”, line 787, in inner
self.run()
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tornado\gen.py”, line 748, in run
yielded = self.gen.send(value)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\ipykernel\kernelbase.py”, line 365, in process_one
yield gen.maybe_future(dispatch(*args))
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tornado\gen.py”, line 209, in wrapper
yielded = next(result)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\ipykernel\kernelbase.py”, line 268, in dispatch_shell
yield gen.maybe_future(handler(stream, idents, msg))
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tornado\gen.py”, line 209, in wrapper
yielded = next(result)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\ipykernel\kernelbase.py”, line 545, in execute_request
user_expressions, allow_stdin,
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tornado\gen.py”, line 209, in wrapper
yielded = next(result)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\ipykernel\ipkernel.py”, line 306, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\ipykernel\zmqshell.py”, line 536, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File “C:\Users\danie\AppData\Roaming\Python\Python37\site-packages\IPython\core\interactiveshell.py”, line 2877, in run_cell
raw_cell, store_history, silent, shell_futures)
File “C:\Users\danie\AppData\Roaming\Python\Python37\site-packages\IPython\core\interactiveshell.py”, line 2922, in _run_cell
return runner(coro)
File “C:\Users\danie\AppData\Roaming\Python\Python37\site-packages\IPython\core\async_helpers.py”, line 68, in pseudo_sync_runner
coro.send(None)
File “C:\Users\danie\AppData\Roaming\Python\Python37\site-packages\IPython\core\interactiveshell.py”, line 3146, in run_cell_async
interactivity=interactivity, compiler=compiler, result=result)
File “C:\Users\danie\AppData\Roaming\Python\Python37\site-packages\IPython\core\interactiveshell.py”, line 3337, in run_ast_nodes
if (await self.run_code(code, result, async_=asy)):
File “C:\Users\danie\AppData\Roaming\Python\Python37\site-packages\IPython\core\interactiveshell.py”, line 3417, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File “”, line 1, in
deeplabcut.train_network(initial_config_path, maxiters=1000, saveiters=1000)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\deeplabcut\pose_estimation_tensorflow\training.py”, line 189, in train_network
allow_growth=allow_growth,
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\deeplabcut\pose_estimation_tensorflow\train.py”, line 171, in train
batch, enqueue_op, placeholders = setup_preloading(batch_spec)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\deeplabcut\pose_estimation_tensorflow\train.py”, line 77, in setup_preloading
enqueue_op = q.enqueue(placeholders_list)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tensorflow\python\ops\data_flow_ops.py”, line 345, in enqueue
self._queue_ref, vals, name=scope)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tensorflow\python\ops\gen_data_flow_ops.py”, line 4158, in queue_enqueue_v2
timeout_ms=timeout_ms, name=name)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tensorflow\python\framework\op_def_library.py”, line 788, in _apply_op_helper
op_def=op_def)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tensorflow\python\util\deprecation.py”, line 507, in new_func
return func(*args, **kwargs)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tensorflow\python\framework\ops.py”, line 3300, in create_op
op_def=op_def)
File “C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\tensorflow\python\framework\ops.py”, line 1801, in __init__
self._traceback = tf_stack.extract_stack()

CancelledError (see above for traceback): Enqueue operation was cancelled
[[node fifo_queue_enqueue (defined at C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\deeplabcut\pose_estimation_tensorflow\train.py:77) ]]

Hi, welcome! The issue is that the pretrained network doesn’t have your body parts in the output head, so you are getting errors about this mismatch.

@AlexanderMathis do we have a simple guide somewhere on modifying this correctly?

‘init_weights’: ‘C:\Users\danie\anaconda3\envs\DLC-GPU1\lib\site-packages\deeplabcut\pose_estimation_tensorflow\models\pretrained\resnet_v1_101.ckpt’,
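
If you want to see the mismatch directly, a rough sketch along these lines lists what a checkpoint actually stores (this assumes the TF 1.x backend these DLC versions use; the snapshot path and the "pred" filter are only illustrative). The part-prediction/deconvolution layers have the number of bodyparts as their last dimension, so a snapshot trained on the full_human bodyparts cannot be restored into an 11-part project as-is:

import tensorflow as tf

# Illustrative only: point this at the snapshot you are restoring from
snapshot = 'path/to/dlc-models/iteration-0/.../train/snapshot-XXXXX'
for name, shape in tf.train.list_variables(snapshot):
    # the head layers typically contain "pred" in their name; their last
    # dimension is the number of bodyparts the snapshot was trained with
    if 'pred' in name:
        print(name, shape)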

Thank you very much for your prompt reply! I see, at least it seems to be something fixable, so that is great, though I am not sure how :smiley:
A guide would of course be perfect and greatly appreciated!

@MWMathis, thank you again for your reply. Any solution would really be greatly appreciated. Is there anything I can already try and do?

Well, the easiest (albeit suboptimal) way is to:

deeplabcut.train_network(initial_config_path, maxiters=1000, saveiters=1000, keepdeconvweights=False)
keepdeconvweights: bool, default: true
    Also restores the weights of the deconvolution layers (and the backbone) when training from a snapshot. Note that if you change the number of bodyparts, you need to
    set this to false for re-training.

This is suboptimal, as you lose the trained deconvolution weights for the bodyparts that remain, but a method for keeping those selectively is currently not available.

@AlexanderMathis, thank you very much for your reply and the solution.
Just to clarify, in this scenario I would be losing the full_human weights for the other bodyparts?
Thank you again for your help, I really appreciate it!

You only "lose" the last layer – the ResNet keeps its adapted weights for MPII-Pose!
