I'm trying to train srresnet-mse on my own dataset, and I sometimes get an error. The first time it occurred between iterations 0 and 100, the next time between 100 and 200, and another time after iteration 600. My dataset contains roughly one hundred thousand images, so I suspect the data itself is the cause. Can you help me understand what the problem is?
/opt/ds/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Logging results for this session in folder "results/srresnet-mse".
2018-09-03 12:22:04.150374: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-09-03 12:22:07.392768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:c1:00.0
totalMemory: 10.92GiB freeMemory: 10.76GiB
2018-09-03 12:22:07.392867: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-09-03 12:22:07.811512: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10415 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:c1:00.0, compute capability: 6.1)
[0] Test: 0.4038988, Train: 0.5311046 [Set5] PSNR: 11.46, SSIM: 0.1051 [Set14] PSNR: 12.50, SSIM: 0.0841 [BSD100] PSNR: 13.13, SSIM: 0.1036
[100] Test: 0.2326521, Train: 0.3028869 [Set5] PSNR: 13.47, SSIM: 0.4380 [Set14] PSNR: 14.62, SSIM: 0.4203 [BSD100] PSNR: 15.39, SSIM: 0.4153
2018-09-03 12:23:46.013589: W tensorflow/core/kernels/queue_base.cc:277] _0_input_producer: Skipping cancelled enqueue attempt with queue not closed
2018-09-03 12:23:46.026015: W tensorflow/core/kernels/queue_base.cc:277] _2_input_producer_1: Skipping cancelled enqueue attempt with queue not closed
2018-09-03 12:23:46.026804: W tensorflow/core/kernels/queue_base.cc:277] _5_batch_2/fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2018-09-03 12:23:46.027426: W tensorflow/core/kernels/queue_base.cc:277] _3_batch_1/fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2018-09-03 12:23:46.027743: W tensorflow/core/kernels/queue_base.cc:277] _4_input_producer_2: Skipping cancelled enqueue attempt with queue not closed
Traceback (most recent call last):
File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1361, in _do_call
return fn(*args)
File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1340, in _run_fn
target_list, status, run_metadata)
File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 14, current size 0)
[[Node: batch = QueueDequeueManyV2[component_types=[DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 134, in <module>
main()
File "train.py", line 121, in main
batch_hr = sess.run(get_train_batch)
File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 905, in run
run_metadata_ptr)
File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1137, in _run
feed_dict_tensor, options, run_metadata)
File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1355, in _do_run
options, run_metadata)
File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1374, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 14, current size 0)
[[Node: batch = QueueDequeueManyV2[component_types=[DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]
Caused by op 'batch', defined at:
File "train.py", line 134, in <module>
main()
File "train.py", line 68, in main
get_train_batch, get_val_batch, get_eval_batch = build_inputs(args, sess)
File "/home/ds/ykochnev/SRGAN-orig/utilities.py", line 55, in build_inputs
get_train_batch = build_input_pipeline(train_filenames, batch_size=args.batch_size, img_size=args.image_size, random_crop=True)
File "/home/ds/ykochnev/SRGAN-orig/utilities.py", line 36, in build_input_pipeline
image_batch = tf.train.batch([image], batch_size=batch_size, num_threads=num_threads, capacity=10 * batch_size)
File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 989, in batch
name=name)
File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 763, in _batch
dequeued = queue.dequeue_many(batch_size, name=name)
File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/data_flow_ops.py", line 483, in dequeue_many
self._queue_ref, n=n, component_types=self._dtypes, name=name)
File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 2430, in _queue_dequeue_many_v2
component_types=component_types, timeout_ms=timeout_ms, name=name)
File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3271, in create_op
op_def=op_def)
File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1650, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
OutOfRangeError (see above for traceback): FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 14, current size 0)
[[Node: batch = QueueDequeueManyV2[component_types=[DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]
This error happened to me when there was a corrupted or invalid image in the dataset. I would write a simple script that iterates over your dataset and tries to load every image, to find the file(s) that fail to decode (see the sketch below).
I haven't found a way to make TensorFlow simply ignore images that fail to load - if anyone knows one, that would be great.
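A minimal sketch of such a checking script, assuming the dataset is a flat directory of image files and Pillow is installed; the script name, directory path, and extension list are placeholders to adapt to your setup.

```python
# check_images.py - scan a directory for images that fail to decode.
# Assumes a flat folder of image files and that Pillow is installed.
import os
import sys

from PIL import Image


def find_bad_images(data_dir, extensions=(".png", ".jpg", ".jpeg", ".bmp")):
    bad = []
    for name in sorted(os.listdir(data_dir)):
        if not name.lower().endswith(extensions):
            continue
        path = os.path.join(data_dir, name)
        try:
            with Image.open(path) as img:
                img.verify()   # quick header/consistency check
            with Image.open(path) as img:
                img.load()     # force a full decode; catches truncated files
        except Exception as exc:
            print("BAD: %s (%s)" % (path, exc))
            bad.append(path)
    return bad


if __name__ == "__main__":
    data_dir = sys.argv[1] if len(sys.argv) > 1 else "data/train"
    bad = find_bad_images(data_dir)
    print("%d unreadable image(s) found" % len(bad))
```

One caveat: Pillow and TensorFlow's image decoders can disagree on edge cases, so if this finds nothing you may want to repeat the check using the same decode op your input pipeline uses.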