Replies: 5 comments
>>> abdullah.tayyab
[December 15, 2020, 1:11am]
Hi there,
I also posted on the "Testing for correctness of the samples" topic, but I think my issue warrants a separate topic so I can provide the full context of what I am trying to achieve.
I am trying to create an ASR system for Urdu (a language native to Pakistan) using the Ubuntu 16.04 Deep Learning AMI from AWS. I have installed DeepSpeech (0.9.2) in a virtualenv as described in the documentation. I have been testing various configurations against a single data source, and the train/test/dev files have worked fine with that one source. I have now assembled several additional data sources with decent transcriptions to expand the data set.
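Each of those train/test/dev files uses the standard DeepSpeech CSV layout with wav_filename, wav_filesize and transcript columns. As a minimal sketch of what writing such a file looks like (the paths, sizes and transcripts below are made-up placeholders, not rows from the real data):

import csv

# Hypothetical example rows; real rows point at existing 16 kHz mono .wav files,
# with wav_filesize given in bytes and transcript holding the Urdu text.
rows = [
    ("/home/ubuntu/Uploads/clips/sample_0001.wav", 123456, "placeholder transcript one"),
    ("/home/ubuntu/Uploads/clips/sample_0002.wav", 234567, "placeholder transcript two"),
]

with open("trainbusiness.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["wav_filename", "wav_filesize", "transcript"])
    writer.writerows(rows)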
I have generated a separate scorer file for Urdu, as pointed out in multiple topics on Discourse. The command I am using to run the training is:
python3 DeepSpeech.py \
  --drop_source_layers 1 \
  --alphabet_config_path /$HOME/Uploads/UrduAlphabet_newscrawl2.txt \
  --checkpoint_dir /$HOME/DeepSpeech/dataset/trained_load_checkpoint \
  --train_files /$HOME/Uploads/trainbusiness.csv \
  --dev_files /$HOME/Uploads/devbusiness.csv \
  --test_files /$HOME/Uploads/testbusiness.csv \
  --epochs 2 \
  --train_batch_size 32 \
  --export_dir /$HOME/DeepSpeech/dataset/urdu_trained \
  --export_file_name urdu \
  --test_batch_size 12 \
  --learning_rate 0.00001 \
  --reduce_lr_on_plateau true \
  --scorer /$HOME/Uploads/kenlm.scorer
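For reference, a rough sketch of how one could check whether every character in the transcripts is covered by the alphabet file, assuming the paths from the command above and the usual one-character-per-line alphabet format (this helper is only an illustration, not part of DeepSpeech):

import csv

alphabet_path = "/home/ubuntu/Uploads/UrduAlphabet_newscrawl2.txt"
csv_path = "/home/ubuntu/Uploads/trainbusiness.csv"

# One character per line; lines starting with '#' are treated as comments here,
# matching the stock alphabet.txt files shipped with DeepSpeech.
with open(alphabet_path, encoding="utf-8") as f:
    alphabet = {line.rstrip("\n") for line in f if not line.startswith("#")}

missing = set()
with open(csv_path, encoding="utf-8") as f:
    for row in csv.DictReader(f):
        for ch in row["transcript"]:
            if ch not in alphabet:
                missing.add(ch)

print("characters not covered by the alphabet:", sorted(missing))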
Here comes the interesting part: this works perfectly when I execute the command separately for each of the different data sources. I have been generating training, test, and dev files for every data source, and there were no issues when I used those files on their own. I get the exception below when I try to combine the csv files and run the whole data set together. Obviously, I want to do that so that there is more data and I can train on the whole data set for a higher number of epochs.
I Loading best validating checkpoint from //home/ubuntu/DeepSpeech/dataset/trained_load_checkpoint/best_dev-150
I Loading variable from checkpoint: beta1_power
I Loading variable from checkpoint: beta2_power
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias/Adam
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias/Adam_1
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel/Adam
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel/Adam_1
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/bias/Adam
I Loading variable from checkpoint: layer_1/bias/Adam_1
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_1/weights/Adam
I Loading variable from checkpoint: layer_1/weights/Adam_1
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/bias/Adam
I Loading variable from checkpoint: layer_2/bias/Adam_1
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_2/weights/Adam
I Loading variable from checkpoint: layer_2/weights/Adam_1
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/bias/Adam
I Loading variable from checkpoint: layer_3/bias/Adam_1
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_3/weights/Adam
I Loading variable from checkpoint: layer_3/weights/Adam_1
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/bias/Adam
I Loading variable from checkpoint: layer_5/bias/Adam_1
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_5/weights/Adam
I Loading variable from checkpoint: layer_5/weights/Adam_1
I Loading variable from checkpoint: learning_rate
I Initializing variable: layer_6/bias
I Initializing variable: layer_6/bias/Adam
I Initializing variable: layer_6/bias/Adam_1
I Initializing variable: layer_6/weights
I Initializing variable: layer_6/weights/Adam
I Initializing variable: layer_6/weights/Adam_1
I STARTING Optimization
Epoch 0 | Training | Elapsed Time: 0:00:04 | Steps: 1 | Loss: 15.989467
Traceback (most recent call last):
File 'DeepSpeech.py', line 12, in
ds_train.run_script()
File '/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py', line 976, in run_script
absl.app.run(main)
File '/home/ubuntu/tmp/deepspeech-venv/lib/python3.7/site-packages/absl/app.py', line 303, in run
_run_main(main, args)
File '/home/ubuntu/tmp/deepspeech-venv/lib/python3.7/site-packages/absl/app.py', line 251, in _run_main
sys.exit(main(argv))
File '/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py', line 948, in main
train()
File '/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py', line 605, in train
train_loss, _ = run_set('train', epoch, train_init_op)
File '/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py', line 571, in run_set
exception_box.raise_if_set()
File '/home/ubuntu/DeepSpeech/training/deepspeech_training/util/helpers.py', line 123, in raise_if_set
raise exception # pylint: disable = raising-bad-type
File '/home/ubuntu/DeepSpeech/training/deepspeech_training/util/helpers.py', line 131, in do_iterate
yield from iterable()
File '/home/ubuntu/DeepSpeech/training/deepspeech_training/util/feeding.py', line 114, in generate_values
for sample_index, sample in enumerate(samples):
File '/home/ubuntu/DeepSpeech/training/deepspeech_training/util/augmentations.py', line 221, in apply_sample_augmentations
yield from pool.imap(_augment_sample, timed_samples())
File '/home/ubuntu/DeepSpeech/training/deepspeech_training/util/helpers.py', line 102, in imap
for obj in self.pool.imap(fun, self._limit(it)):
File '/home/ubuntu/anaconda3/lib/python3.7/multiprocessing/pool.py', line 748, in next
raise value
EOFError
This is the exception I get when, after having run the same command successfully on two data sets, I try to improve the training by adding one more data set. All .wav files have been converted to mono and 16 kHz.
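A quick way to double-check that is to scan the clips and flag anything that is not 16 kHz mono; a minimal sketch along those lines, with the clips directory as a placeholder path:

import wave
from pathlib import Path

# Placeholder directory; point this at wherever the converted clips live
clips_dir = Path("/home/ubuntu/Uploads/clips")

for wav_path in sorted(clips_dir.glob("*.wav")):
    with wave.open(str(wav_path), "rb") as w:
        if w.getnchannels() != 1 or w.getframerate() != 16000:
            print(wav_path, "-", w.getnchannels(), "channel(s),", w.getframerate(), "Hz")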
I have also used the csv_combiner from
https://github.com/dabinat/deepspeech-tools thinking that my code
wasn't combining them correctly.
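The merge in question is essentially a concatenation of per-source CSVs into one file; a minimal sketch of that, assuming pandas is available and using placeholder file names for the per-source CSVs (the real files are the ones attached below):

import pandas as pd

# Placeholder per-source training CSVs; dev and test files would be merged the same way
sources = ["trainbusiness.csv", "trainnews.csv", "trainsports.csv"]

combined = pd.concat([pd.read_csv(p) for p in sources], ignore_index=True)

# Drop exact duplicates and rows with an empty transcript before writing the merged file
combined = combined.drop_duplicates()
combined = combined[combined["transcript"].astype(str).str.strip() != ""]
combined.to_csv("traincombined.csv", index=False)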
I have shared the csv files for both the combined run and the separate runs here:
combinedrun.zip (100.4 KB)
separaterun.zip (86.1 KB)
Can someone please point me in the right direction?
Thank you!
[This is an archived discussion thread from discourse.mozilla.org/t/eoferror-when-training-multiple-files]