
Tensor Size Mismatch During Training #13

Open
odhinnsrunes opened this issue Feb 14, 2019 · 3 comments

Comments

@odhinnsrunes

While training, at a seemingly random point, it fails with this error (with both lstm and gru):

Traceback (most recent call last):
  File "train.py", line 98, in <module>
    loss = train(*random_training_set(args.chunk_len, args.batch_size))
  File "train.py", line 43, in random_training_set
    inp[bi] = char_tensor(chunk[:-1])
RuntimeError: The expanded size of the tensor (200) must match the existing size (199) at non-singleton dimension 0.  Target sizes: [200].  Tensor sizes: [199]

I've attached the data set I've been using.

surnames.txt

@integraloftheday

I've been having the same issue. However, I only encounter it when I have a large epoch number — by large I mean 100,000 or so. What epoch number did you set in the arguments?

@odhinnsrunes
Author

I didn't set one, I just used the defaults. It seems to happen somewhat randomly: sometimes it gets a few epochs in, other times it crashes almost immediately.

@ShaneTsui

ShaneTsui commented Feb 24, 2019

I ran into the same problem, and it's indeed caused by this line in train.py:
start_index = random.randint(0, file_len - chunk_len)

Subtracting 1 from the right boundary should fix the problem:
start_index = random.randint(0, file_len - chunk_len - 1)

This is because for randint(a, b), the right boundary b is included. So if the sampler happens to select b, the end index goes past the end of the file, which triggers the error.

To be specific, for the following 2 lines,

end_index = start_index + chunk_len + 1
chunk = file[start_index:end_index]

the slicing operation wouldn't raise an out-of-bounds error; instead, Python silently clamps end_index to file_len, so chunk ends up with only chunk_len characters instead of chunk_len + 1, and chunk[:-1] has chunk_len - 1 — hence the size mismatch (199 vs. 200).
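The off-by-one can be reproduced in isolation. A minimal sketch, following the variable names from the train.py snippet above (the file_len and chunk_len values here are illustrative, not from the repo):

```python
import random

file_len, chunk_len = 1000, 200
file = "x" * file_len  # stand-in for the training text

# Buggy bound: randint's upper bound is inclusive, so start_index can be
# exactly file_len - chunk_len, pushing end_index one past the end of the file.
start_index = file_len - chunk_len       # worst case of randint(0, file_len - chunk_len)
end_index = start_index + chunk_len + 1  # 1001, out of bounds
chunk = file[start_index:end_index]      # slicing silently clamps to file_len
assert len(chunk) == chunk_len           # 200, not the expected 201
assert len(chunk[:-1]) == chunk_len - 1  # 199 -> the reported size mismatch

# Fixed bound: excluding file_len - chunk_len keeps end_index <= file_len.
start_index = random.randint(0, file_len - chunk_len - 1)
end_index = start_index + chunk_len + 1
assert len(file[start_index:end_index]) == chunk_len + 1  # always full length
```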

olivatooo added a commit to olivatooo/char-rnn.pytorch that referenced this issue Jan 6, 2020