Dropout uses the *memory address* of seeds instead of reading seeds from memory #173

Closed
tridao opened this issue Jan 4, 2022 · 3 comments · Fixed by #175

tridao commented Jan 4, 2022

🐛 Bug

From reading the code for k_dropout_fw and k_dropout_bw, it seems to me that the seeds are never read from memory and the code simply uses the memory address of the seed.
For example:

    rand1, rand2, rand3, rand4 = tl.randint4x(seed.to(tl.int32), rand_offsets)

Here seed is still a memory address and not an integer.

As a result, when k_dropout_fw is passed in two identical seed tensors with different memory addresses, it produces different results.
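For illustration, here is a minimal sketch of the suspected pattern and of the expected fix. This is not the actual xformers kernel; the kernel and argument names (k_rand_sketch, OUT, SEEDS, BLOCK_SIZE) are made up for the example.

import triton
import triton.language as tl

@triton.jit
def k_rand_sketch(OUT, SEEDS, BLOCK_SIZE: tl.constexpr):
    row = tl.program_id(axis=0)
    rand_offsets = row * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)

    # Buggy pattern: SEEDS is a *pointer*, so casting it to int32 feeds the
    # memory address of the seed tensor into the RNG, not the seed value:
    # rand1, rand2, rand3, rand4 = tl.randint4x(SEEDS.to(tl.int32), rand_offsets)

    # Expected pattern: dereference the pointer first, then seed the RNG with the value.
    seed = tl.load(SEEDS + row)
    rand1, rand2, rand3, rand4 = tl.randint4x(seed.to(tl.int32), rand_offsets)

    # Store one of the random streams just so the sketch does something observable.
    tl.store(OUT + rand_offsets, rand1)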

To Reproduce

Setting the PyTorch seed should produce the same seed used in dropout, and hence the same dropout mask.
However, that's not the case:

import torch
from xformers.triton.dropout import dropout

x = torch.randn(3, 5, device='cuda')
print(x)

torch.manual_seed(0)
torch.cuda.manual_seed(0)
print(dropout(x, 0.5))

torch.manual_seed(0)
torch.cuda.manual_seed(0)
print(dropout(x, 0.5))

tensor([[ 0.4821, -1.6949,  0.8196,  1.9093, -1.0018],
        [ 0.4030, -1.5175, -0.3187, -0.0959,  2.7204],
        [ 1.0645, -0.1254,  0.3978, -2.9882,  0.2232]], device='cuda:0')

tensor([[ 0.9642,  0.0000,  0.0000,  0.0000,  0.0000],
        [ 0.8059, -3.0350,  0.0000, -0.1918,  5.4409],
        [ 0.0000, -0.2507,  0.7955,  0.0000,  0.0000]], device='cuda:0')

tensor([[ 0.9642, -3.3897,  0.0000,  3.8186, -2.0037],
        [ 0.0000,  0.0000, -0.6374,  0.0000,  5.4409],
        [ 2.1290, -0.2507,  0.7955,  0.0000,  0.4464]], device='cuda:0')

Environment

  • PyTorch Version (e.g., 1.0): 1.10.1
  • OS (e.g., Linux): Ubuntu 20.04
  • How you installed PyTorch (conda, pip, source): conda
  • Build command you used (if compiling from source): pip install -e . (from master)
  • Python version: 3.8
  • CUDA/cuDNN version: 11.3
  • GPU models and configuration: V100

@blefaudeux (Contributor) commented:

oh just seeing that, sorry for the delay!

@blefaudeux (Contributor) commented:

I thought that I had a unit test for that, but actually that was not the case. You're right, thanks for the report @tridao, it should be an easy fix
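For reference, the kind of reproducibility test that was missing could look like this rough sketch, assuming a CUDA device and the public xformers.triton.dropout API; the test name is made up.

import torch
from xformers.triton.dropout import dropout

def test_dropout_reproducible_with_same_seed():
    x = torch.randn(3, 5, device="cuda")

    torch.manual_seed(0)
    torch.cuda.manual_seed(0)
    out1 = dropout(x, 0.5)

    torch.manual_seed(0)
    torch.cuda.manual_seed(0)
    out2 = dropout(x, 0.5)

    # Identical seeds should yield identical dropout masks, hence identical outputs.
    assert torch.equal(out1, out2)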

blefaudeux added a commit that referenced this issue Jan 7, 2022

@blefaudeux (Contributor) commented:

should be fixed, and sanity-checking the speed, it's not really changed
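For context, a speed sanity check along these lines could be used; this is a rough sketch with made-up sizes and iteration counts, not the benchmark actually run here.

import torch
from xformers.triton.dropout import dropout

def time_dropout(x, p=0.5, iters=100):
    # Warm up so Triton JIT compilation is excluded from the measurement.
    for _ in range(10):
        dropout(x, p)
    torch.cuda.synchronize()

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        dropout(x, p)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters  # milliseconds per call

x = torch.randn(4096, 4096, device="cuda")
print(f"dropout: {time_dropout(x):.3f} ms / call")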

blefaudeux added a commit that referenced this issue Jan 7, 2022
xwhan pushed a commit to xwhan/xformers that referenced this issue Feb 8, 2022