Dropout uses the *memory address* of seeds instead of reading seeds from memory #173

Closed
tridao opened this issue Jan 4, 2022 · 3 comments · Fixed by #175

tridao commented Jan 4, 2022

🐛 Bug

From reading the code for k_dropout_fw and k_dropout_bw, it seems to me that the seeds are never read from memory and the code simply uses the memory address of the seed.
For example:

    rand1, rand2, rand3, rand4 = tl.randint4x(seed.to(tl.int32), rand_offsets)

Here seed is still a memory address and not an integer.

As a result, when k_dropout_fw is passed in two identical seed tensors with different memory addresses, it produces different results.
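For illustration, here is a minimal sketch of the suspected pattern and of the expected fix. This is not the actual xformers kernel; the kernel and argument names (k_rand_sketch, OUT, SEEDS, BLOCK_SIZE) are made up for the example.

import triton
import triton.language as tl

@triton.jit
def k_rand_sketch(OUT, SEEDS, BLOCK_SIZE: tl.constexpr):
    row = tl.program_id(axis=0)
    rand_offsets = row * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)

    # Buggy pattern: SEEDS is a *pointer*, so casting it to int32 feeds the
    # memory address of the seed tensor into the RNG, not the seed value:
    # rand1, rand2, rand3, rand4 = tl.randint4x(SEEDS.to(tl.int32), rand_offsets)

    # Expected pattern: dereference the pointer first, then seed the RNG with the value.
    seed = tl.load(SEEDS + row)
    rand1, rand2, rand3, rand4 = tl.randint4x(seed.to(tl.int32), rand_offsets)

    # Store one of the random streams just so the sketch does something observable.
    tl.store(OUT + rand_offsets, rand1)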

To Reproduce

Setting the PyTorch seed should produce the same seed used in dropout, and hence the same dropout mask.
However, that's not the case:

import torch
from xformers.triton.dropout import dropout

x = torch.randn(3, 5, device='cuda')
print(x)

torch.manual_seed(0)
torch.cuda.manual_seed(0)
print(dropout(x, 0.5))

torch.manual_seed(0)
torch.cuda.manual_seed(0)
print(dropout(x, 0.5))

tensor([[ 0.4821, -1.6949,  0.8196,  1.9093, -1.0018],
        [ 0.4030, -1.5175, -0.3187, -0.0959,  2.7204],
        [ 1.0645, -0.1254,  0.3978, -2.9882,  0.2232]], device='cuda:0')

tensor([[ 0.9642,  0.0000,  0.0000,  0.0000,  0.0000],
        [ 0.8059, -3.0350,  0.0000, -0.1918,  5.4409],
        [ 0.0000, -0.2507,  0.7955,  0.0000,  0.0000]], device='cuda:0')

tensor([[ 0.9642, -3.3897,  0.0000,  3.8186, -2.0037],
        [ 0.0000,  0.0000, -0.6374,  0.0000,  5.4409],
        [ 2.1290, -0.2507,  0.7955,  0.0000,  0.4464]], device='cuda:0')

Environment

  • PyTorch Version (e.g., 1.0): 1.10.1
  • OS (e.g., Linux): Ubuntu 20.04
  • How you installed PyTorch (conda, pip, source): conda
  • Build command you used (if compiling from source): pip install -e . (from master)
  • Python version: 3.8
  • CUDA/cuDNN version: 11.3
  • GPU models and configuration: V100

@blefaudeux (Contributor) commented:

oh just seeing that, sorry for the delay!

@blefaudeux (Contributor) commented:

I thought that I had a unit test for that, but actually that was not the case. You're right, thanks for the report @tridao, it should be an easy fix
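For reference, the kind of reproducibility test that was missing could look like this rough sketch, assuming a CUDA device and the public xformers.triton.dropout API; the test name is made up.

import torch
from xformers.triton.dropout import dropout

def test_dropout_reproducible_with_same_seed():
    x = torch.randn(3, 5, device="cuda")

    torch.manual_seed(0)
    torch.cuda.manual_seed(0)
    out1 = dropout(x, 0.5)

    torch.manual_seed(0)
    torch.cuda.manual_seed(0)
    out2 = dropout(x, 0.5)

    # Identical seeds should yield identical dropout masks, hence identical outputs.
    assert torch.equal(out1, out2)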

blefaudeux added a commit that referenced this issue Jan 7, 2022

@blefaudeux (Contributor) commented:

should be fixed, and sanity-checking the speed, it's not really changed
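For context, a speed sanity check along these lines could be used; this is a rough sketch with made-up sizes and iteration counts, not the benchmark actually run here.

import torch
from xformers.triton.dropout import dropout

def time_dropout(x, p=0.5, iters=100):
    # Warm up so Triton JIT compilation is excluded from the measurement.
    for _ in range(10):
        dropout(x, p)
    torch.cuda.synchronize()

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        dropout(x, p)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters  # milliseconds per call

x = torch.randn(4096, 4096, device="cuda")
print(f"dropout: {time_dropout(x):.3f} ms / call")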

blefaudeux added a commit that referenced this issue Jan 7, 2022
xwhan pushed a commit to xwhan/xformers that referenced this issue Feb 8, 2022