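This code is working (a minimal sketch along the lines of the repository README; the exact configuration values here are assumptions, not a verbatim copy):

import torch
from xlstm import (
    xLSTMBlockStack,
    xLSTMBlockStackConfig,
    mLSTMBlockConfig,
    mLSTMLayerConfig,
)

# A stack of mLSTM blocks, configured roughly as in the README example.
cfg = xLSTMBlockStackConfig(
    mlstm_block=mLSTMBlockConfig(
        mlstm=mLSTMLayerConfig(
            conv1d_kernel_size=4, qkv_proj_blocksize=4, num_heads=4
        )
    ),
    context_length=256,
    num_blocks=7,
    embedding_dim=128,
)

xlstm_stack = xLSTMBlockStack(cfg).to("cuda")

x = torch.randn(4, 256, 128).to("cuda")  # shape (4, 256, 128), as in the README
y = xlstm_stack(x)                       # output keeps the input shape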
But the network continuously reports an error if you try to add a batch dimension to the input, e.g.:
x = torch.randn(32, 4, 256, 128).to("cuda") # (where 32 is the batch size)
You get the following error:
File "/home/carlosgomezh/.local/lib/python3.10/site-packages/xlstm/blocks/mlstm/layer.py", line 102, in forward
B, S, _ = x.shape
ValueError: too many values to unpack (expected 3)
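The failing line unpacks exactly three dimensions, so the block stack expects input shaped (batch, sequence, embedding); a minimal reproduction of the unpack failure (shapes are illustrative):

import torch

x = torch.randn(4, 256, 128)      # (batch, sequence, embedding)
B, S, _ = x.shape                 # unpacks fine

x = torch.randn(32, 4, 256, 128)  # extra leading dimension
B, S, _ = x.shape                 # ValueError: too many values to unpack (expected 3)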
In your example, the backbone processes a single tensor.
Is it possible to process something like this:
import torch
# NOTE: xLSTM here is assumed to be a user-defined wrapper module
# (its definition is not shown); it is not a class exported by the
# xlstm package itself.

if __name__ == "__main__":
    # Define model hyperparameters
    input_dim = 6
    hidden_dim = 128
    output_dim = 1
    num_layers = 2
    context_length = 10

    # Instantiate the model
    model = xLSTM(input_dim, hidden_dim, output_dim, num_layers, context_length).to('cuda')

    # Print the model structure
    print(model)

    # Example dummy input (batch_size=32, sequence_length=10, input_dim=6)
    dummy_input = torch.randn(32, context_length, input_dim).to('cuda')

    # Forward pass through the model
    output = model(dummy_input)
    print(output.shape)
where there are 6 input features, the hidden dimension of the network is 128 (for example), the output dimension is 1, and the context length is 10? Here, 32 is the batch size.
If I run that code, I get the following error:
File "/home/carlosgomezh/.local/lib/python3.10/site-packages/torch/nn/functional.py", line 2573, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: Given normalized_shape=[128], expected input with shape [*, 128], but got input of size[32, 10, 6]
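I assume the raw 6-feature input is hitting a LayerNorm that expects the stack's embedding dimension of 128, so one workaround might be a learned input projection before the stack (a sketch, assuming a plain nn.Linear is acceptable here; input_proj is a hypothetical name):

import torch
import torch.nn as nn

input_dim = 6        # raw features per timestep
embedding_dim = 128  # what the block stack's LayerNorm expects

# Hypothetical fix: project the features up to embedding_dim first.
input_proj = nn.Linear(input_dim, embedding_dim).to('cuda')

x = torch.randn(32, 10, input_dim).to('cuda')  # (batch, context_length, features)
h = input_proj(x)                              # -> (32, 10, 128), matching normalized_shape=[128]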
@Cram3r95 I think you have the wrong approach here: the size 4 in your example above is already considered the batch size, as the heads are only internal and not exposed.
@kpoeppel @maximilianmbeck