Hello Federico,

First of all, thank you very much for this repo; it looks like your solution is exactly the one I needed.

I wanted my batches to contain an equal number of samples from each class (in this case, 10 samples from each MNIST class). But when I use it as the sampler argument of torch.utils.data.DataLoader with a batch size of 100, the resulting loader is longer than it should be.

For example, the code below creates 675 trainloader items (this term might be wrong) instead of 600:
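(The snippet itself did not survive in the thread, so the following is a rough reconstruction of the setup described above; the `BalancedBatchSampler` import path and constructor arguments are assumptions, not this repo's confirmed API.)

```python
import torch
import torchvision
import torchvision.transforms as transforms

# Assumed import: the sampler class provided by this repo
from sampler import BalancedBatchSampler

train_set = torchvision.datasets.MNIST(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor(),
)

# 10 classes x 10 samples per class = batches of 100
train_loader = torch.utils.data.DataLoader(
    train_set,
    batch_size=100,
    sampler=BalancedBatchSampler(train_set, train_set.targets),
)

print(len(train_loader))  # expected 60,000 / 100 = 600; observed 675
```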
I attach the result I see in the Spyder IDE. Am I missing something? Shouldn't it be 600 instead of 675?

Thank you in advance.

I figured it out: it's because of the imbalance in your dataset. The while loop keeps running as long as the count is less than balanced_max. So if balanced_max (the number of samples in your largest class) is large and the other class counts are much smaller, additional batches will be created in order to cover all the samples from the largest class.
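Here is a toy illustration of that effect (all numbers are made up, just to show the arithmetic): the sampler effectively pads every class up to the size of the largest one, so the epoch length becomes balanced_max * num_classes rather than the true dataset size.

```python
from math import ceil

# Toy numbers, purely for illustration
class_counts = {0: 500, 1: 800, 2: 650}    # an imbalanced 3-class dataset
balanced_max = max(class_counts.values())  # size of the largest class: 800

# Each class is (over)sampled until it reaches balanced_max, so the epoch
# grows from the true dataset size to balanced_max * num_classes
true_size = sum(class_counts.values())          # 1950
padded_size = balanced_max * len(class_counts)  # 2400

batch_size = 100
print(ceil(true_size / batch_size))    # 20 batches you might expect
print(ceil(padded_size / batch_size))  # 24 batches you actually get
```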
I used the standard MNIST dataset for this, and that one isn't imbalanced.
After I couldn't find a solution I moved on to something else, but I will take another look at my implementation in light of your comment.
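For what it's worth, the class balance is easy to check directly. A quick sketch (assuming a standard torchvision install); if I remember the canonical counts correctly, the MNIST training split is in fact slightly imbalanced, which would be consistent with the explanation above:

```python
import torch
import torchvision

train_set = torchvision.datasets.MNIST(root="./data", train=True, download=True)
print(torch.bincount(train_set.targets))
# tensor([5923, 6742, 5958, 6131, 5842, 5421, 5918, 6265, 5851, 5949])

# The largest class (digit 1) has 6,742 samples, so padding every class to
# that size gives 6,742 * 10 = 67,420 samples per epoch, i.e.
# ceil(67,420 / 100) = 675 batches -- matching the 675 reported above.
```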