Failed to reproduce the result #3
Hello, thanks for your message. Could you give the comparison between the reported and reproduced results?
Thanks for your reply. I submitted the results, which are inferred with the cmd in the README. Specifically, I checked my terminal history, and the ckpt used for evaluation should be the correct one. Any idea on what's wrong?
I didn't see anything wrong and will train with the same script to reproduce the issue. BTW, may I ask if the visual backbone was frozen when training the model? Thanks :D
The backbone is not frozen for the “output_no_freeze” zip. The “output_freeze” zip is just my own setting to check how the score is influenced if the backbone is frozen; it was only a quick experiment, since it takes 8 hours less than the “no_freeze” one, so there is no need to care about the frozen result. Only pay attention to the “no_freeze” one when reproducing the reported result.
Hello! Any idea about what's going wrong? Or can you provide the specific training cmd setting?
I got the same incorrect results (0.45) with the provided script, and I conjecture the reason lies in the training epochs and lr_drop (it might be epochs=3 and lr_drop=2, but the provided ones are 2 and 1). I am training with this setting and will let you know this weekend. As for nproc_per_node=1, this is for single-GPU training; it should be set to the number of GPUs in your experiment. Thanks! :D
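For anyone following along, a minimal sketch of what the adjusted launch could look like, assuming a DETR-style entry script and a torch.distributed launcher; only epochs=3, lr_drop=2, and nproc_per_node come from the comment above, everything else is a placeholder:

```bash
# Minimal sketch, not the repo's actual README command.
# Assumptions: the entry point is main.py and it accepts --epochs, --lr_drop,
# and --output_dir flags (DETR-style); adjust to the real script in the repo.
GPUS=4  # set this to the number of GPUs actually used
python -m torch.distributed.launch --nproc_per_node=$GPUS --use_env \
    main.py \
    --epochs 3 \
    --lr_drop 2 \
    --output_dir output_no_freeze
```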
Thanks for your reply~
Hi~ Any progress? BTW, I'm curious about the training time for fine-tuning; maybe it could take many hours or days with only one GPU device?
Hello, sorry for the delay. The training takes several days to complete. Yes, increasing the number of GPUs would shorten it. The inconsistency between reported and reproduced performance lies in one of the training parameters in the provided script. Thank you so much for bringing this issue to my attention and for your patience! 🙌
Thanks for your patience! I will try it in the next few days~
Hi, I just tried to reproduce it with the new script, and just removed the flag you mentioned. There's still some margin between my results and the report; however, the score is 0.47446 now, +0.02 more than the online score without training the text encoder! I don't know whether it is the device or the inference command that leads to the margin.
Thanks for your great work. I fine-tuned the model with the cmd in the README, specifically:
Then I infer the output on valid (online) and valid_u (offline). I use the eval script in eval_mevis.py to calculate the offline metrics.
However, here's the result:
BTW, the inference settings are the same.
Any idea on how to reproduce the ckpt result?
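For completeness, a hedged sketch of the offline evaluation step described above; eval_mevis.py is the script named in this issue, but the flag names and paths below are illustrative assumptions, not its documented interface:

```bash
# Hedged sketch of the offline metric computation on valid_u.
# The argument names and paths are assumptions for illustration; check the
# actual arguments of eval_mevis.py in the repo before running.
PRED_DIR=output_no_freeze/valid_u   # hypothetical folder holding predicted masks
python eval_mevis.py --pred_path "$PRED_DIR" --split valid_u
```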