Hello, could you please give me some advice on why the size of the TinyCLIP-ViT-39M-16-Text-19M.bin model I distilled is 900 MB rather than 300 MB? Thanks very much! #254
Comments
The checkpoint also contains the master weights and the optimizer states. You can strip them and keep only the model weights:

```
import torch

ckpt = torch.load(checkpoint_fname)
new_ckpt = dict(state_dict=ckpt['state_dict'])
torch.save(new_ckpt, saved_fname)
```
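To see why dropping the extra entries shrinks the file, here is a tiny self-contained sketch of the same idea, using `pickle` and plain dicts in place of a real PyTorch checkpoint (all names and sizes are made up for illustration):

```python
import pickle

# Fake checkpoint: model weights plus optimizer state, mimicking
# what a training checkpoint typically contains alongside the weights.
weights = {"visual.proj": b"\x00" * 1000}          # ~1 KB of "weights"
ckpt = {
    "state_dict": weights,
    "optimizer": {"exp_avg": b"\x00" * 2000},      # Adam moments, ~2x the weights
}

full = pickle.dumps(ckpt)
slim = pickle.dumps({"state_dict": ckpt["state_dict"]})

# Dropping the optimizer entry shrinks the serialized file substantially.
print(len(slim) < len(full))  # True
```

The real `torch.save`/`torch.load` round trip behaves the same way at the level of dict keys: only the entries you keep in the saved dict contribute to the file size.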
Thanks very much!
After I modified the checkpoint according to this code, the accuracy on CIFAR is only 0.09, and the following warning is displayed. Could you please give me some advice? Thanks!

Some weights of the model checkpoint at clip were not used when initializing CLIPModel: ['state_dict']
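One possible reading of that warning (an assumption, not something confirmed in this thread): the loader is treating the top-level `'state_dict'` key itself as an unexpected weight name, which suggests it expects the parameter tensors at the top level of the saved dict rather than nested under `'state_dict'`. Sketched with plain dicts and made-up key names:

```python
# Hypothetical fix: unwrap the nested dict before saving, so parameter
# names sit at the top level (key names here are illustrative only).
nested = {"state_dict": {"visual.proj": [0.0], "text.proj": [0.0]}}
flat = nested["state_dict"]          # unwrap before saving/loading

print(sorted(flat))  # ['text.proj', 'visual.proj']
```

Even with a flat dict, open_clip parameter names generally differ from Hugging Face `CLIPModel` names, so a direct load may still need a key-mapping or conversion step; this is an assumption worth checking against the repo's conversion utilities.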
Hello, could you please give me some advice on why the size of the TinyCLIP-ViT-39M-16-Text-19M.bin model I distilled is 900 MB rather than 300 MB? Thanks very much! This is the training command I used:
```
export NNODES=1
export GPUS_PER_NODE=1
export WANDB__SERVICE_WAIT=60
export CUDA_VISIBLE_DEVICES=5
DISTRIBUTED_ARGS="--nproc_per_node $GPUS_PER_NODE --nnodes $NNODES"
torchrun $DISTRIBUTED_ARGS src/training/main.py \
    --save-frequency 1 \
    --report-to wandb \
    --train-data /home/gg/gg/MQBench-main/test/model/e1/split_2tar \
    --dataset-type webdataset \
    --imagenet-val ./ImageNet \
    --warmup 2000 \
    --batch-size 1024 \
    --epochs 25 \
    --workers 8 \
    --model TinyCLIP-ViT-39M-16-Text-19M \
    --name exp_name \
    --seed 0 \
    --local-loss \
    --grad-checkpointing \
    --output ./outputs/TinyCLIP-ViT-39M-16-Text-19M \
    --lr 0.0001 \
    --gather-with-grad \
    --pretrained-image-file ViT-B-16@openai \
    --pretrained-text-file ViT-B-16@openai \
    --distillation-teacher ViT-B-32@laion2b_e16 \
    --norm_gradient_clip 5 \
    --train-num-samples 15000000 \
    --logit-scale 50
```
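As a rough sanity check on the sizes in the title, a back-of-envelope estimate (all numbers approximate: ~39M image-tower plus ~19M text-tower parameters, fp32 storage, Adam keeping two moment buffers per parameter):

```python
# Back-of-envelope checkpoint size (illustrative numbers, not exact).
n_params = 58_000_000            # ~39M + ~19M parameters, approximate
fp32_bytes = 4
weights_mb = n_params * fp32_bytes / 1e6   # weight-only file: ~232 MB
adam_mb = 2 * weights_mb                   # exp_avg + exp_avg_sq: ~464 MB
total_mb = weights_mb + adam_mb            # ~700 MB before master weights

print(round(weights_mb), round(total_mb))  # 232 696
```

The gap between a ~200-300 MB weight-only file and a ~900 MB checkpoint is consistent with the maintainer's explanation above: the saved file carries optimizer moments (and possibly fp32 master weights) on top of the model weights.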