
After pre-training (pt) Qwen in bf16, loading it for inference raises "Only one of "bf16", "fp16", "fp32" can be true" #818

Closed
lrh000 opened this issue Sep 7, 2023 · 5 comments
Labels
solved This problem has been already solved

Comments


lrh000 commented Sep 7, 2023

Configuration used for pre-training (pt):

deepspeed --num_gpus 2 src/train_bash.py \
    --stage pt \
    --model_name_or_path /home/a8002/LLMs/Qwen-7B-Chat/ \
    --do_train \
    --dataset pt_course_xd \
    --template chatglm2 \
    --finetuning_type full \
    --lora_target query_key_value \
    --output_dir outputs/pt_b4_g16 \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 16 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1900 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --bf16 \
    --deepspeed ds_config.json \
    --preprocessing_num_workers 8 \
    --max_source_length 1024
DeepSpeed config (ds_config.json):

{
  "train_micro_batch_size_per_gpu": "auto",
  "train_batch_size": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "zero_allow_untested_optimizer": true,
  "bf16": {
    "enabled": true
  },
  "fp16": {
    "enabled": false
  },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "sub_group_size": 1e9,
    "reduce_bucket_size": "auto",
    "stage3_prefetch_bucket_size": "auto",
    "stage3_param_persistence_threshold": "auto",
    "stage3_max_live_parameters": 2e9,
    "stage3_max_reuse_distance": 2e9,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}

config.json of the saved model:

{
  "_name_or_path": "/home/a8002/LLMs/Qwen-7B-Chat/",
  "architectures": ["QWenLMHeadModel"],
  "attn_dropout_prob": 0.0,
  "auto_map": {
    "AutoConfig": "configuration_qwen.QWenConfig",
    "AutoModel": "modeling_qwen.QWenLMHeadModel",
    "AutoModelForCausalLM": "modeling_qwen.QWenLMHeadModel"
  },
  "bf16": true,
  "emb_dropout_prob": 0.0,
  "fp16": false,
  "fp32": false,
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 22016,
  "kv_channels": 128,
  "layer_norm_epsilon": 1e-06,
  "max_position_embeddings": 8192,
  "model_type": "qwen",
  "no_bias": true,
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "onnx_safe": null,
  "rotary_emb_base": 10000,
  "rotary_pct": 1.0,
  "scale_attn_weights": true,
  "seq_length": 2048,
  "tie_word_embeddings": false,
  "tokenizer_type": "QWenTokenizer",
  "torch_dtype": "bfloat16",
  "transformers_version": "4.31.0",
  "use_cache": true,
  "use_dynamic_ntk": true,
  "use_flash_attn": true,
  "use_logn_attn": true,
  "vocab_size": 151936
}
Loading with cli_demo:
python src/cli_demo.py --model_name_or_path PATH_to_PT_QWEN --template chatml

Error:
(screenshot of the error: Only one of "bf16", "fp16", "fp32" can be true)

Current workaround:
Changing bf16, fp16, and fp32 in the saved config to false allows the model to load, but generation then repeats without stopping until max_generating_length is reached, the same behavior as #778 (comment).
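As a possibly less invasive variant of that workaround, the flags could be overridden in memory at load time instead of editing config.json on disk. This is a minimal sketch using standard transformers APIs; the model path is a placeholder, and whether this also avoids the non-stop repetition is not verified here.

import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_path = "PATH_to_PT_QWEN"  # placeholder, same as in the cli_demo command above

# Load the saved config and clear the Qwen-specific precision flags so that
# only torch_dtype controls the load precision.
config = AutoConfig.from_pretrained(model_path, trust_remote_code=True)
for flag in ("bf16", "fp16", "fp32"):
    setattr(config, flag, False)

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    config=config,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).eval()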

@hiyouga added the pending (This problem is yet to be addressed) label on Sep 7, 2023
@SolarKnight1 commented:

You used --template chatglm2

lrh000 (Author) commented Sep 7, 2023

> You used --template chatglm2

The template is not used at the pt stage; passing it or not makes no difference. For details, see https://github.com/hiyouga/LLaMA-Efficient-Tuning/blob/0531886e1f534217dc3c9c0775d29fcf77ff7f5f/src/llmtuner/dsets/preprocess.py#L33
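
(Editorial note for readers following along: pre-training preprocessing of this kind tokenizes raw text and packs it into fixed-size blocks without applying any chat template. The sketch below is illustrative of that pattern; function and column names are assumptions, not the repo's exact code.)

from itertools import chain

def preprocess_pretrain_dataset(examples, tokenizer, block_size=1024):
    # Tokenize raw text directly; no chat template is involved at the pt stage.
    tokenized = tokenizer(
        [text + tokenizer.eos_token for text in examples["text"]],
        add_special_tokens=False,
    )
    # Concatenate all sequences and split them into fixed-size blocks.
    concatenated = {k: list(chain(*v)) for k, v in tokenized.items()}
    total_len = (len(concatenated["input_ids"]) // block_size) * block_size
    result = {
        k: [v[i : i + block_size] for i in range(0, total_len, block_size)]
        for k, v in concatenated.items()
    }
    result["labels"] = result["input_ids"].copy()
    return result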

@hiyouga added the solved label and removed the pending label on Sep 7, 2023
@hiyouga closed this as completed in 5a9970d on Sep 7, 2023
hiyouga (Owner) commented Sep 7, 2023

Fixed.

@fengcai24 commented:

Which part of the code had the problem? I don't see a commit with the fix. @hiyouga

@musexiaoluo commented:

> Which part of the code had the problem? I don't see a commit with the fix. @hiyouga

Where was the change made? Did you find it?
