llama 3.1 70B Instruct would not build engine "TypeError: set_shape(): incompatible function arguments." #2018
Comments
Hi @christian-ci, could you please share the config.json?
Hi @QiJune Here it is:

```json
{
"mlp_bias": false,
"attn_bias": false,
"rotary_base": 500000.0,
"rotary_scaling": {
"factor": 8.0,
"low_freq_factor": 1.0,
"high_freq_factor": 4.0,
"original_max_position_embeddings": 8192,
"rope_type": "llama3"
},
"residual_mlp": false,
"disable_weight_only_quant_plugin": false,
"moe": {
"num_experts": 0,
"top_k": 0,
"normalization_mode": null,
"tp_mode": 0
},
"architecture": "LlamaForCausalLM",
"dtype": "float16",
"vocab_size": 128256,
"hidden_size": 8192,
"num_hidden_layers": 80,
"num_attention_heads": 64,
"hidden_act": "silu",
"logits_dtype": "float32",
"norm_epsilon": 1e-05,
"position_embedding_type": "rope_gpt_neox",
"max_position_embeddings": 131072,
"num_key_value_heads": 8,
"intermediate_size": 28672,
"mapping": {
"world_size": 8,
"gpus_per_node": 8,
"tp_size": 8,
"pp_size": 1,
"moe_tp_size": 8,
"moe_ep_size": 1
},
"quantization": {
"quant_algo": null,
"kv_cache_quant_algo": null,
"group_size": 128,
"smoothquant_val": null,
"clamp_val": null,
"has_zero_point": false,
"pre_quant_scale": false,
"exclude_modules": null
},
"use_parallel_embedding": false,
"embedding_sharding_dim": 0,
"share_embedding_table": false,
"head_size": 128,
"qk_layernorm": false
}
```
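One detail worth flagging in this config (my observation, not stated in the thread): `rotary_scaling.factor` is a float, and it is the only shape-relevant value here that is not an integer. A quick hypothetical check:

```python
import json

# Hypothetical snippet: load the checkpoint config shared above and
# inspect the types of the shape-relevant fields.
with open("config.json") as f:
    cfg = json.load(f)

print(type(cfg["max_position_embeddings"]))   # <class 'int'>
print(type(cfg["rotary_scaling"]["factor"]))  # <class 'float'> -> 8.0
# Any shape bound computed by multiplying with this factor becomes a
# float, which TensorRT's set_shape() will not accept.
```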
Hi @christian-ci, I can reproduce this error. As a quick workaround, you can change this line: https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/commands/build.py#L465 from
to
We will fix it in the internal repo, which will be synced to GitHub next week.
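The before/after snippets did not survive in this thread, but judging from the error message and the later builder.py fix, the gist of the workaround is forcing the derived sequence length back to an integer. A hedged sketch, assuming the offending line computes a maximum sequence length from the rotary factor (variable names are hypothetical):

```python
# Sketch only: the exact statement at build.py#L465 differs by version.
max_position_embeddings = 131072         # from the config.json above
rotary_scaling = {"factor": 8.0}         # from the config.json above

# Before (hypothetical): the float factor contaminates the shape bound.
max_seq_len = max_position_embeddings * rotary_scaling["factor"]
print(max_seq_len)   # 1048576.0 -> a float, which set_shape() rejects

# Workaround: cast back to int so TensorRT receives integer dimensions.
max_seq_len = int(max_position_embeddings * rotary_scaling["factor"])
print(max_seq_len)   # 1048576
```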
@QiJune Finally tested, it works. For everyone else: make sure to change that line before building the image. Thanks!
Hey all, I'm getting the same error trying to run 3.1 8B. My command:
The error:
It also seems like the file @QiJune referenced has changed since then, so I'm not sure where to make that fix.
@LanceB57 You need to change this line: TensorRT-LLM/tensorrt_llm/builder.py, line 729 at commit 93293aa
to:
and rebuild tensorrt-llm. It won't work if you change the line but don't rebuild.
Great, thank you!
System Info
- TensorRT-LLM version: 0.12.0.dev2024072301
- Commit: 5fa9436e17c2f9aeace070f49aa645d2577f676b
- Commit: a6aa8eb6ce9371521df166c480e10262cd9c0cf4
- Driver version: 535.183.01, CUDA version: 12.2
- Driver version: 535.183.01, CUDA version: 12.4
Who can help?
@byshiue

Information

Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
Any of the scripts below will yield the same error:
Expected behavior
It builds the engine.
actual behavior
The build fails with `TypeError: set_shape(): incompatible function arguments.`
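The traceback itself was not quoted in the thread, but the error class is easy to reproduce in isolation. A minimal sketch, assuming the failure mode is a float leaking into an optimization-profile shape (the tensor name here is hypothetical):

```python
import tensorrt as trt

builder = trt.Builder(trt.Logger(trt.Logger.WARNING))
profile = builder.create_optimization_profile()

# Integer bounds are accepted.
profile.set_shape("input_ids", (1, 1), (1, 2048), (1, 4096))

# A float bound, e.g. one produced by multiplying an integer limit by
# rotary_scaling["factor"], is rejected by the pybind11 binding with the
# same error reported in this issue:
#   TypeError: set_shape(): incompatible function arguments.
profile.set_shape("input_ids", (1, 1), (1, 2048), (1, 4096.0))
```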
additional notes
No additional notes.