[bug] Medusa example fails with vicuna 33B #2478

Open · SoundProvider opened this issue Nov 21, 2024 · 2 comments

SoundProvider commented Nov 21, 2024

Thank you for developing trt-llm. It's helping me a lot.
I'm trying to use Medusa with trt-llm, referencing this page.

It works fine with vicuna 7B and its Medusa heads, with no errors at all.

However, with vicuna 33B and its trained heads, the following error occurs when executing trtllm-build.
Converting the checkpoint with Medusa completed with the result shown in the attached image.

## Running script

```bash
CUDA_VISIBLE_DEVICES=${DEVICES} \
trtllm-build --checkpoint_dir /app/medusa_test/tensorrt/${TP_SIZE}-gpu \
             --gpt_attention_plugin float16 \
             --gemm_plugin float16 \
             --context_fmha enable \
             --output_dir /app/medusa_test/tensorrt_llm/${TP_SIZE}-gpu \
             --speculative_decoding_mode medusa \
             --max_batch_size ${BATCH_SIZE} \
             --max_input_len ${SEQ_LEN} \
             --max_seq_len ${SEQ_LEN} \
             --max_num_tokens ${SEQ_LEN} \
             --workers ${TP_SIZE}
```
```text
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/queues.py", line 244, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/usr/lib/python3.10/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'MedusaConfig.__init__.<locals>.GenericMedusaConfig'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/commands/build.py", line 437, in parallel_build
    future.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/lib/python3.10/multiprocessing/queues.py", line 244, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/usr/lib/python3.10/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'MedusaConfig.__init__.<locals>.GenericMedusaConfig'
```
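
If I understand the traceback correctly, this is a general Python limitation rather than something specific to the checkpoint: `pickle` stores classes by reference (module plus qualified name), so a class defined inside a function or method (its `__qualname__` contains `<locals>`, like `GenericMedusaConfig` above) cannot be serialized when the build config is sent to worker processes. A minimal script that reproduces the same kind of failure — the names here are placeholders for illustration, not TRT-LLM code:

```python
import pickle


class ConfigLike:
    """Placeholder config object that defines a class inside __init__."""

    def __init__(self):
        # GenericConfig is a "local object": its __qualname__ contains '<locals>'.
        # pickle serializes classes by reference, so a class that is not
        # importable from a module's top level cannot be pickled.
        class GenericConfig:
            num_heads = 4

        self.generic = GenericConfig()


try:
    pickle.dumps(ConfigLike())
except Exception as err:
    # Expected to fail with something like:
    # AttributeError: Can't pickle local object 'ConfigLike.__init__.<locals>.GenericConfig'
    print(f"{type(err).__name__}: {err}")
```

If that is the root cause, the error would only surface when `parallel_build` actually pickles the config for worker processes, which the traceback suggests happens because `--workers ${TP_SIZE}` is greater than 1.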
hello-11 (Collaborator) commented

@SoundProvider, could you also show the command to convert the checkpoint?

SoundProvider (Author) commented

```bash
DEVICES=0,1,2,3
TP_SIZE=4
BATCH_SIZE=4


CUDA_VISIBLE_DEVICES=${DEVICES} \
python /app/tensorrt_llm/examples/medusa/convert_checkpoint.py \
                            --model_dir /app/models/vicuna-33b-v1.3 \
                            --medusa_model_dir /app/models/medusa-vicuna-33b-v1.3 \
                            --output_dir /app/models/medusa_test/tensorrt/${TP_SIZE}-gpu \
                            --dtype float16 \
                            --num_medusa_heads 4 \
                            --tp_size ${TP_SIZE}


CUDA_VISIBLE_DEVICES=${DEVICES} \
trtllm-build --checkpoint_dir /app/models/medusa_test/tensorrt/${TP_SIZE}-gpu \
             --gpt_attention_plugin float16 \
             --gemm_plugin float16 \
             --context_fmha enable \
             --output_dir /app/models/medusa_test/tensorrt_llm/${TP_SIZE}-gpu \
             --speculative_decoding_mode medusa \
             --max_batch_size ${BATCH_SIZE} \
             --workers ${TP_SIZE}
```

@hello-11 I use the Medusa example here.
