Replies: 1 comment
-
Is the error the same every time?
-
I'm trying to run inference on my tuned model; I tuned the HF weights, not the original ones, and I used custom data:
dataset:
  _component_: torchtune.datasets.text_completion_dataset
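In full, the dataset section was along these lines (the data file path below is a placeholder, and the exact parameter names can vary slightly between torchtune versions):

```yaml
dataset:
  _component_: torchtune.datasets.text_completion_dataset
  source: json               # loaded through Hugging Face datasets
  data_files: my_data.json   # placeholder path to the custom training data
  column: text               # column in the file holding the raw text
```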
I'm running: tune run generate --config inference.yaml prompt="What are some interesting sites to visit in the Bay Area?"
with this config:
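Roughly — the model builder, checkpoint file names, and paths below are placeholders standing in for my actual values, and the `_component_` paths can differ between torchtune versions:

```yaml
model:
  _component_: torchtune.models.llama2.llama2_7b      # example builder only

checkpointer:
  _component_: torchtune.utils.FullModelHFCheckpointer
  checkpoint_dir: /tmp/complete-model                  # directory with the tuned HF weights
  checkpoint_files: [pytorch_model-00001-of-00002.bin, pytorch_model-00002-of-00002.bin]
  output_dir: /tmp/complete-model
  model_type: LLAMA2

tokenizer:
  _component_: torchtune.models.llama2.llama2_tokenizer
  path: /tmp/complete-model/tokenizer.model

device: cuda
dtype: bf16
seed: 1234

prompt: "What are some interesting sites to visit in the Bay Area?"
max_new_tokens: 300
temperature: 0.6
top_k: 300
```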
Output:
DEBUG:torchtune.utils.logging:Setting manual seed to local seed 1234. Local seed is seed + rank = 1234 + 0
INFO:torchtune.utils.logging:Model is initialized with precision torch.bfloat16.
INFO:torchtune.utils.logging:What are some interesting sites to visit in the Bay Area?
INFO:torchtune.utils.logging:Time for inference: 0.65 sec total, 1.54 tokens/sec
INFO:torchtune.utils.logging:Bandwidth achieved: 31.64 GB/s
INFO:torchtune.utils.logging:Memory used: 20.62 GB
I would like help on how I can run inference with this model.
I also tried converting the weights using convert_hf_to_gguf.py from llama.cpp, but I get:
RuntimeError: Internal: could not parse ModelProto from /tmp/complete-model/tokenizer.model
I have tried replacing the tokenizer and re-downloading it from HF, but neither seems to work.
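For what it's worth, that "could not parse ModelProto" message comes from SentencePiece failing to deserialize the file, so a quick way to check the file on its own (assuming the sentencepiece package is installed) is:

```python
# Sanity check of the tokenizer file: if this raises the same "could not parse
# ModelProto" error, the file is not a valid SentencePiece model -- e.g. it may be
# a Hugging Face tokenizer.json or a Git LFS pointer stub instead of the binary
# tokenizer.model that ships with the base model.
import sentencepiece as spm

sp = spm.SentencePieceProcessor()
sp.Load("/tmp/complete-model/tokenizer.model")
print("vocab size:", sp.GetPieceSize())
```

If that load fails, the file at /tmp/complete-model/tokenizer.model is probably not the SentencePiece model from the base checkpoint, which would also explain why convert_hf_to_gguf.py cannot parse it.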