Custom fine-tuned DeepSeek coder model unable to be quantized to Fp16 #5234
Comments
Reads like a broken tokenizer file?
Thanks for your response. However, where do I find the vocab file in that Hugging Face repo? I assume you meant the vocab.json file?
Between the tokenizer and vocab files, I'm not sure which ones are actually used.
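For reference, one quick way to see which tokenizer files a checkpoint actually ships, and whether transformers loads it as a fast tokenizer backed by tokenizer.json, is a small script like the sketch below. The model directory name is only a placeholder, not something from this issue.

```python
# Sketch: inspect which tokenizer files a local HF checkpoint ships and how
# transformers loads them. The model path is an assumed placeholder.
from pathlib import Path
from transformers import AutoTokenizer

model_dir = Path("./deepseek-coder-7b-instruct")  # assumed local checkpoint path

# Files the converter may look at, depending on the vocab type chosen
for name in ("tokenizer.json", "tokenizer_config.json", "vocab.json",
             "merges.txt", "tokenizer.model"):
    print(f"{name}: {'present' if (model_dir / name).exists() else 'missing'}")

tok = AutoTokenizer.from_pretrained(model_dir)
print("fast tokenizer (tokenizer.json backed):", tok.is_fast)
print("tokenizer length:", len(tok))
```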
The files are not broken. This is an issue for other people as well. In fact, you don't have to quantize a custom DeepSeek model to get this error; if you just quantize the original 7b model, it will throw this error too.
Same story with the latest set of DeepSeek Math models: `python convert.py deepseek-math-7b-rl --vocab-type hfft --pad-vocab` and `python convert.py deepseek-math-7b-rl --vocab-type bpe --pad-vocab`
Any insights, @jackshiwl?
Can confirm this issue... although it converts the model using:
Hi all, I am not investigating this issue anymore; I am using another model. Hope someone can fix this / look into this. @cmp-nct
It seems there was a change recently that pins `bpe` to vocab.json. From the HF docs, it looks like any compatible `PreTrainedTokenizer` that transformers supports could be represented by tokenizer.json: https://huggingface.co/docs/transformers/en/fast_tokenizers
3 weeks ago (b2213), convert.py output:
Current mainline convert.py output:
Result running latest main:
Result running main from 3 weeks ago:
We still have our mismatch, but the type is `bpe` rather than `spm`, and it also produces text as expected (no garbage) rather than a segfault. Edit: I had another moment, so I tried just copying tokenizer.json to vocab.json and setting --vocab-type to bpe.
I confirmed that both b2213's and the current main's convert.py, if you do the above, generate an f32 with an identical sha256 hash.
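If it helps anyone else, here is roughly what that workaround looks like scripted end to end. It simply mirrors the copy-then-convert steps described above; the model directory is taken from the DeepSeek Math example earlier in the thread and is just an assumed local path.

```python
# Sketch of the workaround described above: copy tokenizer.json to vocab.json,
# then run convert.py with --vocab-type bpe. Paths are placeholders.
import shutil
import subprocess
from pathlib import Path

model_dir = Path("deepseek-math-7b-rl")  # assumed local model directory

# Give the bpe vocab loader the file it expects to find.
shutil.copyfile(model_dir / "tokenizer.json", model_dir / "vocab.json")

# Same invocation as earlier in the thread, just driven from Python.
subprocess.run(
    ["python", "convert.py", str(model_dir), "--vocab-type", "bpe", "--pad-vocab"],
    check=True,
)
```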
There's a PR from the DeepSeek team about this. Basically, their tokenizer needs to be supported in llama.cpp for this to work.
@Nold360 Yeah, I got the same error. Did you find any way to solve it? Thanks. It cannot quantize with:
This issue was closed because it has been inactive for 14 days since being marked as stale.
Hi,
I am trying to quantize my custom fine-tuned deepseek-7b instruct model, and I am unable to do so. I followed the document:
but it produces this error:
I cannot seem to find similar errors in the GitHub issues. Any insight into this would be greatly appreciated.
One can replicate this experiment by quantizing a deepseek 7b instruct coder model.
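One thing worth checking up front when reproducing is whether the tokenizer and the model config disagree on vocab size, since that kind of mismatch is what the --pad-vocab flag discussed above is meant to work around. A rough sketch, with the checkpoint path as an assumed placeholder:

```python
# Sketch: check for a tokenizer / model vocab-size mismatch before converting.
# The checkpoint path is a placeholder for the fine-tuned model directory.
from transformers import AutoConfig, AutoTokenizer

model_dir = "./deepseek-coder-7b-instruct-finetuned"  # assumed path

config = AutoConfig.from_pretrained(model_dir)
tok = AutoTokenizer.from_pretrained(model_dir)

print("config.vocab_size:", config.vocab_size)
print("len(tokenizer):   ", len(tok))
if len(tok) != config.vocab_size:
    print("Mismatch: conversion will likely complain unless --pad-vocab is used.")
```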