[Feature Request] Support for GGUF models (llama.cpp compatible) #12
Thank you for submitting. If you're using txtai 6.2+, you can do the following.

```yaml
# Embeddings index
writable: false
cloud:
  provider: huggingface-hub
  container: neuml/txtai-wikipedia

# llama.cpp pipeline
llama_cpp.Llama:
  model_path: path to GGUF file

# Extractor pipeline
extractor:
  path: llama_cpp.Llama
  output: reference

txtchat.pipeline.wikisearch.Wikisearch:
  # Add application reference
  application:

workflow:
  wikisearch:
    tasks:
      - action: txtchat.pipeline.wikisearch.Wikisearch
```

You just need to make sure you also have https://github.com/abetlen/llama-cpp-python installed.
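For reference, here is a minimal sketch of how that configuration could be run with txtai's Application API. It assumes the YAML above is saved as `wikisearch.yml` (a hypothetical file name) and that `model_path` points to an actual GGUF file on disk.

```python
# Sketch: load the YAML config with txtai's Application and run the wikisearch workflow.
# Assumes the config is saved as "wikisearch.yml" and model_path is a real GGUF file.
from txtai.app import Application

# Builds the embeddings index reference, llama.cpp pipeline and workflow from the YAML
app = Application("wikisearch.yml")

# Run a query through the wikisearch workflow; one result is yielded per input element
for result in app.workflow("wikisearch", ["Tell me about the Roman Empire"]):
    print(result)
```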
Thanks for this. The GGUF model loads correctly, though I am now getting the following error:
Did you run the exact configuration provided above?
Just added a fix with #13 that should fix the error.

If you install txtai from source, there is now direct support for llama.cpp models. See this article for more.
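As a rough illustration of what that direct support looks like, something along these lines should work with a source install of txtai; the model path below is a placeholder, not a specific recommendation.

```python
# Sketch: txtai's LLM pipeline loading a local GGUF file directly,
# which routes generation through llama.cpp under the hood.
from txtai.pipeline import LLM

# Path to a GGUF file on disk (placeholder)
llm = LLM("/path/to/model.gguf")

print(llm("Answer in one sentence: what is a GGUF file?"))
```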
These run on both GPU and CPU. Much of the OSS community uses them, and the models are quite light on VRAM.
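For context, a small sketch of what that looks like with llama-cpp-python, where `n_gpu_layers` controls how much of the model is offloaded to the GPU; the model path is a placeholder.

```python
# Sketch of llama-cpp-python usage; model path is a placeholder.
from llama_cpp import Llama

# n_gpu_layers=-1 offloads all layers to the GPU when built with GPU support;
# n_gpu_layers=0 keeps everything on the CPU.
llm = Llama(model_path="/path/to/model.gguf", n_gpu_layers=-1)

output = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
print(output["choices"][0]["text"])
```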