[Feature Request] Support for GGUF models (llama.cpp compatible) #12

Closed
syddharth opened this issue Nov 13, 2023 · 4 comments

@syddharth

These run on both GPU and CPU. A lot of the OSS community uses them, and the models are quite light on VRAM.

@davidmezzetti
Member

Thank you for submitting.

If you're using txtai 6.2+, you can do the following.

# Embeddings index
writable: false
cloud:
  provider: huggingface-hub
  container: neuml/txtai-wikipedia

# llama.cpp pipeline
llama_cpp.Llama:
  model_path: path to GGUF file

# Extractor pipeline
extractor:
  path: llama_cpp.Llama
  output: reference

txtchat.pipeline.wikisearch.Wikisearch:
  # Add application reference
  application:

workflow:
  wikisearch:
    tasks:
    - action: txtchat.pipeline.wikisearch.Wikisearch
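For reference, a minimal sketch of how a configuration like the one above could be loaded and run through txtai's Application API. The file name config.yml and the sample question are placeholders, not part of the original setup.

from txtai.app import Application

# Load the YAML configuration above (assumed to be saved as config.yml)
app = Application("config.yml")

# Run the wikisearch workflow defined in the configuration on a sample question
results = list(app.workflow("wikisearch", ["Tell me about the Roman Empire"]))
print(results)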

You just need to make sure you also have https://github.com/abetlen/llama-cpp-python installed.
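If you want to confirm the llama-cpp-python install and the GGUF file independently of txtchat, a quick sketch (the model path and prompt are placeholders):

from llama_cpp import Llama

# Load the GGUF model directly with llama.cpp (path is a placeholder)
llm = Llama(model_path="/path/to/model.gguf")

# Run a short completion to verify the model generates text
output = llm("Q: What is machine learning? A:", max_tokens=64)
print(output["choices"][0]["text"])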

@syddharth
Author

Thanks for this. The GGUF model loads correctly, but I am now getting the following error:

Traceback (most recent call last):
  File "C:\Users\mates\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\mates\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "c:\AI\T2T\txtchat\venv\lib\site-packages\txtchat\agent\__main__.py", line 21, in <module>
    agent = AgentFactory.create(sys.argv[1])
  File "c:\AI\T2T\txtchat\venv\lib\site-packages\txtchat\agent\factory.py", line 34, in create
    return RocketChat(config)
  File "c:\AI\T2T\txtchat\venv\lib\site-packages\txtchat\agent\rocketchat.py", line 30, in __init__
    super().__init__(config)
  File "c:\AI\T2T\txtchat\venv\lib\site-packages\txtchat\agent\base.py", line 32, in __init__
    self.application = Application(config)
  File "c:\AI\T2T\txtchat\venv\lib\site-packages\txtai\app\base.py", line 72, in __init__
    self.pipes()
  File "c:\AI\T2T\txtchat\venv\lib\site-packages\txtai\app\base.py", line 129, in pipes
    self.pipelines[pipeline] = PipelineFactory.create(config, pipeline)
  File "c:\AI\T2T\txtchat\venv\lib\site-packages\txtai\pipeline\factory.py", line 55, in create
    return pipeline if isinstance(pipeline, types.FunctionType) else pipeline(**config)
  File "c:\AI\T2T\txtchat\venv\lib\site-packages\txtchat\pipeline\wikisearch.py", line 32, in __init__
    self.workflow = Workflow([Question(action=application.pipelines["extractor"]), WikiAnswer()])
KeyError: 'extractor'

@davidmezzetti
Member

Did you run the exact configuration provided above?

@davidmezzetti davidmezzetti self-assigned this Dec 17, 2023
@davidmezzetti davidmezzetti added this to the v0.2.0 milestone Dec 17, 2023
@davidmezzetti
Member

Just added a fix with #13 that should resolve the KeyError you're receiving above.

If you install txtai from source, there is now direct support for llama.cpp models. See this article for more details.

https://neuml.hashnode.dev/integrate-llm-frameworks
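For context, a minimal sketch of what that direct support might look like with txtai's LLM pipeline, assuming it accepts a GGUF path directly as the article describes; the model path and prompt are placeholders:

from txtai.pipeline import LLM

# Point the LLM pipeline at a local GGUF file (path is a placeholder);
# models with a .gguf extension are handled by the llama.cpp backend
llm = LLM("/path/to/model.gguf")

print(llm("What is the capital of France?"))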
