
memgpt load directory with hugging-face embeddings fails to parse embed response #723

Closed
jimlloyd opened this issue Dec 27, 2023 · 4 comments · Fixed by #725
@jimlloyd (Contributor)

Describe the bug
This is the issue for the problem first reported in Discord: https://discordapp.com/channels/1161736243340640419/1162177332350558339/1189677007915720754

I ran this command:

memgpt load directory --name test_load --input-dir /Users/jim.lloyd/dev/{project}/cpp/{component} --recursive

Where {component} is a relatively small component of the large {project}.

The error is:

Traceback (most recent call last):
  File "/Users/jim.lloyd/.pyenv/versions/3.11.7/bin/memgpt", line 8, in <module>
    sys.exit(app())
             ^^^^^
  File "/Users/jim.lloyd/.pyenv/versions/3.11.7/lib/python3.11/site-packages/typer/main.py", line 328, in __call__
    raise e
  File "/Users/jim.lloyd/.pyenv/versions/3.11.7/lib/python3.11/site-packages/typer/main.py", line 311, in __call__
    return get_command(self)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jim.lloyd/.pyenv/versions/3.11.7/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jim.lloyd/.pyenv/versions/3.11.7/lib/python3.11/site-packages/typer/core.py", line 778, in main
    return _main(
           ^^^^^^
  File "/Users/jim.lloyd/.pyenv/versions/3.11.7/lib/python3.11/site-packages/typer/core.py", line 216, in _main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/jim.lloyd/.pyenv/versions/3.11.7/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jim.lloyd/.pyenv/versions/3.11.7/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jim.lloyd/.pyenv/versions/3.11.7/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jim.lloyd/.pyenv/versions/3.11.7/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jim.lloyd/.pyenv/versions/3.11.7/lib/python3.11/site-packages/typer/main.py", line 683, in wrapper
    return callback(**use_params)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jim.lloyd/.pyenv/versions/3.11.7/lib/python3.11/site-packages/memgpt/cli/cli_load.py", line 106, in load_directory
    store_docs(name, docs)
  File "/Users/jim.lloyd/.pyenv/versions/3.11.7/lib/python3.11/site-packages/memgpt/cli/cli_load.py", line 48, in store_docs
    len(node.embedding) == config.embedding_dim
AssertionError: Expected embedding dimension 1024, got 4: {'object': 'list', 'data': [{'object': 'embedding', 'embedding': [-0.009444456, 0.027610583, -0.0014550734, 0.021434475, ...
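The "got 4" in the assertion message is the number of top-level keys in the response object, since `len()` of a dict counts its keys rather than any embedding length. A minimal illustration (the response below is abbreviated; the `model` and `usage` fields are my assumption about what the truncated object contains):

```python
# Abbreviated stand-in for the response object shown in the traceback.
response = {
    "object": "list",
    "data": [{"object": "embedding", "embedding": [-0.009444456, 0.027610583]}],
    "model": "BAAI/bge-large-en-v1.5",   # assumed field; truncated in the traceback
    "usage": {"prompt_tokens": 0, "total_tokens": 0},  # assumed field
}

# len() of a dict counts its top-level keys, hence "got 4".
print(len(response))  # 4
```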

I am running text-embeddings-router locally and observed that it served about 50 requests, with log messages like:

2023-12-27T21:01:39.910312Z  INFO openai_embed{total_time="559.591458ms" tokenization_time="1.89375ms" queue_time="232.208µs" inference_time="557.183ms"}: text_embeddings_router::http::server: router/src/http/server.rs:697: Success
2023-12-27T21:01:40.396546Z  INFO openai_embed{total_time="470.952459ms" tokenization_time="575.875µs" queue_time="134.166µs" inference_time="470.176459ms"}: text_embeddings_router::http::server: router/src/http/server.rs:697: Success
2023-12-27T21:01:40.859433Z  INFO openai_embed{total_time="446.745416ms" tokenization_time="553.417µs" queue_time="118.459µs" inference_time="446.01825ms"}: text_embeddings_router::http::server: router/src/http/server.rs:697: Success
...

It seems to me that memgpt loaded the documents, split them into about 50 nodes, ran embedding requests on the text for all 50, and then began iterating over the responses, failing on the first iteration. The failure occurred because memgpt expected a simple array of 1024 floats but instead received a JSON object with multiple properties. The embedding vector was contained in the object, but in an unexpected location.
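One way to handle this on the client side would be to unwrap the OpenAI-style envelope before the dimension check. This is a rough sketch of the idea, not MemGPT's actual code; the function name and handling are mine:

```python
def extract_embedding(resp):
    """Return a flat list of floats from either a bare embedding list
    or an OpenAI-style response envelope."""
    if isinstance(resp, dict) and "data" in resp:
        # OpenAI-style: {'object': 'list', 'data': [{'embedding': [...]}, ...]}
        return resp["data"][0]["embedding"]
    # Already a flat list of floats.
    return resp

# Works for both response shapes:
flat = [0.1, 0.2, 0.3]
wrapped = {"object": "list", "data": [{"object": "embedding", "embedding": flat}]}
assert extract_embedding(flat) == flat
assert extract_embedding(wrapped) == flat
```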

Please describe your setup

I have spent several days hacking towards using memgpt as a coding assistant with a large C++ codebase. I probably cannot give a concise and accurate list of commands for how I got here, but after filing this I will try to reproduce the problem from scratch. In the meantime, here is what I can provide for my current setup:

Powerbook M2, 64 GB RAM
Sonoma 14.2
MemGPT version: 0.2.10
which memgpt: /Users/jim.lloyd/.pyenv/shims/memgpt

jim.lloyd@jimsm2 ~ % pyenv version
3.11.7 (set by PYENV_VERSION environment variable)

I'm pretty sure I installed all dependencies in this 3.11.7 environment this way:
python -m pip install memgpt
and then repeated that for any other requirements.

I run memgpt via iTerm or VS Code shells.



If you're not using OpenAI, please provide additional information on your local LLM setup:

Local LLM details
Currently using llama.cpp. I cloned the llama.cpp repo, built it with make -j 8, and then run it with:

./server -m models/openhermes-2.5-mistral-7b-16k.Q8_0.gguf -c 16384
@jimlloyd (Contributor, Author)

My memgpt config file:

[defaults]
preset = memgpt_chat
persona = memgpt_doc
human = basic

[model]
model_endpoint = http://localhost:8080
model_endpoint_type = llamacpp
model_wrapper = chatml-hints
context_window = 16384

[openai]
key = {omitted}

[embedding]
embedding_endpoint_type = hugging-face
embedding_endpoint = http://localhost:3000
embedding_model = BAAI/bge-large-en-v1.5
embedding_dim = 1024
embedding_chunk_size = 300

[archival_storage]
type = postgres
uri = postgresql://jim.lloyd@localhost:5432/memgpt2

[version]
memgpt_version = 0.2.10

[client]
anon_clientid = {omitted}

@jimlloyd (Contributor, Author)

I started from scratch and reproduced the error.

I did run the quickstart and it seemed to run without problems.

I then ran memgpt configure and memgpt run and discovered several mistakes in my setup:

  1. I installed pymemgpt but I needed to install pymemgpt[local]. After realizing this I just added that, hoping that the resulting installation would be correct.
  2. I discovered a couple more things I had to install separately: pgvector and psycopg2.
  3. I then discovered I had a typo in the URL for my postgres DB.

After fixing these problems, it seemed like I could run memgpt and carry out a dialog. (Though when I tried to give it my correct name I made a typo and wrote "my hame is jim". This really confused memgpt; it made several attempts to correct my name and eventually gave up.)

Finally, I ran the same memgpt load directory ... command as above and got the same error.

@cpacker (Collaborator)

cpacker commented Dec 28, 2023

Seems like this is an issue with TEI embeddings, looking into it.

Works fine with OpenAI:

(pymemgpt-py3.10) (base) loaner@MacBook-Pro-5 MemGPT-2 % memgpt load directory --name test_load --input-dir memgpt/personas/examples --recursive
LLM is explicitly disabled. Using MockLLM.
LLM is explicitly disabled. Using MockLLM.
Parsing nodes: 100%|███████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 30.85it/s]
Generating embeddings: 100%|███████████████████████████████████████████████████████████████████████| 100/100 [00:03<00:00, 31.43it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 609637.21it/s]
Generating embeddings: 0it [00:00, ?it/s]

Doesn't work with memgpt quickstart:

(pymemgpt-py3.10) (base) loaner@MacBook-Pro-5 MemGPT-2 % memgpt quickstart
📖 MemGPT configuration file updated!
🧠 model        -> ehartford/dolphin-2.5-mixtral-8x7b
🖥️  endpoint     -> https://api.memgpt.ai
⚡ Run "memgpt run" to create an agent with the new config.
(pymemgpt-py3.10) (base) loaner@MacBook-Pro-5 MemGPT-2 % memgpt load directory --name test_load2 --input-dir memgpt/personas/examples --recursive
LLM is explicitly disabled. Using MockLLM.
LLM is explicitly disabled. Using MockLLM.
Parsing nodes: 100%|███████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 50.50it/s]
Generating embeddings: 100%|███████████████████████████████████████████████████████████████████████| 100/100 [00:10<00:00,  9.56it/s]
  0%|                                                                                                        | 0/100 [00:00<?, ?it/s]
    return callback(**use_params)  # type: ignore
  File "/Users/loaner/dev/MemGPT-2/memgpt/cli/cli_load.py", line 106, in load_directory
    store_docs(name, docs)
  File "/Users/loaner/dev/MemGPT-2/memgpt/cli/cli_load.py", line 48, in store_docs
    len(node.embedding) == config.embedding_dim
AssertionError: Expected embedding dimension 1536, got 4: {'object': 'list', 'data': [{'object': 'embedding', 'embedding': [0.0072218 ...

@tbui-isgn

This looks like the issue I raised a few weeks ago (#587). Would be nice to see it fixed.

mattzh72 pushed a commit that referenced this issue Jan 23, 2025