-
-
Notifications
You must be signed in to change notification settings - Fork 632
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vectorstores: fix pgvector issues and add more test #617
Conversation
bf89d0c
to
22159ce
Compare
ef4f441
to
59a754c
Compare
.github/workflows/ci.yaml
Outdated
@@ -36,12 +36,31 @@ jobs: | |||
run: make build-examples | |||
build-test: | |||
runs-on: ubuntu-latest | |||
services: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given we now have testcontainers-go for that (see #519), I'd encourage to use the existing module for Postgres + pgvector.
I offer myself to assist in it. It will improve the dev experience, as we'll be able to run the very same tests the CI runs, but locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. pls take a look again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! In only 15s the code runs 10 containers! 👏👏👏
As a follow up, one container could be used, and each test would use it's own snapshot: https://golang.testcontainers.org/modules/postgres/#using-snapshots
5ee8944
to
2861cd0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@Abirdcfly I had a branch with some similar changes regarding pgvector: #648 If this PR gets into first, I'm happy to rebase mine, and if the other one gets into first, I'm happy to help here 🙋 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going head and merging #648 so this should be rebased given that.
1. add pgvector into test 2. add OPENAI_API_KEY and GENAI_API_KEY into test 3. deprecate pgvector table name Sanitize function 4. reset pgvector Search sql and make TestDeduplicater rerun 5. add test TestWithAllOptions for test all option 6. because of StuffDocuments.joinDocuments ignore document's metadata, update some tests Signed-off-by: Abirdcfly <[email protected]>
rebase done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, great improvements.
OPENAI_API_KEY
andGENAI_API_KEY
into test, like langchain-python do (But I don't know how langchain generates the openai key in the test, is it the maintainer's personal account token or is it dependent on the sponsorship...)Sanitize
function, Fix pgvectorWithEmbeddingTableName
sanitization conflicts with index creation #605Search
sql and make TestDeduplicater rerunTestWithAllOptions
for test all optionsStuffDocuments.joinDocuments
just ignore document's metadata, update some teststhen go test