Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Qdrant - Add group_by and group_size optional parameters to Retrievers #1054

Merged
merged 2 commits into from
Sep 12, 2024

Conversation

lambda-science
Copy link
Contributor

Related Issues

Proposed Changes:

Adding the group_by and group_size parameters to Qdrant Retrievers.
See: https://qdrant.tech/documentation/concepts/hybrid-queries/#grouping and https://api.qdrant.tech/master/api-reference/search/query-points-groups
This allow to specify a maximum number of chunks to retrieve for a specific metadata value. Useful to force the Retriever to retriever for many documents instead of focusing on one.

How did you test it?

Added new test for each retriever as well as fixing previous test like converting to dict.

Notes for the reviewer

The test are flaky sometimes in the number of documents retriever.
Technically it can never be more documents than top_k * group_size, but sometimes it's less. (I guess it can be less than top_k)
In the context of group_by search, the top_k represent the maximum number of groups, doc have been updated to reflect this.

This code is not optimal with the big IF/ELIF, but I didn't wanted to implement yet another Retriever to this integration.

Checklist

@lambda-science lambda-science requested a review from a team as a code owner September 6, 2024 11:41
@lambda-science lambda-science requested review from silvanocerza and removed request for a team September 6, 2024 11:41
@github-actions github-actions bot added integration:qdrant type:documentation Improvements or additions to documentation labels Sep 6, 2024
@lambda-science
Copy link
Contributor Author

@Anush008 could be reviewer also, maybe ?

Copy link
Contributor

@silvanocerza silvanocerza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplified a bit the ifs, the rest is good. 👍

@silvanocerza silvanocerza merged commit 8bac5de into deepset-ai:main Sep 12, 2024
10 checks passed
Amnah199 pushed a commit that referenced this pull request Oct 2, 2024
…rievers (#1054)

* Qdrant: Add group_by and group_size optional parameters to Retrievers

* Simplify ifs

---------

Co-authored-by: Silvano Cerza <[email protected]>
@lambda-science lambda-science deleted the feat/qdrant_group_by branch January 15, 2025 12:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integration:qdrant type:documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Qdrant: how to make a "group_by" search
2 participants