redisvector: fix score threshold option #1003

Merged: 1 commit into tmc:main, Sep 13, 2024

Conversation

acrmp
Contributor

@acrmp acrmp commented Sep 5, 2024

  • Redis vector range queries expect a radius rather than a score threshold.
  • The existing code passed the score threshold through directly as the radius param. This means that a higher score threshold results in more matches than a lower score threshold, which is backwards. Instead, this PR subtracts the score threshold from 1 and uses the result as the radius.
  • The langchain project appears to have deprecated score threshold in favour of distance threshold. A future improvement would be to add a WithDistanceThreshold option for consistency.
  • This PR is a breaking change for clients using redisvector and WithScoreThreshold. An alternative would be to keep the existing behaviour but add WithDistanceThreshold and guide users towards it.
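
The conversion described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the function name `radiusFromScoreThreshold`, the error value, and the range check on the input are all assumptions made for the example. It only shows the idea that the radius is 1 minus the score threshold, so a stricter (higher) score threshold gives a smaller radius and therefore fewer matches.

```go
package main

import (
	"errors"
	"fmt"
)

// errInvalidScoreThreshold is a hypothetical error used only in this sketch.
var errInvalidScoreThreshold = errors.New("score threshold must be between 0 and 1")

// radiusFromScoreThreshold is a hypothetical helper. It converts a similarity
// score threshold in [0, 1] into the radius passed to a vector range query,
// assuming distance = 1 - score. The range check is an assumption of this
// sketch, not something the PR description states.
func radiusFromScoreThreshold(scoreThreshold float32) (float32, error) {
	if scoreThreshold < 0 || scoreThreshold > 1 {
		return 0, errInvalidScoreThreshold
	}
	return 1 - scoreThreshold, nil
}

func main() {
	// A higher score threshold yields a smaller radius, hence fewer matches.
	for _, s := range []float32{0.5, 0.75, 1} {
		r, err := radiusFromScoreThreshold(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("score threshold %.2f -> radius %.2f\n", s, r)
	}
}
```

A WithDistanceThreshold option, if added later, would presumably skip this conversion and pass the caller's value through as the radius unchanged.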

PR Checklist

  • Read the Contributing documentation.
  • Read the Code of conduct documentation.
  • Name your Pull Request title clearly, concisely, and prefixed with the name of the primarily affected package you changed according to Good commit messages (such as memory: add interfaces for X, Y or util: add whizzbang helpers).
  • Check that there isn't already a PR that solves the problem the same way to avoid creating a duplicate.
  • Provide a description in this PR that addresses what the PR is solving, or reference the issue that it solves (e.g. Fixes #123).
  • Describes the source of new concepts.
  • References existing implementations as appropriate.
  • Contains test coverage for new functions.
  • Passes all golangci-lint checks. - There are some existing complaints about magic numbers not introduced by this PR.

Vector range queries expect a radius rather than a score threshold.
@tmc
Owner

tmc commented Sep 13, 2024

Bit on the fence about a breaking change but this seems more appropriate, in general.

@tmc
Owner

tmc commented Sep 13, 2024

Could you tackle WithDistanceThreshold? That sounds like a nice option.

Owner

@tmc tmc left a comment

LGTM

@tmc tmc merged commit 47d2d99 into tmc:main Sep 13, 2024
3 checks passed
2 participants