Dynamically choose default distance type based on vector type #3398

Open
wjones127 opened this issue Jan 20, 2025 · 0 comments
Labels
enhancement New feature or request

wjones127 (Contributor) commented Jan 20, 2025

The default distance type is L2, which works for float-based vectors. But for binary vectors, this results in a hard-to-understand error:

lance error:
  LanceError(IO):
    Execution error:
      LanceError(Arrow):
        Compute error: Unsupported data type: UInt8,
          /src/lance/rust/lance-index/src/vector/flat.rs:65:17,
       /src/lance/rust/lance/src/dataset/scanner.rs:2324:83

We should instead dynamically choose the default based on the data type. Float-based vectors can use L2. Binary vectors should use hamming.
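A minimal sketch of what that selection could look like. The `ElementType` and `DistanceType` enums and the `default_distance_type` function are hypothetical stand-ins for illustration, not Lance's actual types or API; the only rule taken from this issue is "float vectors default to L2, binary (`UInt8`) vectors default to hamming".

```rust
/// Hypothetical stand-in for the element type of a vector column
/// (not Lance's or Arrow's real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ElementType {
    Float16,
    Float32,
    Float64,
    UInt8, // binary vectors packed into bytes
}

/// Hypothetical stand-in for the distance type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DistanceType {
    L2,
    Hamming,
}

/// Pick a default distance type from the vector's element type,
/// used only when the caller did not specify one explicitly.
fn default_distance_type(element: ElementType) -> DistanceType {
    match element {
        // Binary vectors: L2 is not defined for them, so default to hamming.
        ElementType::UInt8 => DistanceType::Hamming,
        // Float-based vectors keep the current default.
        ElementType::Float16 | ElementType::Float32 | ElementType::Float64 => DistanceType::L2,
    }
}
```

With this in place, a query that omits the distance type would resolve it via something like `default_distance_type(ElementType::UInt8)` instead of unconditionally falling back to L2.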

Also, we should fix the error message: if given a wrong distance type, we should return Error::InvalidInput, and the message should say that the distance function doesn't support the data type. Note that the current message doesn't say which distance function has been selected.
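A rough sketch of the validation and message. Everything here is hypothetical and simplified: the local `Error` enum only mirrors the `Error::InvalidInput` variant named above, `check_distance_type` is an illustrative name, and the compatibility table (L2 for floats, hamming for `UInt8`) is an assumption based on this issue rather than the full set of combinations Lance supports.

```rust
use std::fmt;

/// Hypothetical stand-in for the vector element type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ElementType {
    Float32,
    UInt8,
}

/// Hypothetical stand-in for the distance type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DistanceType {
    L2,
    Hamming,
}

/// Simplified local error type; only mirrors the `InvalidInput` variant.
#[derive(Debug, PartialEq, Eq)]
enum Error {
    InvalidInput(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::InvalidInput(msg) => write!(f, "Invalid user input: {msg}"),
        }
    }
}

/// Validate the (distance type, element type) pair up front, so the user
/// sees which distance function was selected and why it was rejected.
fn check_distance_type(distance: DistanceType, element: ElementType) -> Result<(), Error> {
    // Assumed compatibility table, for illustration only.
    let supported = matches!(
        (distance, element),
        (DistanceType::L2, ElementType::Float32) | (DistanceType::Hamming, ElementType::UInt8)
    );
    if supported {
        Ok(())
    } else {
        Err(Error::InvalidInput(format!(
            "distance type {distance:?} does not support vectors of data type {element:?}"
        )))
    }
}
```

Running this check before the scan starts would surface the problem as invalid input naming both the distance type and the data type, rather than as a nested compute error from deep in the flat index code.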

@wjones127 wjones127 added the enhancement New feature or request label Jan 20, 2025
@wjones127 wjones127 added this to the Lance Papercuts milestone Jan 20, 2025