Dynamically choose default distance type based on vector type #3398

wjones127 · 2025-01-20T19:00:11Z

The default distance type is L2, which works for float-based vectors. But for binary vectors, this results in a hard-to-understand error:

lance error:
  LanceError(IO):
    Execution error:
      LanceError(Arrow):
        Compute error: Unsupported data type: UInt8,
          /src/lance/rust/lance-index/src/vector/flat.rs:65:17,
       /src/lance/rust/lance/src/dataset/scanner.rs:2324:83

We should instead dynamically choose the default based on the data type. Float-based vectors can use L2. Binary vectors should use hamming.
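A minimal sketch of the proposed behavior, to make the rule concrete. `ElementType`, `DistanceType`, `default_distance_type`, and `resolve_distance_type` are hypothetical stand-ins defined here for illustration, not Lance's actual types or functions, and the sketch assumes binary vectors are stored as `UInt8` (as the error above suggests):

```rust
/// Hypothetical stand-in for the element type of the vector column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ElementType {
    Float16,
    Float32,
    Float64,
    UInt8, // assumed representation of binary vectors
}

/// Hypothetical stand-in covering only the distance types named in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DistanceType {
    L2,
    Hamming,
}

/// Default distance type for a given element type:
/// float-based vectors get L2, binary vectors get hamming.
fn default_distance_type(element: ElementType) -> DistanceType {
    match element {
        ElementType::Float16 | ElementType::Float32 | ElementType::Float64 => DistanceType::L2,
        ElementType::UInt8 => DistanceType::Hamming,
    }
}

/// An explicit request always wins; the default applies only when
/// the caller did not pick a distance type.
fn resolve_distance_type(requested: Option<DistanceType>, element: ElementType) -> DistanceType {
    requested.unwrap_or_else(|| default_distance_type(element))
}
```

The key design point is that the default is resolved only once the column's data type is known, rather than being hard-coded to L2 up front.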

Also, we should fix the error message: if given a wrong distance type, we should return Error::InvalidInput, and the message should say that the distance function doesn't support the data type. Note that the current message doesn't say which distance function was selected.
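A rough sketch of what such a check could look like. The `Error` enum below is a simplified stand-in for illustration (the real `Error::InvalidInput` may carry different fields), `validate_distance_type` is a hypothetical helper, and the supported combinations are an assumption limited to the two cases discussed in this issue:

```rust
use std::fmt;

/// Hypothetical stand-in for the element type of the vector column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ElementType {
    Float32,
    UInt8,
}

/// Hypothetical stand-in covering only the distance types named in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DistanceType {
    L2,
    Hamming,
}

/// Simplified stand-in for the library's error type.
#[derive(Debug, PartialEq, Eq)]
enum Error {
    InvalidInput { message: String },
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::InvalidInput { message } => write!(f, "Invalid input: {}", message),
        }
    }
}

/// Reject unsupported combinations up front, naming both the selected
/// distance type and the offending data type in the message.
fn validate_distance_type(distance: DistanceType, element: ElementType) -> Result<(), Error> {
    // Assumed support matrix: L2 for floats, hamming for binary (UInt8) vectors.
    let supported = matches!(
        (distance, element),
        (DistanceType::L2, ElementType::Float32) | (DistanceType::Hamming, ElementType::UInt8)
    );
    if supported {
        Ok(())
    } else {
        Err(Error::InvalidInput {
            message: format!(
                "distance type {:?} does not support vector data type {:?}",
                distance, element
            ),
        })
    }
}
```

Validating before the search runs would surface the problem as an input error instead of a nested compute error, and the message would identify the distance function in use.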

wjones127 added the enhancement New feature or request label Jan 20, 2025

wjones127 added this to the Lance Papercuts milestone Jan 20, 2025
