Expose creating and querying index for a ReDap collection in python SDK #8890

Merged: 8 commits merged into main from zehiko/py-indexing on Feb 3, 2025

@zehiko commented Feb 3, 2025

What

Basic plumbing in the Python SDK for creating and querying vector and full-text search (FTS) indexes on a ReDap collection. Joint effort from @jleibs and me.

Testing done

Ran the latest Indexing notebook successfully.

@zehiko added the "exclude from changelog" (PRs with this won't show up in CHANGELOG.md) and "remote-store" (remote store gRPC API) labels Feb 3, 2025
@zehiko self-assigned this Feb 3, 2025

github-actions bot commented Feb 3, 2025

Web viewer built successfully. If applicable, you should also test it:

  • I have tested the web viewer
Commit: 88ee77c
Link: https://rerun.io/viewer/pr/8890
Manifest: +nightly +main

Note: This comment is updated whenever you push a commit.

@@ -68,3 +72,8 @@
        schema = conn.get_recording_schema(id)
        for column in schema:
            print(column)
    elif args.subcommand == "create-index":

Suggested change
-    elif args.subcommand == "create-index":
+    elif args.subcommand == "create-vector-index":

@@ -68,3 +72,8 @@
        schema = conn.get_recording_schema(id)
        for column in schema:
            print(column)
    elif args.subcommand == "create-index":
        column = rr.dataframe.ComponentColumnSelector(args.entity_path, args.index_column)
        index = rr.dataframe.IndexColumnSelector("log_tick")

Might as well make this (the hard-coded "log_tick" index) and the collection name configurable too.
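A minimal sketch of what the two review suggestions could look like together on the CLI side: the subcommand renamed to `create-vector-index`, with the index and collection name exposed as options. The `entity_path` and `index_column` arguments come from the diff above; the `--time-index` and `--collection` flag names and their defaults are assumptions made for illustration, not what the PR ships.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI with a `create-vector-index` subcommand (hypothetical flags)."""
    parser = argparse.ArgumentParser(description="Index management sketch")
    subparsers = parser.add_subparsers(dest="subcommand", required=True)

    create = subparsers.add_parser("create-vector-index")
    create.add_argument("entity_path")
    create.add_argument("index_column")
    # Hypothetical options: the review asks for these to be configurable,
    # but the flag names and default values here are assumptions.
    create.add_argument("--time-index", default="log_tick")
    create.add_argument("--collection", default="default")
    return parser


# Parse an example command line instead of sys.argv so the sketch runs standalone.
args = build_parser().parse_args(
    ["create-vector-index", "/embeddings", "vector", "--collection", "my_collection"]
)
print(args.subcommand, args.entity_path, args.index_column, args.time_index, args.collection)
```

Keeping "log_tick" as the default means existing invocations would behave as before, while callers that need a different index or collection can override it.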

column,
top_k,
))]
fn query_vector_index(

I'm realizing here and elsewhere that "query" isn't really the right term. I propose we swap it for "search":
https://github.com/rerun-io/dataplatform/issues/185
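To illustrate why "search" may describe the operation better, here is a rough sketch of what a `top_k` vector lookup returns: the rows nearest to a query vector, ranked by similarity. The function name `search_vector_index`, its signature, and the brute-force cosine scan are all assumptions for illustration; the real index lives in the remote store and is not implemented this way.

```python
import math


def search_vector_index(vectors, query, top_k):
    """Return indices of the `top_k` vectors most similar to `query` (cosine similarity).

    Hypothetical helper: a brute-force scan standing in for a real vector index.
    """

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    # Score every row, then keep the best `top_k` by descending similarity.
    scored = sorted(
        ((cosine(v, query), i) for i, v in enumerate(vectors)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [i for _, i in scored[:top_k]]


rows = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
result = search_vector_index(rows, [1.0, 0.0], top_k=2)
print(result)
```

The caller supplies an example and gets back ranked neighbours rather than rows matching a predicate, which is closer to what "search" usually means.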

@jleibs merged commit 4f1f44b into main Feb 3, 2025
32 checks passed
@jleibs deleted the zehiko/py-indexing branch February 3, 2025 21:02
2 participants