Commit

merged
marsupialtail committed Apr 25, 2024
2 parents 7cc04a4 + 55bb33e commit 493383b
Showing 21 changed files with 720 additions and 527 deletions.
7 changes: 4 additions & 3 deletions Cargo.toml
@@ -13,6 +13,7 @@ default = [] #['py']
py = ["dep:pyo3", "pyarrow", "dep:pyo3-log"]
pyarrow = ["arrow/pyarrow"]


[dependencies]
pyo3 = { version = "0.20.0", features = [
"extension-module",
@@ -23,7 +24,7 @@ pyo3-log = { version = "0.9.0", optional = true }
arrow = { version = "50.0.0", default-features = false }
tokenizers = { version = "0.15.2", features = ["http"] }
whatlang = "0.16.4"
-opendal = "0"
+opendal = { version = "0" }
zstd = "0.13.0" # Check for the latest version of zstd crate
serde = { version = "1.0", features = ["derive"] }
bincode = "1.3" # For serialization and deserialization
@@ -65,8 +66,8 @@ rand = "0.8.5"
serde_json = "1.0"
uuid = { version = "1.0", features = ["v4", "serde"] }
async-recursion = "1.0.5"
-aws-config = { version = "1.1.7", features = ["behavior-version-latest"] }
-aws-sdk-s3 = "1.23.0"
+aws-config = { version = "1.1.7", features = ["behavior-version-latest"]}
+aws-sdk-s3 = { version = "1.23.0" }
bitvector = "0.1.5"
ndarray = { version = "0.15.6", features = ["rayon", "serde"] }
numpy = "0.20.0"
Binary file added LogCloud.pdf
Binary file not shown.
8 changes: 6 additions & 2 deletions README.md
@@ -1,6 +1,6 @@
# Rottnest : Data Lake Indices

-You don't need ElasticSearch or some vector database to do full text search or vector search. Parquet + Rottnest is all you need. Rottnest is like Postgres indices for Parquet.
+You don't need ElasticSearch or some vector database to do full text search or vector search. Parquet + Rottnest is all you need. Rottnest is like Postgres indices for Parquet. Read more on what it can do for e.g. logs [here](LogCloud.pdf).

## Installation

@@ -46,5 +46,9 @@ Rottnest not only supports BM25 indices but also other indices, like regex and v

### Build Python wheel
```bash
-maturin develop --features py
+maturin develop --features "py,opendal"
```
or
```bash
maturin develop --features "py,aws_sdk"
```
226 changes: 0 additions & 226 deletions src/formats/io.rs

This file was deleted.

3 changes: 2 additions & 1 deletion src/formats/mod.rs
@@ -1,5 +1,6 @@
pub mod readers;

pub mod parquet;
-pub mod io;

pub use parquet::get_parquet_layout;
pub use parquet::read_indexed_pages;
