pgvector support for Ruby
For Rails, check out Neighbor
Add this line to your application’s Gemfile:
gem "pgvector"
And follow the instructions for your database library:
- pg
- Sequel
Or check out some examples:
- Embeddings with OpenAI
- Binary embeddings with Cohere
- Sentence embeddings with Informers
- Hybrid search with Informers (Reciprocal Rank Fusion)
- Sparse search with Transformers.rb
- Morgan fingerprints with RDKit.rb
- Topic modeling with tomoto.rb
- User-based recommendations with Disco
- Item-based recommendations with Disco
- Horizontal scaling with Citus
- Bulk loading with COPY
For pg, enable the extension
conn.exec("CREATE EXTENSION IF NOT EXISTS vector")
Optionally enable type casting for results
registry = PG::BasicTypeRegistry.new.define_default_types
Pgvector::PG.register_vector(registry)
conn.type_map_for_results = PG::BasicTypeMapForResults.new(conn, registry: registry)
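Without the type map, pg returns every column as a string, so an embedding arrives in pgvector's text form, such as "[1,2,3]". The sketch below shows roughly what the conversion amounts to. `parse_vector_text` is a hypothetical helper written for illustration and is not part of the gem; with the type map registered you should not need it.

```ruby
# Hypothetical helper (not part of the pgvector gem): converts pgvector's
# text form, e.g. "[1,2,3]", into an array of floats.
def parse_vector_text(text)
  inner = text.strip
  unless inner.start_with?("[") && inner.end_with?("]")
    raise ArgumentError, "expected a bracketed vector, got #{text.inspect}"
  end

  # Drop the brackets, split on commas, and convert each element strictly.
  inner[1..-2].split(",").map { |part| Float(part.strip) }
end

parsed = parse_vector_text("[1,2,3]")
puts parsed.inspect
```

`Float()` is used instead of `to_f` so that malformed elements raise an error rather than silently becoming 0.0.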
Create a table
conn.exec("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))")
Insert a vector
embedding = [1, 2, 3]
conn.exec_params("INSERT INTO items (embedding) VALUES ($1)", [embedding])
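The `vector(3)` column only accepts three-element vectors, and pgvector reads vectors in a bracketed, comma-separated text form. If you want to catch bad input before the round trip, a client-side check along these lines is one option. `vector_literal` and its `dimensions:` keyword are hypothetical names for this sketch, not part of the gem.

```ruby
# Hypothetical helper (not part of the pgvector gem): validates an array
# and builds pgvector's text form, e.g. "[1.0,2.0,3.0]".
def vector_literal(values, dimensions: nil)
  if dimensions && values.length != dimensions
    raise ArgumentError, "expected #{dimensions} dimensions, got #{values.length}"
  end

  # Float() raises on non-numeric input instead of coercing it to 0.0.
  floats = values.map { |v| Float(v) }
  unless floats.all?(&:finite?)
    raise ArgumentError, "vector elements must be finite"
  end

  "[#{floats.join(",")}]"
end

puts vector_literal([1, 2, 3], dimensions: 3)
```

The resulting string can be passed as the query parameter in place of the array.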
Get the nearest neighbors to a vector
conn.exec_params("SELECT * FROM items ORDER BY embedding <-> $1 LIMIT 5", [embedding]).to_a
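The `<->` operator in the query above is L2 (Euclidean) distance. To make the ordering concrete, here is a brute-force sketch in plain Ruby that ranks a few made-up rows the same way. It needs no database, and the `items` data and `l2_distance` helper are illustrative only.

```ruby
# L2 (Euclidean) distance, the quantity pgvector's <-> operator computes.
def l2_distance(a, b)
  raise ArgumentError, "dimension mismatch" unless a.length == b.length

  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Made-up rows standing in for the items table.
items = [
  { id: 1, embedding: [1, 2, 3] },
  { id: 2, embedding: [4, 5, 6] },
  { id: 3, embedding: [1, 2, 4] }
]
query = [1, 2, 3]

# Equivalent in spirit to: ORDER BY embedding <-> $1 LIMIT 5
nearest = items.sort_by { |item| l2_distance(item[:embedding], query) }.first(5)
puts nearest.map { |item| item[:id] }.inspect
```

This scans every row, which is what the database does too when no approximate index exists.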
Add an approximate index
conn.exec("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
# or
conn.exec("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")
Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
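Each operator class pairs with one query operator, and an index is only considered for queries that use the matching operator. The plain-Ruby sketch below lists the pairing and shows what the two non-Euclidean operators compute. Note that `<#>` returns the negative inner product, so that smaller still means closer. The helper names are made up for this illustration.

```ruby
# Operator class => query operator, as documented by pgvector.
OPERATORS = {
  "vector_l2_ops" => "<->",
  "vector_ip_ops" => "<#>",
  "vector_cosine_ops" => "<=>"
}.freeze

def inner_product(a, b)
  raise ArgumentError, "dimension mismatch" unless a.length == b.length

  a.zip(b).sum { |x, y| x * y }
end

# What <#> returns: the inner product with its sign flipped.
def negative_inner_product(a, b)
  -inner_product(a, b)
end

# What <=> returns: 1 minus the cosine similarity.
# For a zero vector this sketch yields NaN, since the norm is 0.0.
def cosine_distance(a, b)
  norm_a = Math.sqrt(inner_product(a, a))
  norm_b = Math.sqrt(inner_product(b, b))
  1.0 - inner_product(a, b) / (norm_a * norm_b)
end

puts negative_inner_product([1, 2, 3], [4, 5, 6])
puts cosine_distance([1, 0], [0, 1])
```

So a query ordered by `embedding <=> $1` pairs with an index created using vector_cosine_ops.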
For Sequel, enable the extension
DB.run("CREATE EXTENSION IF NOT EXISTS vector")
Create a table
DB.create_table :items do
  primary_key :id
  column :embedding, "vector(3)"
end
Add the plugin to your model
class Item < Sequel::Model
  plugin :pgvector, :embedding
end
Insert a vector
Item.create(embedding: [1, 1, 1])
Get the nearest neighbors to a record
item.nearest_neighbors(:embedding, distance: "euclidean").limit(5)
Also supports inner_product, cosine, taxicab, hamming, and jaccard distance
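For readers less familiar with the last three names: taxicab is L1 distance, while in pgvector hamming and jaccard apply to bit vectors. The plain-Ruby sketch below shows one way to compute each, using strings of "0" and "1" to stand in for bit vectors. The helper names and the handling of all-zero inputs are assumptions of this sketch, not the gem's API.

```ruby
# Taxicab (L1) distance: sum of absolute coordinate differences.
def taxicab_distance(a, b)
  raise ArgumentError, "dimension mismatch" unless a.length == b.length

  a.zip(b).sum { |x, y| (x - y).abs }
end

# Hamming distance: number of positions where the bits differ.
def hamming_distance(a, b)
  raise ArgumentError, "length mismatch" unless a.length == b.length

  a.chars.zip(b.chars).count { |x, y| x != y }
end

# Jaccard distance: 1 - |intersection| / |union| of the set bits.
def jaccard_distance(a, b)
  raise ArgumentError, "length mismatch" unless a.length == b.length

  pairs = a.chars.zip(b.chars)
  intersection = pairs.count { |x, y| x == "1" && y == "1" }
  union = pairs.count { |x, y| x == "1" || y == "1" }
  # Assumption for this sketch: two all-zero inputs count as distance 1.
  return 1.0 if union.zero?

  1.0 - intersection.to_f / union
end

puts taxicab_distance([1, 2, 3], [4, 0, 3])
puts hamming_distance("1010", "1001")
puts jaccard_distance("1100", "1010")
```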
Get the nearest neighbors to a vector
Item.nearest_neighbors(:embedding, [1, 1, 1], distance: "euclidean").limit(5)
Add an approximate index
DB.add_index :items, :embedding, type: "hnsw", opclass: "vector_l2_ops"
Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
View the changelog
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features
To get started with development:
git clone https://github.com/pgvector/pgvector-ruby.git
cd pgvector-ruby
createdb pgvector_ruby_test
bundle install
bundle exec rake test
To run an example:
cd examples/loading
bundle install
createdb pgvector_example
bundle exec ruby example.rb