Vector Search
Vector search finds documents by semantic similarity. Instead of matching keywords, it compares the meaning of the query against document embeddings in vector space.
Basic Usage
Builder API
#![allow(unused)]
fn main() {
use laurus::SearchRequestBuilder;
use laurus::vector::search::searcher::VectorSearchQuery;
use laurus::vector::store::request::QueryPayload;
use laurus::data::DataValue;
let request = SearchRequestBuilder::new()
.vector_query(
VectorSearchQuery::Payloads(vec![
QueryPayload {
field: "embedding".to_string(),
payload: DataValue::Text("systems programming language".to_string()),
weight: 1.0,
},
])
)
.limit(10)
.build();
let results = engine.search(request).await?;
}
The QueryPayload stores raw data (text, bytes, etc.) that will be embedded at search time using the configured embedder.
Query DSL
#![allow(unused)]
fn main() {
use laurus::vector::VectorQueryParser;
let parser = VectorQueryParser::new(embedder.clone())
.with_default_field("embedding");
let request = parser.parse(r#"embedding:"systems programming""#).await?;
}
VectorSearchQuery
The vector search query is specified as a VectorSearchQuery enum:
| Variant | Description |
|---|---|
Payloads(Vec<QueryPayload>) | Raw payloads (text, bytes, etc.) to be embedded at search time |
Vectors(Vec<QueryVector>) | Pre-embedded query vectors ready for nearest-neighbor search |
QueryPayload
| Field | Type | Description |
|---|---|---|
field | String | Target vector field name |
payload | DataValue | The payload to embed (e.g., DataValue::Text(...)) |
weight | f32 | Score weight (default: 1.0) |
QueryVector
| Field | Type | Description |
|---|---|---|
vector | Vector | Pre-computed dense vector embedding |
weight | f32 | Score weight (default: 1.0) |
fields | Option<Vec<String>> | Optional field restriction |
Examples
#![allow(unused)]
fn main() {
use laurus::vector::search::searcher::VectorSearchQuery;
use laurus::vector::store::request::{QueryPayload, QueryVector};
use laurus::vector::core::vector::Vector;
use laurus::data::DataValue;
// Text query (will be embedded at search time)
let query = VectorSearchQuery::Payloads(vec![
QueryPayload {
field: "text_vec".to_string(),
payload: DataValue::Text("machine learning".to_string()),
weight: 1.0,
},
]);
// Pre-computed vector
let query = VectorSearchQuery::Vectors(vec![
QueryVector {
vector: Vector::from(vec![0.1, 0.2, 0.3]),
weight: 1.0,
fields: Some(vec!["embedding".to_string()]),
},
]);
}
Multi-Field Vector Search
You can search across multiple vector fields in a single request:
#![allow(unused)]
fn main() {
use laurus::vector::search::searcher::VectorSearchQuery;
use laurus::vector::store::request::QueryPayload;
use laurus::data::DataValue;
let query = VectorSearchQuery::Payloads(vec![
QueryPayload {
field: "text_vec".to_string(),
payload: DataValue::Text("cute kitten".to_string()),
weight: 1.0,
},
QueryPayload {
field: "image_vec".to_string(),
payload: DataValue::Text("fluffy cat".to_string()),
weight: 1.0,
},
]);
}
Each clause produces a vector that is searched against its respective field. Results are combined using the configured score mode.
Score Modes
| Mode | Description |
|---|---|
WeightedSum (default) | Sum of (similarity * weight) across all clauses |
MaxSim | Maximum similarity score across clauses |
LateInteraction | ColBERT-style late interaction scoring |
Weights
Use the ^ boost syntax in DSL or weight in QueryVector to adjust how much each field contributes:
text_vec:"cute kitten"^1.0 image_vec:"fluffy cat"^0.5
This means text similarity counts twice as much as image similarity.
Filtered Vector Search
You can apply lexical filters to narrow the vector search results:
#![allow(unused)]
fn main() {
use laurus::SearchRequestBuilder;
use laurus::lexical::TermQuery;
use laurus::vector::search::searcher::VectorSearchQuery;
use laurus::vector::store::request::QueryPayload;
use laurus::data::DataValue;
// Vector search with a category filter
let request = SearchRequestBuilder::new()
.vector_query(
VectorSearchQuery::Payloads(vec![
QueryPayload {
field: "embedding".to_string(),
payload: DataValue::Text("machine learning".to_string()),
weight: 1.0,
},
])
)
.filter_query(Box::new(TermQuery::new("category", "tutorial")))
.limit(10)
.build();
let results = engine.search(request).await?;
}
The filter query runs first on the lexical index to identify allowed document IDs, then the vector search is restricted to those IDs.
Filter with Numeric Range
#![allow(unused)]
fn main() {
use laurus::lexical::NumericRangeQuery;
use laurus::lexical::core::field::NumericType;
let request = SearchRequestBuilder::new()
.vector_query(
VectorSearchQuery::Payloads(vec![
QueryPayload {
field: "embedding".to_string(),
payload: DataValue::Text("type systems".to_string()),
weight: 1.0,
},
])
)
.filter_query(Box::new(NumericRangeQuery::new(
"year", NumericType::Integer,
Some(2020.0), Some(2024.0), true, true
)))
.limit(10)
.build();
}
Distance Metrics
The distance metric is configured per field in the schema (see Vector Indexing):
| Metric | Description | Lower = More Similar |
|---|---|---|
| Cosine | 1 - cosine similarity | Yes |
| Euclidean | L2 distance | Yes |
| Manhattan | L1 distance | Yes |
| DotProduct | Negative inner product | Yes |
| Angular | Angular distance | Yes |
Code Example: Complete Vector Search
use std::sync::Arc;
use laurus::{Document, Engine, Schema, SearchRequestBuilder, PerFieldEmbedder};
use laurus::lexical::TextOption;
use laurus::vector::HnswOption;
use laurus::vector::search::searcher::VectorSearchQuery;
use laurus::vector::store::request::QueryPayload;
use laurus::data::DataValue;
use laurus::storage::memory::MemoryStorage;
#[tokio::main]
async fn main() -> laurus::Result<()> {
let storage = Arc::new(MemoryStorage::new(Default::default()));
let schema = Schema::builder()
.add_text_field("title", TextOption::default())
.add_hnsw_field("text_vec", HnswOption {
dimension: 384,
..Default::default()
})
.build();
// Set up per-field embedder
let embedder = Arc::new(my_embedder);
let pfe = PerFieldEmbedder::new(embedder.clone());
pfe.add_embedder("text_vec", embedder.clone());
let engine = Engine::builder(storage, schema)
.embedder(Arc::new(pfe))
.build()
.await?;
// Index documents (text in vector field is auto-embedded)
engine.add_document("doc-1", Document::builder()
.add_text("title", "Rust Programming")
.add_text("text_vec", "Rust is a systems programming language.")
.build()
).await?;
engine.commit().await?;
// Search by semantic similarity
let results = engine.search(
SearchRequestBuilder::new()
.vector_query(
VectorSearchQuery::Payloads(vec![
QueryPayload {
field: "text_vec".to_string(),
payload: DataValue::Text("systems language".to_string()),
weight: 1.0,
},
])
)
.limit(5)
.build()
).await?;
for r in &results {
println!("{}: score={:.4}", r.id, r.score);
}
Ok(())
}
Next Steps
- Combine with keyword search: Hybrid Search
- DSL syntax for vector queries: Query DSL