Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Vector Search

Vector search finds documents by semantic similarity. Instead of matching keywords, it compares the meaning of the query against document embeddings in vector space.

Basic Usage

Builder API

#![allow(unused)]
fn main() {
use laurus::SearchRequestBuilder;
use laurus::vector::search::searcher::VectorSearchQuery;
use laurus::vector::store::request::QueryPayload;
use laurus::data::DataValue;

let request = SearchRequestBuilder::new()
    .vector_query(
        VectorSearchQuery::Payloads(vec![
            QueryPayload {
                field: "embedding".to_string(),
                payload: DataValue::Text("systems programming language".to_string()),
                weight: 1.0,
            },
        ])
    )
    .limit(10)
    .build();

let results = engine.search(request).await?;
}

The QueryPayload stores raw data (text, bytes, etc.) that will be embedded at search time using the configured embedder.

Query DSL

#![allow(unused)]
fn main() {
use laurus::vector::VectorQueryParser;

let parser = VectorQueryParser::new(embedder.clone())
    .with_default_field("embedding");

let request = parser.parse(r#"embedding:"systems programming""#).await?;
}

VectorSearchQuery

The vector search query is specified as a VectorSearchQuery enum:

VariantDescription
Payloads(Vec<QueryPayload>)Raw payloads (text, bytes, etc.) to be embedded at search time
Vectors(Vec<QueryVector>)Pre-embedded query vectors ready for nearest-neighbor search

QueryPayload

FieldTypeDescription
fieldStringTarget vector field name
payloadDataValueThe payload to embed (e.g., DataValue::Text(...))
weightf32Score weight (default: 1.0)

QueryVector

FieldTypeDescription
vectorVectorPre-computed dense vector embedding
weightf32Score weight (default: 1.0)
fieldsOption<Vec<String>>Optional field restriction

Examples

#![allow(unused)]
fn main() {
use laurus::vector::search::searcher::VectorSearchQuery;
use laurus::vector::store::request::{QueryPayload, QueryVector};
use laurus::vector::core::vector::Vector;
use laurus::data::DataValue;

// Text query (will be embedded at search time)
let query = VectorSearchQuery::Payloads(vec![
    QueryPayload {
        field: "text_vec".to_string(),
        payload: DataValue::Text("machine learning".to_string()),
        weight: 1.0,
    },
]);

// Pre-computed vector
let query = VectorSearchQuery::Vectors(vec![
    QueryVector {
        vector: Vector::from(vec![0.1, 0.2, 0.3]),
        weight: 1.0,
        fields: Some(vec!["embedding".to_string()]),
    },
]);
}

You can search across multiple vector fields in a single request:

#![allow(unused)]
fn main() {
use laurus::vector::search::searcher::VectorSearchQuery;
use laurus::vector::store::request::QueryPayload;
use laurus::data::DataValue;

let query = VectorSearchQuery::Payloads(vec![
    QueryPayload {
        field: "text_vec".to_string(),
        payload: DataValue::Text("cute kitten".to_string()),
        weight: 1.0,
    },
    QueryPayload {
        field: "image_vec".to_string(),
        payload: DataValue::Text("fluffy cat".to_string()),
        weight: 1.0,
    },
]);
}

Each clause produces a vector that is searched against its respective field. Results are combined using the configured score mode.

Score Modes

ModeDescription
WeightedSum (default)Sum of (similarity * weight) across all clauses
MaxSimMaximum similarity score across clauses
LateInteractionColBERT-style late interaction scoring

Weights

Use the ^ boost syntax in DSL or weight in QueryVector to adjust how much each field contributes:

text_vec:"cute kitten"^1.0 image_vec:"fluffy cat"^0.5

This means text similarity counts twice as much as image similarity.

You can apply lexical filters to narrow the vector search results:

#![allow(unused)]
fn main() {
use laurus::SearchRequestBuilder;
use laurus::lexical::TermQuery;
use laurus::vector::search::searcher::VectorSearchQuery;
use laurus::vector::store::request::QueryPayload;
use laurus::data::DataValue;

// Vector search with a category filter
let request = SearchRequestBuilder::new()
    .vector_query(
        VectorSearchQuery::Payloads(vec![
            QueryPayload {
                field: "embedding".to_string(),
                payload: DataValue::Text("machine learning".to_string()),
                weight: 1.0,
            },
        ])
    )
    .filter_query(Box::new(TermQuery::new("category", "tutorial")))
    .limit(10)
    .build();

let results = engine.search(request).await?;
}

The filter query runs first on the lexical index to identify allowed document IDs, then the vector search is restricted to those IDs.

Filter with Numeric Range

#![allow(unused)]
fn main() {
use laurus::lexical::NumericRangeQuery;
use laurus::lexical::core::field::NumericType;

let request = SearchRequestBuilder::new()
    .vector_query(
        VectorSearchQuery::Payloads(vec![
            QueryPayload {
                field: "embedding".to_string(),
                payload: DataValue::Text("type systems".to_string()),
                weight: 1.0,
            },
        ])
    )
    .filter_query(Box::new(NumericRangeQuery::new(
        "year", NumericType::Integer,
        Some(2020.0), Some(2024.0), true, true
    )))
    .limit(10)
    .build();
}

Distance Metrics

The distance metric is configured per field in the schema (see Vector Indexing):

MetricDescriptionLower = More Similar
Cosine1 - cosine similarityYes
EuclideanL2 distanceYes
ManhattanL1 distanceYes
DotProductNegative inner productYes
AngularAngular distanceYes
use std::sync::Arc;
use laurus::{Document, Engine, Schema, SearchRequestBuilder, PerFieldEmbedder};
use laurus::lexical::TextOption;
use laurus::vector::HnswOption;
use laurus::vector::search::searcher::VectorSearchQuery;
use laurus::vector::store::request::QueryPayload;
use laurus::data::DataValue;
use laurus::storage::memory::MemoryStorage;

#[tokio::main]
async fn main() -> laurus::Result<()> {
    let storage = Arc::new(MemoryStorage::new(Default::default()));

    let schema = Schema::builder()
        .add_text_field("title", TextOption::default())
        .add_hnsw_field("text_vec", HnswOption {
            dimension: 384,
            ..Default::default()
        })
        .build();

    // Set up per-field embedder
    let embedder = Arc::new(my_embedder);
    let pfe = PerFieldEmbedder::new(embedder.clone());
    pfe.add_embedder("text_vec", embedder.clone());

    let engine = Engine::builder(storage, schema)
        .embedder(Arc::new(pfe))
        .build()
        .await?;

    // Index documents (text in vector field is auto-embedded)
    engine.add_document("doc-1", Document::builder()
        .add_text("title", "Rust Programming")
        .add_text("text_vec", "Rust is a systems programming language.")
        .build()
    ).await?;
    engine.commit().await?;

    // Search by semantic similarity
    let results = engine.search(
        SearchRequestBuilder::new()
            .vector_query(
                VectorSearchQuery::Payloads(vec![
                    QueryPayload {
                        field: "text_vec".to_string(),
                        payload: DataValue::Text("systems language".to_string()),
                        weight: 1.0,
                    },
                ])
            )
            .limit(5)
            .build()
    ).await?;

    for r in &results {
        println!("{}: score={:.4}", r.id, r.score);
    }

    Ok(())
}

Next Steps