Vector Indexing

Vector indexing powers similarity-based search. When a document’s vector field is indexed, Laurus stores the embedding vector in a specialized index structure that enables fast approximate nearest neighbor (ANN) retrieval.

How Vector Indexing Works

sequenceDiagram
    participant Doc as Document
    participant Embedder
    participant Normalize as Normalizer
    participant Index as Vector Index

    Doc->>Embedder: "Rust is a systems language"
    Embedder-->>Normalize: [0.12, -0.45, 0.78, ...]
    Normalize->>Normalize: L2 normalize
    Normalize-->>Index: [0.14, -0.52, 0.90, ...]
    Index->>Index: Insert into index structure

Step by Step

Embed: The text (or image) is converted to a vector by the configured embedder
Normalize: The vector is L2-normalized (for cosine similarity)
Index: The vector is inserted into the configured index structure (Flat, HNSW, or IVF)
Commit: On commit(), the index is flushed to persistent storage

Index Types

Laurus supports three vector index types, each with different performance characteristics:

Comparison

Property	Flat	HNSW	IVF
Accuracy	100% (exact)	~95-99% (approximate)	~90-98% (approximate)
Search speed	O(n) linear scan	O(log n) graph walk	O(n/k) cluster scan
Memory usage	Low	Higher (graph edges)	Moderate (centroids)
Index build time	Fast	Moderate	Slower (clustering)
Best for	< 10K vectors	10K - 10M vectors	> 1M vectors

Flat Index

The simplest index. Compares the query vector against every stored vector (brute-force).

#![allow(unused)]
fn main() {
use laurus::vector::FlatOption;
use laurus::vector::core::distance::DistanceMetric;

let opt = FlatOption {
    dimension: 384,
    distance: DistanceMetric::Cosine,
    ..Default::default()
};
}

Pros: 100% recall (exact results), simple, low memory
Cons: Slow for large datasets (linear scan)
Use when: You have fewer than ~10,000 vectors, or you need exact results

HNSW Index

Hierarchical Navigable Small World graph. The default and most commonly used index type.

graph TB
    subgraph "Layer 2 (sparse)"
        A2["A"] --- C2["C"]
    end

    subgraph "Layer 1 (medium)"
        A1["A"] --- B1["B"]
        A1 --- C1["C"]
        B1 --- D1["D"]
        C1 --- D1
    end

    subgraph "Layer 0 (dense - all vectors)"
        A0["A"] --- B0["B"]
        A0 --- C0["C"]
        B0 --- D0["D"]
        B0 --- E0["E"]
        C0 --- D0
        C0 --- F0["F"]
        D0 --- E0
        E0 --- F0
    end

    A2 -.->|"entry point"| A1
    A1 -.-> A0
    C2 -.-> C1
    C1 -.-> C0
    B1 -.-> B0
    D1 -.-> D0

The HNSW algorithm searches from the top (sparse) layer down to the bottom (dense) layer, narrowing the search space at each level.

#![allow(unused)]
fn main() {
use laurus::vector::HnswOption;
use laurus::vector::core::distance::DistanceMetric;

let opt = HnswOption {
    dimension: 384,
    distance: DistanceMetric::Cosine,
    m: 16,                  // max connections per node per layer
    ef_construction: 200,   // search width during index building
    ..Default::default()
};
}

HNSW Parameters

Parameter	Default	Description	Impact
`m`	16	Max bi-directional connections per layer	Higher = better recall, more memory
`ef_construction`	200	Search width during index building	Higher = better recall, slower build
`dimension`	128	Vector dimensions	Must match embedder output
`distance`	Cosine	Distance metric	See Distance Metrics below

Tuning tips:

Increase m (e.g., 32 or 64) for higher recall at the cost of memory
Increase ef_construction (e.g., 400) for better index quality at the cost of build time
At search time, the ef_search parameter (set in the search request) controls the search width

IVF Index

Inverted File Index. Partitions vectors into clusters, then only searches relevant clusters.

graph TB
    Q["Query Vector"]
    Q --> C1["Cluster 1\n(centroid)"]
    Q --> C2["Cluster 2\n(centroid)"]

    C1 --> V1["vec_3"]
    C1 --> V2["vec_7"]
    C1 --> V3["vec_12"]

    C2 --> V4["vec_1"]
    C2 --> V5["vec_9"]
    C2 --> V6["vec_15"]

    style C1 fill:#f9f,stroke:#333
    style C2 fill:#f9f,stroke:#333

#![allow(unused)]
fn main() {
use laurus::vector::IvfOption;
use laurus::vector::core::distance::DistanceMetric;

let opt = IvfOption {
    dimension: 384,
    distance: DistanceMetric::Cosine,
    n_clusters: 100,   // number of clusters
    n_probe: 10,       // clusters to search at query time
    ..Default::default()
};
}

IVF Parameters

Parameter	Default	Description	Impact
`n_clusters`	100	Number of Voronoi cells	More clusters = faster search, lower recall
`n_probe`	1	Clusters to search at query time	Higher = better recall, slower search
`dimension`	(required)	Vector dimensions	Must match embedder output
`distance`	Cosine	Distance metric	See Distance Metrics below

Tuning tips:

Set n_clusters to roughly sqrt(n) where n is the number of vectors
Set n_probe to 5-20% of n_clusters for a good recall/speed trade-off
IVF requires a training phase — initial indexing may be slower

Distance Metrics

Metric	Description	Range	Best For
`Cosine`	1 - cosine similarity	[0, 2]	Text embeddings (most common)
`Euclidean`	L2 distance	[0, +inf)	Spatial data
`Manhattan`	L1 distance	[0, +inf)	Feature vectors
`DotProduct`	Negative inner product	(-inf, +inf)	Pre-normalized vectors
`Angular`	Angular distance	[0, pi]	Directional similarity

#![allow(unused)]
fn main() {
use laurus::vector::core::distance::DistanceMetric;

let metric = DistanceMetric::Cosine;      // Default for text
let metric = DistanceMetric::Euclidean;    // For spatial data
let metric = DistanceMetric::Manhattan;    // L1 distance
let metric = DistanceMetric::DotProduct;   // For pre-normalized vectors
let metric = DistanceMetric::Angular;      // Angular distance
}

Note: For cosine similarity, vectors are automatically L2-normalized before indexing. Lower distance = more similar.

Quantization

Quantization reduces memory usage by compressing vectors at the cost of some accuracy:

Method	Enum Variant	Description	Memory Reduction
Scalar 8-bit	`Scalar8Bit`	Scalar quantization to 8-bit integers	~4x
Product Quantization	`ProductQuantization { subvector_count }`	Splits vectors into sub-vectors and quantizes each	~16-64x

#![allow(unused)]
fn main() {
use laurus::vector::HnswOption;
use laurus::vector::core::quantization::QuantizationMethod;

let opt = HnswOption {
    dimension: 384,
    quantizer: Some(QuantizationMethod::Scalar8Bit),
    ..Default::default()
};
}

Segment Files

Each vector index type stores its data in a single segment file:

Index Type	File Extension	Contents
HNSW	`.hnsw`	Graph structure, vectors, and metadata
Flat	`.flat`	Raw vectors and metadata
IVF	`.ivf`	Cluster centroids, assigned vectors, and metadata

Code Example

use std::sync::Arc;
use laurus::{Document, Engine, Schema};
use laurus::lexical::TextOption;
use laurus::vector::HnswOption;
use laurus::vector::core::distance::DistanceMetric;
use laurus::storage::memory::MemoryStorage;

#[tokio::main]
async fn main() -> laurus::Result<()> {
    let storage = Arc::new(MemoryStorage::new(Default::default()));
    let schema = Schema::builder()
        .add_text_field("title", TextOption::default())
        .add_hnsw_field("embedding", HnswOption {
            dimension: 384,
            distance: DistanceMetric::Cosine,
            m: 16,
            ef_construction: 200,
            ..Default::default()
        })
        .build();

    // With an embedder, text in vector fields is automatically embedded
    let engine = Engine::builder(storage, schema)
        .embedder(my_embedder)
        .build()
        .await?;

    // Add text to the vector field — it will be embedded automatically
    engine.add_document("doc-1", Document::builder()
        .add_text("title", "Rust Programming")
        .add_text("embedding", "Rust is a systems programming language.")
        .build()
    ).await?;

    engine.commit().await?;

    Ok(())
}

Next Steps

Search the vector index: Vector Search
Combine with lexical search: Hybrid Search

Keyboard shortcuts

Laurus Documentation