Vector Indexing

Vector indexing powers similarity-based search. When a document’s vector field is indexed, Laurus stores the embedding vector in a specialized index structure that enables fast approximate nearest neighbor (ANN) retrieval.

How Vector Indexing Works

sequenceDiagram
    participant Doc as Document
    participant Embedder
    participant Normalize as Normalizer
    participant Index as Vector Index

    Doc->>Embedder: "Rust is a systems language"
    Embedder-->>Normalize: [0.12, -0.45, 0.78, ...]
    Normalize->>Normalize: L2 normalize
    Normalize-->>Index: [0.14, -0.52, 0.90, ...]
    Index->>Index: Insert into index structure

Step by Step

  1. Embed: The text (or image) is converted to a vector by the configured embedder
  2. Normalize: The vector is L2-normalized (for cosine similarity)
  3. Index: The vector is inserted into the configured index structure (Flat, HNSW, or IVF)
  4. Commit: On commit(), the index is flushed to persistent storage
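
The normalization in step 2 can be sketched in plain Rust. This is a minimal illustration, not the Laurus implementation: the function name `l2_normalize` and the choice to return zero vectors unchanged are assumptions made for the example.

```rust
/// Scale `v` to unit L2 norm so that cosine similarity reduces to a dot product.
/// Hypothetical helper for illustration; zero vectors are returned unchanged.
fn l2_normalize(v: &[f32]) -> Vec<f32> {
    // L2 norm: square root of the sum of squared components.
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}
```

For example, `l2_normalize(&[3.0, 4.0])` yields a vector close to `[0.6, 0.8]`, whose length is 1.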

Index Types

Laurus supports three vector index types, each with different performance characteristics:

Comparison

| Property | Flat | HNSW | IVF |
|---|---|---|---|
| Accuracy | 100% (exact) | ~95-99% (approximate) | ~90-98% (approximate) |
| Search speed | O(n) linear scan | O(log n) graph walk | O(n/k) cluster scan |
| Memory usage | Low | Higher (graph edges) | Moderate (centroids) |
| Index build time | Fast | Moderate | Slower (clustering) |
| Best for | < 10K vectors | 10K - 10M vectors | > 1M vectors |

Flat Index

The simplest index. Compares the query vector against every stored vector (brute-force).

#![allow(unused)]
fn main() {
use laurus::vector::FlatOption;
use laurus::vector::core::distance::DistanceMetric;

let opt = FlatOption {
    dimension: 384,
    distance: DistanceMetric::Cosine,
    ..Default::default()
};
}

  • Pros: 100% recall (exact results), simple, low memory
  • Cons: Slow for large datasets (linear scan)
  • Use when: You have fewer than ~10,000 vectors, or you need exact results
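
To make the brute-force idea concrete, here is a minimal sketch of an exact nearest-neighbor scan in plain Rust. It does not use the Laurus API; `cosine_distance` and `flat_search` are hypothetical names, and treating zero vectors as orthogonal is an assumption of the example.

```rust
/// Cosine distance (1 - cosine similarity) between two equal-length vectors.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        return 1.0; // assumption: treat zero vectors as orthogonal
    }
    1.0 - dot / (na * nb)
}

/// Exact k-nearest-neighbor search: score every stored vector, sort, keep the best k.
/// Returns (index, distance) pairs ordered from most to least similar.
fn flat_search(vectors: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = vectors
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine_distance(v, query)))
        .collect();
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(k);
    scored
}
```

Every query touches all stored vectors, which is why the cost grows linearly with the collection size.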

HNSW Index

Hierarchical Navigable Small World graph. The default and most commonly used index type.

graph TB
    subgraph "Layer 2 (sparse)"
        A2["A"] --- C2["C"]
    end

    subgraph "Layer 1 (medium)"
        A1["A"] --- B1["B"]
        A1 --- C1["C"]
        B1 --- D1["D"]
        C1 --- D1
    end

    subgraph "Layer 0 (dense - all vectors)"
        A0["A"] --- B0["B"]
        A0 --- C0["C"]
        B0 --- D0["D"]
        B0 --- E0["E"]
        C0 --- D0
        C0 --- F0["F"]
        D0 --- E0
        E0 --- F0
    end

    A2 -.->|"entry point"| A1
    A1 -.-> A0
    C2 -.-> C1
    C1 -.-> C0
    B1 -.-> B0
    D1 -.-> D0

The HNSW algorithm searches from the top (sparse) layer down to the bottom (dense) layer, narrowing the search space at each level.
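
The top-down descent can be sketched as a greedy walk over layered adjacency lists. This is a simplified illustration under stated assumptions, not how Laurus implements HNSW: it tracks a single best candidate per layer (a real search keeps a candidate list whose width is `ef_search`), and the names `Layer`, `layer`, and `greedy_descent` are hypothetical.

```rust
use std::collections::HashMap;

/// One layer of the graph: node id -> neighbor ids.
type Layer = HashMap<usize, Vec<usize>>;

/// Build an undirected layer from a list of edges.
fn layer(edges: &[(usize, usize)]) -> Layer {
    let mut g = Layer::new();
    for &(a, b) in edges {
        g.entry(a).or_default().push(b);
        g.entry(b).or_default().push(a);
    }
    g
}

/// Squared Euclidean distance; sufficient for ranking neighbors.
fn dist2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Greedy descent: on each layer, hop to the neighbor closest to the query
/// until no neighbor improves, then continue from that node one layer down.
/// `layers[0]` is the sparsest (top) layer; the last entry is the dense layer 0.
fn greedy_descent(layers: &[Layer], points: &[Vec<f32>], entry: usize, query: &[f32]) -> usize {
    let mut current = entry;
    for layer in layers {
        loop {
            let mut best = current;
            if let Some(neighbors) = layer.get(&current) {
                for &n in neighbors {
                    if dist2(&points[n], query) < dist2(&points[best], query) {
                        best = n;
                    }
                }
            }
            if best == current {
                break; // local minimum on this layer
            }
            current = best;
        }
    }
    current
}
```

Each upper layer moves the walk close to the target region in a few hops, so the dense bottom layer only needs a short local search.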

#![allow(unused)]
fn main() {
use laurus::vector::HnswOption;
use laurus::vector::core::distance::DistanceMetric;

let opt = HnswOption {
    dimension: 384,
    distance: DistanceMetric::Cosine,
    m: 16,                  // max connections per node per layer
    ef_construction: 200,   // search width during index building
    ..Default::default()
};
}

HNSW Parameters

| Parameter | Default | Description | Impact |
|---|---|---|---|
| m | 16 | Max bi-directional connections per layer | Higher = better recall, more memory |
| ef_construction | 200 | Search width during index building | Higher = better recall, slower build |
| dimension | 128 | Vector dimensions | Must match embedder output |
| distance | Cosine | Distance metric | See Distance Metrics below |

Tuning tips:

  • Increase m (e.g., 32 or 64) for higher recall at the cost of memory
  • Increase ef_construction (e.g., 400) for better index quality at the cost of build time
  • At search time, the ef_search parameter (set in the search request) controls the search width

IVF Index

Inverted File Index. Partitions vectors into clusters, then only searches relevant clusters.

graph TB
    Q["Query Vector"]
    Q --> C1["Cluster 1\n(centroid)"]
    Q --> C2["Cluster 2\n(centroid)"]

    C1 --> V1["vec_3"]
    C1 --> V2["vec_7"]
    C1 --> V3["vec_12"]

    C2 --> V4["vec_1"]
    C2 --> V5["vec_9"]
    C2 --> V6["vec_15"]

    style C1 fill:#f9f,stroke:#333
    style C2 fill:#f9f,stroke:#333

#![allow(unused)]
fn main() {
use laurus::vector::IvfOption;
use laurus::vector::core::distance::DistanceMetric;

let opt = IvfOption {
    dimension: 384,
    distance: DistanceMetric::Cosine,
    n_clusters: 100,   // number of clusters
    n_probe: 10,       // clusters to search at query time
    ..Default::default()
};
}

IVF Parameters

| Parameter | Default | Description | Impact |
|---|---|---|---|
| n_clusters | 100 | Number of Voronoi cells | More clusters = faster search, lower recall |
| n_probe | 1 | Clusters to search at query time | Higher = better recall, slower search |
| dimension | (required) | Vector dimensions | Must match embedder output |
| distance | Cosine | Distance metric | See Distance Metrics below |

Tuning tips:

  • Set n_clusters to roughly sqrt(n) where n is the number of vectors
  • Set n_probe to 5-20% of n_clusters for a good recall/speed trade-off
  • IVF requires a training phase to cluster the vectors, so initial indexing may be slower
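
The two sizing rules of thumb above translate into a small helper. This is only a sketch of the arithmetic: `suggest_ivf_params` is a hypothetical function, not part of Laurus, and its results are starting points to validate against your own recall and latency measurements.

```rust
/// Rule-of-thumb IVF sizing.
/// Returns (n_clusters, n_probe) for a collection of `n_vectors`,
/// probing `probe_fraction` of the clusters (for example 0.05 to 0.20).
fn suggest_ivf_params(n_vectors: usize, probe_fraction: f64) -> (usize, usize) {
    // n_clusters ~ sqrt(n), with at least one cluster.
    let n_clusters = ((n_vectors as f64).sqrt().round() as usize).max(1);
    // n_probe is a fraction of n_clusters, kept within 1..=n_clusters.
    let n_probe = ((n_clusters as f64 * probe_fraction).round() as usize).clamp(1, n_clusters);
    (n_clusters, n_probe)
}
```

For one million vectors and a 10% probe fraction this gives 1000 clusters and 100 probes; the resulting values would then be set as `n_clusters` and `n_probe` in `IvfOption`.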

Distance Metrics

| Metric | Description | Range | Best For |
|---|---|---|---|
| Cosine | 1 - cosine similarity | [0, 2] | Text embeddings (most common) |
| Euclidean | L2 distance | [0, +inf) | Spatial data |
| Manhattan | L1 distance | [0, +inf) | Feature vectors |
| DotProduct | Negative inner product | (-inf, +inf) | Pre-normalized vectors |
| Angular | Angular distance | [0, pi] | Directional similarity |

#![allow(unused)]
fn main() {
use laurus::vector::core::distance::DistanceMetric;

let metric = DistanceMetric::Cosine;      // Default for text
let metric = DistanceMetric::Euclidean;    // For spatial data
let metric = DistanceMetric::Manhattan;    // L1 distance
let metric = DistanceMetric::DotProduct;   // For pre-normalized vectors
let metric = DistanceMetric::Angular;      // Angular distance
}

Note: For cosine similarity, vectors are automatically L2-normalized before indexing. For every metric, a lower distance means a more similar vector.
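
The definitions in the table can be written out directly. The functions below are a plain-Rust sketch of the standard formulas, not the Laurus implementation; the function names are hypothetical, and the code assumes non-zero input vectors of equal length.

```rust
/// Inner product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine similarity, clamped to [-1, 1] to absorb rounding error.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    (dot(a, b) / (dot(a, a).sqrt() * dot(b, b).sqrt())).clamp(-1.0, 1.0)
}

/// Cosine: 1 - cosine similarity, in [0, 2].
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    1.0 - cosine_similarity(a, b)
}

/// Euclidean: L2 distance.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

/// Manhattan: L1 distance.
fn manhattan(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

/// DotProduct: negated so that a larger inner product gives a smaller distance.
fn dot_product(a: &[f32], b: &[f32]) -> f32 {
    -dot(a, b)
}

/// Angular: the angle between the vectors in radians, in [0, pi].
fn angular(a: &[f32], b: &[f32]) -> f32 {
    cosine_similarity(a, b).acos()
}
```

Negating the inner product is what lets DotProduct share the "lower is more similar" convention with the other metrics.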

Quantization

Quantization reduces memory usage by compressing vectors at the cost of some accuracy:

| Method | Enum Variant | Description | Memory Reduction |
|---|---|---|---|
| Scalar 8-bit | Scalar8Bit | Scalar quantization to 8-bit integers | ~4x |
| Product Quantization | ProductQuantization { subvector_count } | Splits vectors into sub-vectors and quantizes each | ~16-64x |

#![allow(unused)]
fn main() {
use laurus::vector::HnswOption;
use laurus::vector::core::quantization::QuantizationMethod;

let opt = HnswOption {
    dimension: 384,
    quantizer: Some(QuantizationMethod::Scalar8Bit),
    ..Default::default()
};
}
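
To show where the roughly 4x saving of scalar 8-bit quantization comes from, here is a minimal sketch that maps each 32-bit float to one byte. It is an illustration only: the `Quantized` struct, the function names, and the per-vector min/max range are assumptions, and Laurus may compute its ranges differently.

```rust
/// A vector compressed to one byte per dimension, plus the range needed to decode it.
struct Quantized {
    min: f32,
    scale: f32, // width of one quantization step
    codes: Vec<u8>,
}

/// Map each component linearly from [min, max] onto the integers 0..=255.
fn quantize(v: &[f32]) -> Quantized {
    let min = v.iter().copied().fold(f32::INFINITY, f32::min);
    let max = v.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    // Guard against a zero-width range (constant or empty vectors).
    let scale = if max > min { (max - min) / 255.0 } else { 1.0 };
    let codes: Vec<u8> = v.iter().map(|x| ((x - min) / scale).round() as u8).collect();
    Quantized { min, scale, codes }
}

/// Reconstruct an approximation of the original vector.
fn dequantize(q: &Quantized) -> Vec<f32> {
    q.codes.iter().map(|&c| q.min + c as f32 * q.scale).collect()
}
```

Each component shrinks from 4 bytes to 1, and the reconstruction error per component is bounded by about half a quantization step, which is the accuracy cost mentioned above.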

Segment Files

Each vector index type stores its data in a single segment file:

| Index Type | File Extension | Contents |
|---|---|---|
| HNSW | .hnsw | Graph structure, vectors, and metadata |
| Flat | .flat | Raw vectors and metadata |
| IVF | .ivf | Cluster centroids, assigned vectors, and metadata |

Code Example

use std::sync::Arc;
use laurus::{Document, Engine, Schema};
use laurus::lexical::TextOption;
use laurus::vector::HnswOption;
use laurus::vector::core::distance::DistanceMetric;
use laurus::storage::memory::MemoryStorage;

#[tokio::main]
async fn main() -> laurus::Result<()> {
    let storage = Arc::new(MemoryStorage::new(Default::default()));
    let schema = Schema::builder()
        .add_text_field("title", TextOption::default())
        .add_hnsw_field("embedding", HnswOption {
            dimension: 384,
            distance: DistanceMetric::Cosine,
            m: 16,
            ef_construction: 200,
            ..Default::default()
        })
        .build();

    // With an embedder, text in vector fields is automatically embedded.
    // `my_embedder` is a placeholder: construct your embedder before this point.
    let engine = Engine::builder(storage, schema)
        .embedder(my_embedder)
        .build()
        .await?;

    // Add text to the vector field; it will be embedded automatically
    engine.add_document("doc-1", Document::builder()
        .add_text("title", "Rust Programming")
        .add_text("embedding", "Rust is a systems programming language.")
        .build()
    ).await?;

    engine.commit().await?;

    Ok(())
}

Next Steps