Architecture

Iris is built on a unified modular architecture where the Engine serves as the core orchestrator.

1. Engine (Unified)

The primary engine associated with the library. It unifies vector similarity search with full-text search capabilities.

Orchestration: Manages both VectorStore (HNSW/IVF/Flat index) and LexicalStore (Inverted Index).
Hybrid Search: Performs unified queries combining vector similarity and keyword relevance.
ID Management: Manages external ID to internal integer ID mapping.

2. LexicalStore (Component)

Operates as a component managed by the Engine, handling full-text search.

Inverted Index: Standard posting lists for term lookups.
Analyzers: Tokenization and normalization pipeline.
Query Parser: Supports boolean, phrase, and structured queries.

3. VectorStore (Component)

Operates as a component managed by the Engine, handling vector similarity search.

Vector Index: Supports HNSW, IVF, and Flat index types.
Embedder: Automatic text/image to vector embedding.
Distance Metrics: Cosine, Euclidean, and DotProduct similarity.

graph TD
    subgraph "Application Layer"
        User[User / App]
        Req[SearchRequest]
    end

    subgraph "Iris Engine"
        E[Engine]

        subgraph "Components"
            VS[VectorStore]
            LS[LexicalStore]
            DS[DocumentStore]
            WAL[Write-Ahead Log]
        end

        Fusion[Result Fusion]
    end

    subgraph "Storage Layer"
        FS[FileStorage / Mmap]
    end

    %% Flows
    User -->|index/search| E
    E --> VS
    E --> LS
    E --> DS
    E --> WAL

    LS --> FS
    VS --> FS
    DS --> FS
    WAL --> FS

    %% Search Flow
    Req --> E
    E -->|Vector Query| VS
    E -->|Keyword Query| LS

    VS -->|Hits| Fusion
    LS -->|Hits| Fusion

    Fusion -->|Unified Results| User

Storage Layer

All components abstract their storage through a Storage trait, allowing seamless switching between:

Memory: For testing and ephemeral data.
File: For persistent on-disk storage.
Mmap: For high-performance memory-mapped file access.

Component Structure

Each store follows a simplified 4-member structure pattern:

#![allow(unused)]
fn main() {
pub struct LexicalStore {
    index: Box<dyn LexicalIndex>,
    writer_cache: Mutex<Option<Box<dyn LexicalIndexWriter>>>,
    searcher_cache: RwLock<Option<Box<dyn LexicalIndexSearcher>>>,
    doc_store: Arc<RwLock<UnifiedDocumentStore>>,
}

pub struct VectorStore {
    index: Box<dyn VectorIndex>,
    writer_cache: Mutex<Option<Box<dyn VectorIndexWriter>>>,
    searcher_cache: RwLock<Option<Box<dyn VectorIndexSearcher>>>,
    doc_store: Arc<RwLock<UnifiedDocumentStore>>,
}
}

This pattern provides:

Lazy Initialization: Writers and searchers are created on-demand.
Cache Invalidation: Searcher cache is invalidated after commit/optimize.
Shared Document Store: Both stores share the same document storage.