WASM Binding Overview

The laurus-wasm package provides WebAssembly bindings for the Laurus search engine. It enables lexical, vector, and hybrid search directly in browsers and edge runtimes (Cloudflare Workers, Vercel Edge Functions, Deno Deploy) without a server.

Features

  • Lexical Search – Full-text search powered by an inverted index with BM25 scoring
  • Vector Search – Approximate nearest neighbor (ANN) search using Flat, HNSW, or IVF indexes
  • Hybrid Search – Combine lexical and vector results with fusion algorithms (RRF, WeightedSum)
  • Rich Query DSL – Term, Phrase, Fuzzy, Wildcard, NumericRange, Geo, Boolean, Span queries
  • Text Analysis – Tokenizers, filters, and synonym expansion
  • In-memory Storage – Fast ephemeral indexes
  • OPFS Persistence – Indexes survive page reloads via the Origin Private File System
  • TypeScript Types – Auto-generated .d.ts type definitions
  • Async API – All I/O operations return Promises

Architecture

graph LR
    subgraph "laurus-wasm"
        WASM[wasm-bindgen API]
    end
    subgraph "laurus (core)"
        Engine
        MemoryStorage
    end
    subgraph "Browser"
        JS[JavaScript / TypeScript]
        OPFS[Origin Private File System]
    end
    JS --> WASM
    WASM --> Engine
    Engine --> MemoryStorage
    WASM -.->|persist| OPFS
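
The persist edge above targets the browser's Origin Private File System. For orientation, this is the standard web API behind that storage target (a generic sketch of the browser API itself, not a laurus-wasm call; the file name and byte contents are illustrative):

// Standard OPFS access from a page or worker module (not a laurus-wasm API).
// OPFS is an origin-scoped private directory whose contents survive page reloads.
const opfsRoot = await navigator.storage.getDirectory();
const fileHandle = await opfsRoot.getFileHandle("laurus-index.bin", { create: true });
const writable = await fileHandle.createWritable();
await writable.write(new Uint8Array([/* serialized index bytes */]));
await writable.close();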

Embedding Strategies

On native platforms Laurus supports several built-in embedders (Candle BERT, Candle CLIP, OpenAI API) that the engine can invoke automatically when a document is indexed or when searchVectorText("field", "query text") is called. These native embedders cannot run inside wasm32-unknown-unknown and are therefore disabled in the WASM build:

Embedder    | Dependency        | Why it cannot run in WASM
candle_bert | candle (GPU/SIMD) | Requires native SIMD intrinsics and a file system for models
candle_clip | candle            | Same as above
openai      | reqwest (HTTP)    | Requires a full async HTTP client (tokio + TLS)

(They are excluded from the WASM build via the embeddings-candle / embeddings-openai feature flags, which depend on the native feature that is disabled for wasm32-unknown-unknown.)

Instead, laurus-wasm exposes two embedder types through addEmbedder:

  • "precomputed" — The caller supplies vectors directly via putDocument() and searchVector(). The engine performs no embedding.
  • "callback" — Register a JavaScript callback embed: (text) => Promise<number[]> and the engine will invoke it during ingestion and from searchVectorText(). This enables in-engine auto-embedding using Transformers.js (or any other in-browser embedding library) so callers can use the same searchVectorText("field", "query text") pattern as on native platforms.

Option A — Precomputed vectors

Compute embeddings on the JavaScript side and pass precomputed vectors to putDocument() and searchVector():

// Using Transformers.js (all-MiniLM-L6-v2, 384-dim)
import { pipeline } from '@huggingface/transformers';

const embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

async function embed(text) {
  const output = await embedder(text, { pooling: 'mean', normalize: true });
  return Array.from(output.data);
}

// Index with precomputed embedding
const vec = await embed("Introduction to Rust");
await index.putDocument("doc1", { title: "Introduction to Rust", embedding: vec });
await index.commit();

// Search with precomputed query embedding
const queryVec = await embed("safe systems programming");
const results = await index.searchVector("embedding", queryVec);

This approach gives you real semantic search in the browser using the same sentence-transformer models available on native platforms, with the embedding computation handled by Transformers.js (ONNX Runtime Web) instead of candle.
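
Option A's snippet starts from an existing index object. A sketch of the setup it presumes, with the field declaration borrowed from Option B (whether a "precomputed" embedder must also be registered via addEmbedder, and whether the trailing addHnswField arguments can be omitted, is not stated on this page and is assumed here):

// Assumed setup for Option A: declare the 384-dim vector field and create the index.
// The addHnswField signature mirrors Option B's call; the embedder-name argument is
// omitted because the caller supplies vectors directly.
schema.addHnswField("embedding", 384, "cosine");
const index = await Index.create(schema);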

Option B — Callback embedder

Register the same Transformers.js pipeline as a "callback" embedder so that the engine can call it automatically. After registration, ingestion and searchVectorText() work transparently without the caller managing vectors:

import { pipeline } from '@huggingface/transformers';

const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

schema.addEmbedder("transformers", {
  type: "callback",
  embed: async (text) => {
    const output = await extractor(text, { pooling: 'mean', normalize: true });
    return Array.from(output.data);
  },
});
schema.addHnswField("embedding", 384, "cosine", undefined, undefined, "transformers");
const index = await Index.create(schema);

await index.putDocument("doc1", { title: "Introduction to Rust" });
await index.commit();

const results = await index.searchVectorText("embedding", "safe systems programming");

Compared to Option A, the callback approach lets the engine cache embeddings during ingestion and avoids duplicating embedding code between writers and readers. The trade-off is that every commit() waits for the JS callback to resolve, so heavy bulk ingestion may benefit from precomputing vectors.
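
When you do bulk-load with precomputed vectors, one way to apply that advice is to embed everything up front and commit once at the end. A sketch reusing the embed() helper and the API calls from Option A (the document list is illustrative):

// Illustrative bulk load: compute all embeddings first, then index and commit once,
// so commit() never waits on per-document JS callbacks.
const docs = [
  { id: "doc1", title: "Introduction to Rust" },
  { id: "doc2", title: "Ownership and borrowing" },
  { id: "doc3", title: "Fearless concurrency" },
];

const vectors = await Promise.all(docs.map((d) => embed(d.title)));

for (let i = 0; i < docs.length; i++) {
  await index.putDocument(docs[i].id, { title: docs[i].title, embedding: vectors[i] });
}
await index.commit(); // one commit for the whole batch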

When to Use laurus-wasm vs laurus-nodejs

Criterion   | laurus-wasm               | laurus-nodejs
Environment | Browser, Edge Runtime     | Node.js server
Performance | Good (single-threaded)    | Best (native, multi-threaded)
Storage     | In-memory + OPFS          | In-memory + File system
Embedding   | Precomputed + JS callback | Candle, OpenAI, Precomputed
Package     | npm install laurus-wasm   | npm install laurus-nodejs
Binary size | ~5-10 MB (WASM)           | Platform-native