Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

API Reference

Index

The primary entry point. Wraps the Laurus search engine.

class Index:
    def __init__(self, path: str | None = None, schema: Schema | None = None) -> None: ...

Constructor

ParameterTypeDefaultDescription
pathstr | NoneNoneDirectory path for persistent storage. None creates an in-memory index.
schemaSchema | NoneNoneSchema definition. An empty schema is used when omitted.

Methods

MethodDescription
put_document(id, doc)Upsert a document. Replaces all existing versions with the same ID.
add_document(id, doc)Append a document chunk without removing existing versions.
get_documents(id) -> list[dict]Return all stored versions for the given ID.
delete_documents(id)Delete all versions for the given ID.
commit()Flush buffered writes and make all pending changes searchable.
search(query, *, limit=10, offset=0) -> list[SearchResult]Execute a search query.
stats() -> dictReturn index statistics (document_count, vector_fields).

search query argument

The query parameter accepts any of the following:

  • A DSL string (e.g. "title:hello", "~\"memory safety\"")
  • A lexical query object (TermQuery, PhraseQuery, BooleanQuery, …)
  • A vector query object (VectorQuery, VectorTextQuery)
  • A SearchRequest for full control

Schema

Defines the fields and index types for an Index.

class Schema:
    def __init__(self) -> None: ...

Field methods

MethodDescription
add_text_field(name)Full-text field (inverted index, BM25).
add_int_field(name)64-bit integer field.
add_float_field(name)64-bit float field.
add_bool_field(name)Boolean field.
add_bytes_field(name)Raw bytes field.
add_geo_field(name)Geographic coordinate field (lat/lon).
add_datetime_field(name)UTC datetime field.
add_hnsw_field(name, dimension, *, distance="cosine", m=16, ef_construction=100)HNSW approximate nearest-neighbor vector field.
add_flat_field(name, dimension, *, distance="cosine")Flat (brute-force) vector field.
add_ivf_field(name, dimension, *, distance="cosine", n_clusters=100, n_probe=1)IVF approximate nearest-neighbor vector field.

Distance metrics

ValueDescription
"cosine"Cosine similarity (default)
"euclidean"Euclidean distance
"dot_product"Dot product

Query classes

TermQuery

TermQuery(field: str, term: str)

Matches documents containing the exact term in the given field.

PhraseQuery

PhraseQuery(field: str, terms: list[str])

Matches documents containing the terms in order.

FuzzyQuery

FuzzyQuery(field: str, term: str, max_edits: int = 1)

Approximate match allowing up to max_edits edit-distance errors.

WildcardQuery

WildcardQuery(field: str, pattern: str)

Pattern match. * matches any sequence of characters, ? matches any single character.

NumericRangeQuery

NumericRangeQuery(field: str, min: int | float | None, max: int | float | None)

Matches numeric values in the range [min, max]. Pass None for an open bound.

GeoQuery

GeoQuery(field: str, lat: float, lon: float, radius_km: float)

Geo-distance search. Returns documents whose (lat, lon) coordinate is within radius_km of the given point.

BooleanQuery

BooleanQuery(
    must: list[Query] | None = None,
    should: list[Query] | None = None,
    must_not: list[Query] | None = None,
)

Compound boolean query. must clauses all have to match; at least one should clause must match; must_not clauses must not match.

SpanNearQuery

SpanNearQuery(field: str, terms: list[str], slop: int = 0, in_order: bool = True)

Matches documents where the terms appear within slop positions of each other.

VectorQuery

VectorQuery(field: str, vector: list[float])

Approximate nearest-neighbor search using a pre-computed embedding vector.

VectorTextQuery

VectorTextQuery(field: str, text: str)

Converts text to an embedding at query time and runs vector search. Requires an embedder configured on the index.


SearchRequest

Full-featured search request for advanced control.

class SearchRequest:
    def __init__(
        self,
        *,
        query=None,
        lexical_query=None,
        vector_query=None,
        filter_query=None,
        fusion=None,
        limit: int = 10,
        offset: int = 0,
    ) -> None: ...
ParameterDescription
queryA DSL string or single query object. Mutually exclusive with lexical_query / vector_query.
lexical_queryLexical component for explicit hybrid search.
vector_queryVector component for explicit hybrid search.
filter_queryLexical filter applied after scoring.
fusionFusion algorithm (RRF or WeightedSum). Defaults to RRF(k=60) when both components are set.
limitMaximum number of results (default 10).
offsetPagination offset (default 0).

SearchResult

Returned by Index.search().

class SearchResult:
    id: str          # External document identifier
    score: float     # Relevance score
    document: dict | None  # Retrieved field values, or None if deleted

Fusion algorithms

RRF

RRF(k: float = 60.0)

Reciprocal Rank Fusion. Merges lexical and vector result lists by rank position. k is a smoothing constant; higher values reduce the influence of top-ranked results.

WeightedSum

WeightedSum(lexical_weight: float = 0.5, vector_weight: float = 0.5)

Normalises both score lists independently, then combines them as lexical_weight * lexical_score + vector_weight * vector_score.


Text analysis

SynonymDictionary

class SynonymDictionary:
    def __init__(self) -> None: ...
    def add_synonym_group(self, synonyms: list[str]) -> None: ...

WhitespaceTokenizer

class WhitespaceTokenizer:
    def __init__(self) -> None: ...
    def tokenize(self, text: str) -> list[Token]: ...

SynonymGraphFilter

class SynonymGraphFilter:
    def __init__(
        self,
        dictionary: SynonymDictionary,
        keep_original: bool = True,
        boost: float = 1.0,
    ) -> None: ...
    def apply(self, tokens: list[Token]) -> list[Token]: ...

Token

class Token:
    text: str
    position: int
    position_increment: int
    position_length: int
    boost: float

Field value types

Python values are automatically converted to Laurus DataValue types:

Python typeLaurus typeNotes
NoneNull
boolBoolChecked before int
intInt64
floatFloat64
strText
bytesBytes
list[float]VectorElements coerced to f32
(lat, lon) tupleGeoTwo float values
datetime.datetimeDateTimeConverted via isoformat()