API Reference

Index

The primary entry point. Wraps the Laurus search engine.

new \Laurus\Index(?string $path = null, ?Schema $schema = null)

Constructor

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| $path | ?string | null | Directory path for persistent storage. null creates an in-memory index. |
| $schema | ?Schema | null | Schema definition. An empty schema is used when omitted. |

Methods

| Method | Description |
| --- | --- |
| putDocument(string $id, array $doc): void | Upsert a document. Replaces all existing versions with the same ID. |
| addDocument(string $id, array $doc): void | Append a document chunk without removing existing versions. |
| getDocuments(string $id): array | Return all stored versions for the given ID. |
| deleteDocuments(string $id): void | Delete all versions for the given ID. |
| commit(): void | Flush buffered writes and make all pending changes searchable. |
| search(mixed $query, int $limit = 10, int $offset = 0): array | Execute a search query. Returns an array of SearchResult. |
| stats(): array | Return index statistics ("document_count", "vector_fields"). |
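As a minimal sketch of the write path in the table above, the helper below builds an in-memory index, upserts one document, appends a second version under the same ID, and commits. It assumes the Laurus PHP extension is loaded; the function name build_demo_index() and the field names are hypothetical and chosen only for illustration.

```php
<?php
// Sketch only: requires the Laurus PHP extension. build_demo_index() and the
// field names "title" / "body" are hypothetical.
function build_demo_index(): \Laurus\Index
{
    $schema = new \Laurus\Schema();
    $schema->addTextField('title');
    $schema->addTextField('body');

    // A null path creates an in-memory index.
    $index = new \Laurus\Index(null, $schema);

    // putDocument() replaces every existing version stored under "doc1".
    $index->putDocument('doc1', ['title' => 'Hello', 'body' => 'First chunk']);
    // addDocument() appends another version without removing the first.
    $index->addDocument('doc1', ['title' => 'Hello', 'body' => 'Second chunk']);

    // Pending writes become searchable only after commit().
    $index->commit();

    return $index;
}
```

Going by the method descriptions, build_demo_index()->getDocuments('doc1') should then return both versions.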

search query argument

The $query parameter accepts any of the following:

  • A DSL string (e.g. "title:hello", "embedding:\"memory safety\"")
  • A lexical query object (TermQuery, PhraseQuery, BooleanQuery, …)
  • A vector query object (VectorQuery, VectorTextQuery)
  • A SearchRequest for full control
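The accepted query forms can be sketched side by side as follows. This assumes the Laurus PHP extension is loaded and that $index has a text field named "title"; both the helper name and the field name are hypothetical. The three calls are not claimed to return identical results, since a DSL string may be analysed differently from an exact TermQuery.

```php
<?php
// Sketch only: requires the Laurus PHP extension. run_search_forms() is a
// hypothetical helper; "title" is an assumed field name.
function run_search_forms(\Laurus\Index $index): array
{
    // 1. DSL string
    $byDsl = $index->search('title:hello', 5);

    // 2. Lexical query object
    $byObject = $index->search(new \Laurus\TermQuery('title', 'hello'), 5);

    // 3. SearchRequest carrying the query plus its own limit and offset
    $request = new \Laurus\SearchRequest(
        new \Laurus\TermQuery('title', 'hello'), // $query
        null,                                    // $lexicalQuery
        null,                                    // $vectorQuery
        null,                                    // $filterQuery
        null,                                    // $fusion
        5,                                       // $limit
        0                                        // $offset
    );
    $byRequest = $index->search($request);

    return [$byDsl, $byObject, $byRequest];
}
```

Each element of the returned arrays is a SearchResult, so getId() and getScore() can be called on it.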

Schema

Defines the fields and index types for an Index.

new \Laurus\Schema()

Field methods

| Method | Description |
| --- | --- |
| addTextField(string $name, bool $stored = true, bool $indexed = true, bool $termVectors = false, ?string $analyzer = null): void | Full-text field (inverted index, BM25). |
| addIntegerField(string $name, bool $stored = true, bool $indexed = true, bool $multi_valued = false): void | 64-bit integer field. Pass $multi_valued = true to accept arrays of integers (range queries match if any value satisfies the predicate). |
| addFloatField(string $name, bool $stored = true, bool $indexed = true, bool $multi_valued = false): void | 64-bit float field. Pass $multi_valued = true to accept arrays of floats (range queries match if any value satisfies the predicate). |
| addBooleanField(string $name, bool $stored = true, bool $indexed = true): void | Boolean field. |
| addBytesField(string $name, bool $stored = true): void | Raw bytes field. |
| addGeoField(string $name, bool $stored = true, bool $indexed = true): void | Geographic coordinate field (lat/lon). |
| addGeo3dField(string $name, bool $stored = true, bool $indexed = true): void | 3D ECEF Cartesian point field (x, y, z in metres). See Geo3d concepts. |
| addDatetimeField(string $name, bool $stored = true, bool $indexed = true): void | UTC datetime field. |
| addHnswField(string $name, int $dimension, ?string $distance = "cosine", int $m = 16, int $efConstruction = 200, ?string $embedder = null): void | HNSW approximate nearest-neighbor vector field. |
| addFlatField(string $name, int $dimension, ?string $distance = "cosine", ?string $embedder = null): void | Flat (brute-force) vector field. |
| addIvfField(string $name, int $dimension, ?string $distance = "cosine", int $nClusters = 100, int $nProbe = 1, ?string $embedder = null): void | IVF approximate nearest-neighbor vector field. |

Other methods

| Method | Description |
| --- | --- |
| addEmbedder(string $name, array $config): void | Register a named embedder definition. $config is an associative array with a "type" key (see below). |
| setDefaultFields(array $fieldNames): void | Set the default fields used when no field is specified in a query. $fieldNames is an array of strings. |
| setDynamicFieldPolicy(string $policy): void | Set how undeclared fields are handled. $policy is "strict", "dynamic" (default), or "ignore". See notes below. |
| dynamicFieldPolicy(): string | Return the current policy as a lowercase string. |
| fieldNames(): array | Return the list of field names defined in this schema. |

Dynamic field policy

Controls what happens when a document is ingested with field names that are not declared in the schema:

  • "strict": Reject the document.
  • "dynamic" (default): Infer a type for each undeclared field and add it to the schema. Warning: integer fields silently truncate incoming float values (for example, 3.14 becomes 3). Use "strict" if you need to reject such type mismatches.
  • "ignore": Silently drop the undeclared fields.

See Schema & Fields for the full behaviour matrix.
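A minimal sketch of opting into the "strict" policy is shown below. It assumes the Laurus PHP extension is loaded; strict_schema() and the field names are hypothetical.

```php
<?php
// Sketch only: requires the Laurus PHP extension. strict_schema() and the
// field names "title" / "year" are hypothetical.
function strict_schema(): \Laurus\Schema
{
    $schema = new \Laurus\Schema();
    $schema->addTextField('title');
    $schema->addIntegerField('year');

    // Reject documents that carry fields other than "title" and "year".
    $schema->setDynamicFieldPolicy('strict');

    return $schema;
}
```

Per the method table, strict_schema()->dynamicFieldPolicy() should return "strict". How a rejected document is reported to the caller is not stated in this section.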

Embedder types

| "type" | Required keys | Feature flag |
| --- | --- | --- |
| "precomputed" | | (always available) |
| "candle_bert" | "model" | embeddings-candle |
| "candle_clip" | "model" | embeddings-multimodal |
| "openai" | "model" | embeddings-openai |

Distance metrics

| Value | Description |
| --- | --- |
| "cosine" | Cosine similarity (default) |
| "euclidean" | Euclidean distance |
| "dot_product" | Dot product |
| "manhattan" | Manhattan distance |
| "angular" | Angular distance |
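For orientation, the textbook definitions behind four of these metrics can be written in plain PHP as below. The helper names are hypothetical and not part of Laurus, and the engine may normalise or convert these values differently internally. "angular" is left out because its exact definition is not given here.

```php
<?php
// Textbook definitions, for illustration only. Both input arrays are assumed
// to be non-empty, of equal length, and (for cosine) non-zero.
function dot_product(array $a, array $b): float
{
    return (float) array_sum(array_map(fn ($x, $y) => $x * $y, $a, $b));
}

function cosine_similarity(array $a, array $b): float
{
    return dot_product($a, $b) / (sqrt(dot_product($a, $a)) * sqrt(dot_product($b, $b)));
}

function euclidean_distance(array $a, array $b): float
{
    return sqrt(array_sum(array_map(fn ($x, $y) => ($x - $y) ** 2, $a, $b)));
}

function manhattan_distance(array $a, array $b): float
{
    return (float) array_sum(array_map(fn ($x, $y) => abs($x - $y), $a, $b));
}
```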

Query classes

TermQuery

new \Laurus\TermQuery(string $field, string $term)

Matches documents containing the exact term in the given field.

PhraseQuery

new \Laurus\PhraseQuery(string $field, array $terms)

Matches documents containing the terms in order. $terms is an array of strings.

FuzzyQuery

new \Laurus\FuzzyQuery(string $field, string $term, int $maxEdits = 2)

Approximate match allowing up to $maxEdits edit-distance errors.
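To get a feel for what $maxEdits = 2 admits, edit distance can be checked with PHP's built-in levenshtein(). This is only an illustration: within_max_edits() is a hypothetical helper, levenshtein() counts bytes rather than characters, and Laurus may count edits differently (for example, by treating a transposition as a single edit).

```php
<?php
// Illustration of edit distance using PHP's built-in levenshtein().
// within_max_edits() is hypothetical and not part of Laurus.
function within_max_edits(string $indexed, string $query, int $maxEdits = 2): bool
{
    return levenshtein($indexed, $query) <= $maxEdits;
}

var_dump(within_max_edits('search', 'serach'));   // two substitutions: bool(true)
var_dump(within_max_edits('search', 'sandwich')); // five edits apart: bool(false)
```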

WildcardQuery

new \Laurus\WildcardQuery(string $field, string $pattern)

Pattern match. * matches any sequence of characters, ? matches any single character.
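The two wildcards can be mimicked with a regular expression, which is a convenient way to check a pattern before sending it to the index. wildcard_matches() is a hypothetical helper, not part of Laurus, and it assumes the pattern language has no escape character.

```php
<?php
// Sketch of the wildcard semantics above: * matches any sequence of
// characters (including none), ? matches exactly one character.
function wildcard_matches(string $pattern, string $term): bool
{
    // Escape all regex metacharacters, then restore the two wildcards.
    $regex = strtr(preg_quote($pattern, '/'), ['\*' => '.*', '\?' => '.']);

    // s: "." also matches newlines; u: treat input as UTF-8 characters.
    return preg_match('/^' . $regex . '\z/su', $term) === 1;
}
```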

NumericRangeQuery

new \Laurus\NumericRangeQuery(string $field, mixed $min, mixed $max, ?string $numericType = "integer")

Matches numeric values in the range [$min, $max]. Pass null for an open bound. Set $numericType to "integer" or "float".

GeoQuery

// Radius search
\Laurus\GeoQuery::withinRadius(string $field, float $lat, float $lon, float $distanceKm): GeoQuery

// Bounding box search
\Laurus\GeoQuery::withinBoundingBox(string $field, float $minLat, float $minLon, float $maxLat, float $maxLon): GeoQuery

withinRadius returns documents whose coordinate is within $distanceKm of the given point. withinBoundingBox returns documents within the specified bounding box.
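When choosing a value for $distanceKm, it can help to compute a reference distance first. The sketch below uses the haversine formula on a spherical Earth of radius 6371 km; haversine_km() is a hypothetical helper, and the distance formula Laurus uses internally is not specified here, so results near the radius boundary may differ.

```php
<?php
// Great-circle distance via the haversine formula, assuming a spherical
// Earth of radius 6371 km. Not part of the Laurus binding.
function haversine_km(float $lat1, float $lon1, float $lat2, float $lon2): float
{
    $dLat = deg2rad($lat2 - $lat1);
    $dLon = deg2rad($lon2 - $lon1);
    $a = sin($dLat / 2) ** 2
        + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * sin($dLon / 2) ** 2;

    // min() guards against rounding pushing sqrt($a) slightly above 1.
    return 2 * 6371.0 * asin(min(1.0, sqrt($a)));
}
```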

Geo3dDistanceQuery

\Laurus\Geo3dDistanceQuery::withinSphere(
    string $field,
    float $x, float $y, float $z,
    float $radiusM,
): Geo3dDistanceQuery

Sphere search over a 3D ECEF point field. Returns documents whose (x, y, z) coordinate is within $radiusM metres of the centre. See Geo3d concepts for ECEF theory.
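Because the query takes ECEF coordinates rather than latitude and longitude, a conversion step is usually needed. The following is a sketch of the standard WGS 84 geodetic-to-ECEF conversion; geodetic_to_ecef() is a hypothetical helper and not part of the Laurus binding.

```php
<?php
// Standard WGS 84 geodetic-to-ECEF conversion. Returns [x, y, z] in metres.
function geodetic_to_ecef(float $latDeg, float $lonDeg, float $heightM = 0.0): array
{
    $a  = 6378137.0;           // WGS 84 semi-major axis (m)
    $f  = 1.0 / 298.257223563; // WGS 84 flattening
    $e2 = $f * (2.0 - $f);     // first eccentricity squared

    $lat = deg2rad($latDeg);
    $lon = deg2rad($lonDeg);
    // Prime vertical radius of curvature at this latitude.
    $n = $a / sqrt(1.0 - $e2 * sin($lat) ** 2);

    return [
        ($n + $heightM) * cos($lat) * cos($lon),
        ($n + $heightM) * cos($lat) * sin($lon),
        ($n * (1.0 - $e2) + $heightM) * sin($lat),
    ];
}
```

The resulting triple can then be passed as $x, $y, $z to Geo3dDistanceQuery::withinSphere() together with a radius in metres.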

Geo3dBoundingBoxQuery

\Laurus\Geo3dBoundingBoxQuery::withinBox(
    string $field,
    float $minX, float $minY, float $minZ,
    float $maxX, float $maxY, float $maxZ,
): Geo3dBoundingBoxQuery

Axis-aligned 3D bounding-box search.

Geo3dNearestQuery

\Laurus\Geo3dNearestQuery::kNearest(
    string $field,
    float $x, float $y, float $z,
    int $k,
): Geo3dNearestQuery

k-nearest-neighbour search over a 3D ECEF point field. The initial_radius_m / max_radius_m tuning parameters of the core query are not yet exposed in the PHP binding — see #344 for parity across bindings.

BooleanQuery

$bq = new \Laurus\BooleanQuery();
$bq->must($query);
$bq->should($query);
$bq->mustNot($query);

Compound boolean query. must clauses all have to match; at least one should clause must match; mustNot clauses must not match.
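A compound query combining the three clause types might look like the sketch below. It assumes the Laurus PHP extension is loaded; the helper name and the field names "body" and "status" are hypothetical.

```php
<?php
// Sketch only: requires the Laurus PHP extension. rust_not_draft_query()
// and the field names are hypothetical.
function rust_not_draft_query(): \Laurus\BooleanQuery
{
    $bq = new \Laurus\BooleanQuery();

    // Required: the body contains the term "rust".
    $bq->must(new \Laurus\TermQuery('body', 'rust'));
    // Optional clause: the phrase "memory safety" in the body.
    $bq->should(new \Laurus\PhraseQuery('body', ['memory', 'safety']));
    // Excluded: documents whose status is "draft".
    $bq->mustNot(new \Laurus\TermQuery('status', 'draft'));

    return $bq;
}
```

The returned object can be passed directly to Index->search() or used as a component of a SearchRequest.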

SpanQuery

// Single term
\Laurus\SpanQuery::term(string $field, string $term): SpanQuery

// Near: terms within slop positions
\Laurus\SpanQuery::near(string $field, array $terms, int $slop = 0, bool $ordered = true): SpanQuery

// Containing: big span contains little span
\Laurus\SpanQuery::containing(string $field, SpanQuery $big, SpanQuery $little): SpanQuery

// Within: include span within exclude span at max distance
\Laurus\SpanQuery::within(string $field, SpanQuery $include, SpanQuery $exclude, int $distance): SpanQuery

Positional / proximity span queries. near takes an array of term strings.

VectorQuery

new \Laurus\VectorQuery(string $field, array $vector)

Approximate nearest-neighbor search using a pre-computed embedding vector. $vector is an array of floats.

VectorTextQuery

new \Laurus\VectorTextQuery(string $field, string $text)

Converts $text to an embedding at query time and runs vector search. Requires an embedder configured on the index.


SearchRequest

Full-featured search request for advanced control.

new \Laurus\SearchRequest(
    mixed $query = null,
    mixed $lexicalQuery = null,
    mixed $vectorQuery = null,
    mixed $filterQuery = null,
    mixed $fusion = null,
    int $limit = 10,
    int $offset = 0,
)

| Parameter | Description |
| --- | --- |
| $query | A DSL string or single query object. Mutually exclusive with $lexicalQuery / $vectorQuery. |
| $lexicalQuery | Lexical component for explicit hybrid search. |
| $vectorQuery | Vector component for explicit hybrid search. |
| $filterQuery | Lexical filter applied after scoring. |
| $fusion | Fusion algorithm (RRF or WeightedSum). Defaults to RRF(k: 60) when both components are set. |
| $limit | Maximum number of results (default 10). |
| $offset | Pagination offset (default 0). |
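An explicit hybrid request can be sketched as follows, using positional arguments in the order of the constructor signature. It assumes the Laurus PHP extension is loaded; hybrid_request() and the field names "body" and "embedding" are hypothetical, and $queryVector must match the dimension of the vector field.

```php
<?php
// Sketch only: requires the Laurus PHP extension. hybrid_request() and the
// field names are hypothetical.
function hybrid_request(array $queryVector): \Laurus\SearchRequest
{
    return new \Laurus\SearchRequest(
        null,                                               // $query (unused here)
        new \Laurus\TermQuery('body', 'rust'),              // $lexicalQuery
        new \Laurus\VectorQuery('embedding', $queryVector), // $vectorQuery
        null,                                               // $filterQuery
        new \Laurus\WeightedSum(0.3, 0.7),                  // $fusion
        5,                                                  // $limit
        0                                                   // $offset
    );
}
```

Passing null for $fusion instead would fall back to the documented default, RRF with k = 60.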

SearchResult

Returned by Index->search().

$result->getId()        // string   -- External document identifier
$result->getScore()     // float    -- Relevance score
$result->getDocument()  // array|null -- Retrieved field values, or null if deleted

Fusion algorithms

RRF

new \Laurus\RRF(float $k = 60.0)

Reciprocal Rank Fusion. Merges lexical and vector result lists by rank position. $k is a smoothing constant; higher values reduce the influence of top-ranked results.
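The effect of $k is easiest to see in a small standalone computation. The sketch below implements textbook RRF, where each document scores the sum of 1 / (k + rank) over the lists it appears in, with 1-based ranks. rrf_fuse() is a hypothetical helper; the exact ranking convention inside Laurus is not documented here.

```php
<?php
// Textbook Reciprocal Rank Fusion with 1-based ranks. Not part of Laurus.
function rrf_fuse(array $rankedLists, float $k = 60.0): array
{
    $scores = [];
    foreach ($rankedLists as $ids) {
        foreach (array_values($ids) as $i => $id) {
            // $i is 0-based, so the rank is $i + 1.
            $scores[$id] = ($scores[$id] ?? 0.0) + 1.0 / ($k + $i + 1);
        }
    }
    arsort($scores); // highest fused score first

    return $scores;
}

$fused = rrf_fuse([
    ['a', 'b', 'c'], // lexical ranking
    ['b', 'c', 'a'], // vector ranking
]);
print_r(array_keys($fused)); // b, a, c
```

Here "b" wins because it is ranked second and first, while "a" is ranked first and third.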

WeightedSum

new \Laurus\WeightedSum(float $lexicalWeight = 0.5, float $vectorWeight = 0.5)

Normalises both score lists independently, then combines them as $lexicalWeight * lexical_score + $vectorWeight * vector_score.
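The combination step can be sketched in plain PHP as below. The normalisation method is an assumption: this section does not say which one Laurus applies, so min-max scaling is used here purely for illustration. Both helpers are hypothetical.

```php
<?php
// Sketch assuming min-max normalisation; not part of Laurus.
// Inputs map document IDs to raw scores.
function min_max_normalise(array $scores): array
{
    if ($scores === []) {
        return [];
    }
    $min = min($scores);
    $range = max($scores) - $min;

    // If all scores are equal, map them to 1.0 to avoid dividing by zero.
    return array_map(fn ($s) => $range > 0 ? ($s - $min) / $range : 1.0, $scores);
}

function weighted_sum(array $lexical, array $vector, float $lw = 0.5, float $vw = 0.5): array
{
    $lex = min_max_normalise($lexical);
    $vec = min_max_normalise($vector);

    $fused = [];
    foreach (array_keys($lex + $vec) as $id) {
        // A document missing from one list contributes 0 for that component.
        $fused[$id] = $lw * ($lex[$id] ?? 0.0) + $vw * ($vec[$id] ?? 0.0);
    }
    arsort($fused); // highest combined score first

    return $fused;
}
```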


Text analysis

SynonymDictionary

$dict = new \Laurus\SynonymDictionary();
$dict->addSynonymGroup(["fast", "quick", "rapid"]);

A dictionary of synonym groups. All terms in a group are treated as synonyms of each other.

WhitespaceTokenizer

$tokenizer = new \Laurus\WhitespaceTokenizer();
$tokens = $tokenizer->tokenize("hello world");

Splits text on whitespace boundaries and returns an array of Token objects.

SynonymGraphFilter

$filter = new \Laurus\SynonymGraphFilter($dictionary, true, 1.0);
$expanded = $filter->apply($tokens);

Token filter that expands tokens with their synonyms from a SynonymDictionary.

Token

$token->getText()               // string  -- The token text
$token->getPosition()           // int     -- Position in the token stream
$token->getStartOffset()        // int     -- Character start offset in the original text
$token->getEndOffset()          // int     -- Character end offset in the original text
$token->getBoost()              // float   -- Score boost factor (1.0 = no adjustment)
$token->isStopped()             // bool    -- Whether removed by a stop filter
$token->getPositionIncrement()  // int     -- Difference from the previous token's position
$token->getPositionLength()     // int     -- Number of positions spanned

Field value types

PHP values are automatically converted to Laurus DataValue types:

| PHP type | Laurus type | Notes |
| --- | --- | --- |
| null | Null | |
| true / false | Bool | |
| int | Int64 | |
| float | Float64 | |
| string | Text | |
| array of numerics | Vector | Elements coerced to f32 |
| array with "lat", "lon" | Geo | Two float values |
| string (ISO 8601) | DateTime | Parsed from ISO 8601 format |
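As an illustration of these conversions, the sketch below stores one document that exercises most rows of the table. It assumes the Laurus PHP extension is loaded; put_typed_document() and all field names are hypothetical. How the binding decides between Text and DateTime for a string is not stated here, so the datetime line is an assumption.

```php
<?php
// Sketch only: requires the Laurus PHP extension. put_typed_document() and
// the field names are hypothetical.
function put_typed_document(\Laurus\Index $index): void
{
    $index->putDocument('place-1', [
        'name'      => 'Tokyo Tower',                         // string -> Text
        'height_m'  => 333,                                   // int -> Int64
        'rating'    => 4.5,                                   // float -> Float64
        'open'      => true,                                  // bool -> Bool
        'location'  => ['lat' => 35.6586, 'lon' => 139.7454], // -> Geo
        'embedding' => [0.1, 0.2, 0.3],                       // numerics -> Vector (f32)
        'opened_at' => '1958-12-23T00:00:00Z',                // ISO 8601 -> DateTime (assumed)
        'closed_at' => null,                                  // null -> Null
    ]);
    $index->commit();
}
```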