# Master Data Query

This document defines the cross-target query model exposed over master data collections. The user-visible model is uniform across every target; per-target call shape, naming, and threading slots are documented in [codegen/golang.md](../codegen/golang.md), [codegen/typescript.md](../codegen/typescript.md), and [codegen/csharp.md](../codegen/csharp.md).

## Relation

Every master decomposes at the codegen level into a **Record** (the value type for one row) and a **Relation** (the chainable query surface). The relation is the only user-facing model for querying records.

A relation is a data-less, immutable value: it carries the source-level master identity plus an ordered list of staged operations (predicates, orderings, skip, take). It does not own records. The active record set is supplied at terminal execution time through a [`MasterData`](schema.md#runtime-model) value resolved by the per-target mechanism (Go: `From(ctx)`; C#: `data` parameter; TypeScript: `data` parameter).

Every relation is reached through a per-master entrypoint named after the master:

| Target | Entrypoint |
| --- | --- |
| Go | a package-level value `Items` of type `ItemsRelation` |
| C# | a static class field `Items` of type `ItemsRelation` |
| TypeScript | a module-level constant `items` of type `ItemsRelation` |

Authoring a query never names `MasterData` directly. The user starts from the entrypoint, applies zero or more stages, and finishes with a terminal that resolves the data and returns the result.

```mst
for item in Items.toList() { ... }
```

In the source program, master iteration uses the surface method `toList()` (see [Iteration](#iteration)). Codegen lowers the call onto the per-target relation terminal that returns the materialised record list.

## Query Plan

The relation carries an inspectable **QueryPlan** that records:

- the source master identifier (so a backend can dispatch the plan onto its own table);
- an ordered list of predicates appended by `Where`;
- an ordered list of orderings — the first set by `OrderBy`, subsequent ones appended by `ThenBy`;
- a non-negative skip count, defaulting to zero;
- a take limit, where a negative value means unlimited.

The plan is part of the runtime model; the source program never names it. The runtime types that back the plan are:

- `QueryPlan[R]` (or the target equivalent) — the plan value type, parametrised by the record type R.
- `Predicate[R]` — an interface backed by exported concrete struct types (`EqPredicate`, `NePredicate`, `LtPredicate`, `LePredicate`, `GtPredicate`, `GePredicate`, `InPredicate`, `BetweenPredicate`, `BoolEqPredicate`, `BoolNePredicate`, `BoolInPredicate`, `AndPredicate`, `OrPredicate`, `NotPredicate`). Each struct exposes its operator-relevant metadata (`FieldRef`, comparison value, operands) as named fields so a backend can translate the node to SQL without invoking the per-record accessor.
- `Ordering[R]` — an interface backed by `AscOrdering` and `DescOrdering` carrying the field reference and direction.
- `FieldRef` — a structural pointer to a record field by its source-level name. Backends translate `FieldRef.Name` into their column naming convention.
- Per-master field handles (`OrderedField[R, V]` / `BoolField[R]`) — typed constructors that produce predicate and ordering nodes against a stable field reference.

The structural shape is identical across targets. Concrete language-level names may differ (`IPredicate<R>` on C#, `Predicate<R>` on TypeScript, `Predicate[R]` on Go), but the operator vocabulary and the field-metadata contract are the same.

The in-memory executor is one consumer of this plan; future backends (indexed JSON, SQLite) are additional consumers of the same plan. The plan stays structurally inspectable so a backend can translate predicates to SQL without ever evaluating the per-record accessor closures the in-memory executor uses.

The user program does not see the plan unless it deliberately type-switches on the runtime nodes for testing or debugging. Generated public types are relation-shaped; the internal plan types are reserved for runtime / backend use.
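As a rough illustration of the structural contract above, the following Go sketch shows a plan whose predicate nodes can be rendered by type-switching on their concrete types. It is not the generated runtime: `MasterRef`, the `Match` method, the unexported `get` accessor, the `describe` helper, and the exact field layout of `QueryPlan` are assumptions made for this example.

```go
package main

import "fmt"

// FieldRef identifies a record field by its source-level name.
type FieldRef struct{ Name string }

// MasterRef is a hypothetical holder for the source master identifier.
type MasterRef struct{ Name string }

// Predicate is the node interface. Match is an assumed name for the
// per-record evaluation hook used by the in-memory executor.
type Predicate[R any] interface{ Match(r R) bool }

// LtPredicate exposes its metadata as named fields; get is the accessor
// closure that only the in-memory executor calls.
type LtPredicate[R any] struct {
	Field FieldRef
	Value int
	get   func(R) int
}

func (p LtPredicate[R]) Match(r R) bool { return p.get(r) < p.Value }

// NotPredicate negates its single operand.
type NotPredicate[R any] struct{ Operand Predicate[R] }

func (p NotPredicate[R]) Match(r R) bool { return !p.Operand.Match(r) }

// QueryPlan records the staged operations for one master.
type QueryPlan[R any] struct {
	Source     MasterRef
	Predicates []Predicate[R] // appended by Where
	Skip       int            // non-negative, defaults to zero
	Take       int            // negative means unlimited
}

// describe renders a node by type-switching on its concrete type.
// It reads only the named fields and never calls Match or get.
func describe[R any](p Predicate[R]) string {
	switch n := p.(type) {
	case LtPredicate[R]:
		return fmt.Sprintf("%s < %d", n.Field.Name, n.Value)
	case NotPredicate[R]:
		return "NOT (" + describe[R](n.Operand) + ")"
	default:
		return "<unknown>"
	}
}

type Item struct{ Count int }

// countLt builds a predicate node against the hypothetical count field.
func countLt(v int) Predicate[Item] {
	return LtPredicate[Item]{
		Field: FieldRef{Name: "count"},
		Value: v,
		get:   func(i Item) int { return i.Count },
	}
}

func main() {
	plan := QueryPlan[Item]{
		Source:     MasterRef{Name: "Items"},
		Predicates: []Predicate[Item]{NotPredicate[Item]{Operand: countLt(10)}},
		Take:       -1,
	}
	for _, p := range plan.Predicates {
		fmt.Println(plan.Source.Name, describe[Item](p))
	}
}
```

The same node answers both consumers: `Match` serves the in-memory path, while `describe` stands in for a backend that reads only the metadata.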

## Stage Operations

Stage operations build up the query plan. Every stage returns a new relation; the receiver is never mutated.

| Operation | Effect on the plan |
| --- | --- |
| `Where(predicate)` | Appends `predicate` to the predicate list. Multiple `Where` calls accumulate as a conjunction. |
| `OrderBy(ordering)` | Replaces the ordering list with `[ordering]`, making `ordering` the primary sort key. A later `OrderBy` discards any orderings staged before it; tie-breakers compose through `ThenBy`. |
| `ThenBy(ordering)` | Appends `ordering` to the existing ordering list; the existing keys win, the appended key breaks ties. |
| `Skip(n)` | Sets the skip count to `n`. Applied after predicates and ordering. |
| `Take(n)` | Sets the take count to `n`. Applied after `Skip`. `Take(0)` yields an empty result. A negative value is the per-target "no limit" sentinel. |
| `Select<Name>()` | Switches to a projected relation. The source-side plan flows into the projection; predicates and orderings added after the call are typed on the projected record. See [Projections](#projections). |
| `Join<Field>(right)` | Switches to a joined relation. Each surviving source record is matched against the supplied right relation via primary-key lookup; the call returns a pair relation. See [Joins](#joins). |

Stage operations are pure: they record intent but do not read the dataset. They are synchronous and return immediately.

### Copy-on-write

Relations are copy-on-write. A base relation can be shared across independent chains without aliasing:

```go
base := Items.Where(ItemsFields.Count.Ge(10))
a := base.Take(1)
b := base.Take(2)
```

`a`, `b`, and `base` are independent values: each terminal sees only the stages explicitly chained onto its receiver, and stages applied to one do not affect the others. The same property holds across every target.
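One way such a relation could be implemented is sketched below. This is a minimal model, not the generated code: the `plan` struct, the use of plain strings for predicates, and the copy strategy inside `Where` are all assumptions for illustration.

```go
package main

import "fmt"

// plan is a stand-in for the staged query plan; predicates are plain
// strings here to keep the sketch short.
type plan struct {
	predicates []string
	take       int // negative means unlimited
}

// ItemsRelation is an immutable value: every stage returns a new relation.
type ItemsRelation struct{ p plan }

// Where copies the predicate list before appending, so relations derived
// from the same base never share a backing array.
func (r ItemsRelation) Where(pred string) ItemsRelation {
	next := r.p
	next.predicates = append(append([]string(nil), r.p.predicates...), pred)
	return ItemsRelation{p: next}
}

// Take sets the limit on a copy of the plan. The predicate slice header is
// shared, which is safe because no stage writes into an existing slice.
func (r ItemsRelation) Take(n int) ItemsRelation {
	next := r.p
	next.take = n
	return ItemsRelation{p: next}
}

// Items is the base entrypoint with an empty plan.
var Items = ItemsRelation{p: plan{take: -1}}

func main() {
	base := Items.Where("count >= 10")
	a := base.Where("kind == 1").Take(1)
	b := base.Where("kind == 2").Take(2)
	fmt.Println(base.p.predicates, base.p.take)
	fmt.Println(a.p.predicates, a.p.take)
	fmt.Println(b.p.predicates, b.p.take)
}
```

Copying the slice in `Where` is what prevents two chains that extend the same base from overwriting each other's appended predicate.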

## Terminal Operations

Terminal operations resolve the active dataset, execute the plan, and return a result. Every terminal carries the `asyncable`, `cancellable`, and `failable` effects so backends that can fail surface the failure through the per-target failable transport (Go: trailing `error`; TypeScript: thrown exception; C#: thrown exception).

| Operation | Result |
| --- | --- |
| `ToSlice` / `ToList` / `toArray` | The full filtered, sorted, skipped, taken record sequence as a list. This is the canonical list terminal; an unfiltered relation returns every record. |
| `Iter` / `AsAsyncEnumerable` | A streaming iterator over the same sequence (Go: `iter.Seq2[Record, error]`; C#: `IAsyncEnumerable<Record>`). TypeScript exposes `toArray` only; an iterator API may follow in a later phase. |
| `FindBy` / `findBy` | Primary-key lookup. Applies the relation's predicate chain to the matched record so a filter chained before `FindBy` excludes records that would otherwise match the key. Skip / Take / OrderBy stages do not affect `FindBy`. Returns the per-target "no match" sentinel when no record satisfies both the key and the filter chain. |
| `FirstOrDefault` / `firstOrDefault` | The first record of the same sequence as `ToSlice`, paired with the per-target no-match sentinel (Go `(record, false, nil)`; TS `undefined`; C# `null`). |
| `Count` / `count` | The cardinality of the sequence as an integer. |
| `Any` / `any` | Boolean: true when the sequence is non-empty. |

Terminal execution is observational: it does not re-import data, does not mutate the relation, and observes the same record set across calls within one run.

The legacy `All` / `all` terminal is no longer part of the documented API. The list terminal (`ToSlice` / `ToList` / `toArray`) is the only documented way to materialise every record.

### Iterator Semantics

The iterator terminals consume the same plan as the list terminal and yield records one at a time:

- The iterator stops cleanly after the last record.
- A backend failure surfaces as one (zero record, error) pair before the iterator stops, so the loop body can observe and propagate the error.
- A relation whose plan resolves successfully but produces no records yields zero pairs and stops.

The in-memory backend materialises the result before yielding; future backends are free to stream record by record as long as the iterator's external contract holds.
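The iterator contract above can be sketched in Go as follows. The local `Seq2` type has the same shape as `iter.Seq2[Item, error]` but is declared here so the example does not depend on the `iter` package; `resolve` and `errBackend` are hypothetical stand-ins for plan execution and a backend failure.

```go
package main

import (
	"errors"
	"fmt"
)

type Item struct{ ID int }

// Seq2 has the same shape as iter.Seq2[Item, error].
type Seq2 func(yield func(Item, error) bool)

// errBackend is a hypothetical backend failure.
var errBackend = errors.New("backend unavailable")

// Iter wraps resolve, an assumed stand-in for executing the plan.
// Like the in-memory backend, it materialises before yielding.
func Iter(resolve func() ([]Item, error)) Seq2 {
	return func(yield func(Item, error) bool) {
		records, err := resolve()
		if err != nil {
			yield(Item{}, err) // exactly one (zero record, error) pair
			return
		}
		for _, r := range records {
			if !yield(r, nil) {
				return // the consumer stopped early
			}
		}
	}
}

func main() {
	healthy := Iter(func() ([]Item, error) { return []Item{{ID: 1}, {ID: 2}}, nil })
	healthy(func(it Item, err error) bool {
		fmt.Println(it.ID, err)
		return true
	})

	failing := Iter(func() ([]Item, error) { return nil, errBackend })
	failing(func(it Item, err error) bool {
		fmt.Println(it.ID, err)
		return true
	})
}
```

On Go 1.23 or later a value of this shape can also be consumed with `for item, err := range seq`; the explicit callback form is used here only to keep the sketch toolchain-neutral.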

### FindBy and Filters

`FindBy` honours every `Where` predicate already chained onto the relation but ignores `OrderBy`, `ThenBy`, `Skip`, and `Take` because they cannot change which row is identified by the primary key. The lookup proceeds as:

1. Scan the master's records for a row whose primary-key fields all equal the supplied arguments.
2. If no row matches, return the per-target no-match sentinel.
3. Apply the relation's predicate list to the matched row. If any predicate fails, return the per-target no-match sentinel.
4. Otherwise, return the matched row.
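The four steps can be sketched in Go as below. The sketch assumes a single-column integer primary key and represents predicates as plain closures; the `findBy` helper and its `(Item, bool)` result are illustrative, and the error slot of the real terminal is left out.

```go
package main

import "fmt"

type Item struct {
	ID    int
	Count int
}

// findBy follows the lookup steps for a single-column primary key.
// The bool result plays the role of the no-match sentinel.
func findBy(records []Item, predicates []func(Item) bool, id int) (Item, bool) {
	for _, r := range records {
		if r.ID != id {
			continue // step 1: keep scanning for the key
		}
		for _, pred := range predicates {
			if !pred(r) {
				return Item{}, false // step 3: a chained filter rejects the row
			}
		}
		return r, true // step 4: key and filters both satisfied
	}
	return Item{}, false // step 2: no row has this key
}

func main() {
	records := []Item{{ID: 1, Count: 5}, {ID: 2, Count: 50}}
	atLeastTen := []func(Item) bool{func(i Item) bool { return i.Count >= 10 }}

	fmt.Println(findBy(records, atLeastTen, 2)) // key matches, filter passes
	fmt.Println(findBy(records, atLeastTen, 1)) // key matches, filter rejects
	fmt.Println(findBy(records, nil, 1))        // no filter chained
}
```

Ordering, skip, and take never appear in the helper's signature, which mirrors the rule that those stages cannot change which row a key identifies.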

Projected and joined relations do not expose `FindBy`: their output records do not have a primary-key concept.

## Field Builder

Predicates and orderings are constructed through a per-master typed handle that exposes one entry per supported record field.

- Go uses a package-level `<Master>Fields` variable: `ItemsFields.Count.Ge(10)`.
- C# passes a callback typed on the same `<Master>Fields` class: `Items.Where(item => item.Count.Ge(10))`.
- TypeScript passes a callback typed on the same `<Master>Fields` interface: `items.where(item => item.count.ge(10))`.

A field handle is not a record value. Its purpose is to construct predicate and ordering nodes that carry the field reference, the operator name, and the operand value.

The operator vocabulary, uniform across targets:

| Method | Applies to | Builds |
| --- | --- | --- |
| `Eq(value)` / `eq(value)` | any handle | `<field> == <value>` |
| `Ne(value)` / `ne(value)` | any handle | `<field> != <value>` |
| `Lt(value)` / `lt(value)` | ordered handles only | `<field> < <value>` |
| `Le(value)` / `le(value)` | ordered handles only | `<field> <= <value>` |
| `Gt(value)` / `gt(value)` | ordered handles only | `<field> > <value>` |
| `Ge(value)` / `ge(value)` | ordered handles only | `<field> >= <value>` |
| `In(values...)` / `in(values...)` | any handle | `<field> ∈ {values...}` |
| `Between(low, high)` / `between(low, high)` | ordered handles only | inclusive range `low <= <field> <= high` |
| `Asc()` / `asc()` | ordered handles only | ascending ordering |
| `Desc()` / `desc()` | ordered handles only | descending ordering |

A bool field handle exposes only `Eq`, `Ne`, and `In`; calling an ordering or range method on a bool handle is a compile-time error because bool is comparable but not ordered.

The combinators `And`, `Or`, `Not` (or their per-target equivalents `Predicates.And` etc.) compose predicates within a single record type. Mixing record types across a combinator is a compile-time error.
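To show how the ordered / bool split could yield compile-time errors in Go, here is a hedged sketch of typed field handles. The `ordered` constraint, the two-parameter predicate structs, the `Ref` and `get` fields, and the anonymous-struct layout of `ItemsFields` are assumptions for this example rather than the generated shapes.

```go
package main

import "fmt"

// ordered lists the value types this sketch treats as ordered.
type ordered interface {
	~int | ~int64 | ~float64 | ~string
}

type FieldRef struct{ Name string }

// GePredicate carries the field reference, the operand, and the accessor.
type GePredicate[R any, V ordered] struct {
	Field FieldRef
	Value V
	get   func(R) V
}

func (p GePredicate[R, V]) Match(r R) bool { return p.get(r) >= p.Value }

// BetweenPredicate is an inclusive range check.
type BetweenPredicate[R any, V ordered] struct {
	Field     FieldRef
	Low, High V
	get       func(R) V
}

func (p BetweenPredicate[R, V]) Match(r R) bool {
	v := p.get(r)
	return p.Low <= v && v <= p.High
}

type BoolEqPredicate[R any] struct {
	Field FieldRef
	Value bool
	get   func(R) bool
}

func (p BoolEqPredicate[R]) Match(r R) bool { return p.get(r) == p.Value }

// OrderedField offers comparison and range constructors.
type OrderedField[R any, V ordered] struct {
	Ref FieldRef
	get func(R) V
}

func (f OrderedField[R, V]) Ge(v V) GePredicate[R, V] {
	return GePredicate[R, V]{Field: f.Ref, Value: v, get: f.get}
}

func (f OrderedField[R, V]) Between(lo, hi V) BetweenPredicate[R, V] {
	return BetweenPredicate[R, V]{Field: f.Ref, Low: lo, High: hi, get: f.get}
}

// BoolField declares no Ge or Between, so calling them does not compile.
type BoolField[R any] struct {
	Ref FieldRef
	get func(R) bool
}

func (f BoolField[R]) Eq(v bool) BoolEqPredicate[R] {
	return BoolEqPredicate[R]{Field: f.Ref, Value: v, get: f.get}
}

type Item struct {
	Count  int
	Active bool
}

// ItemsFields exposes one handle per record field.
var ItemsFields = struct {
	Count  OrderedField[Item, int]
	Active BoolField[Item]
}{
	Count:  OrderedField[Item, int]{Ref: FieldRef{Name: "count"}, get: func(i Item) int { return i.Count }},
	Active: BoolField[Item]{Ref: FieldRef{Name: "active"}, get: func(i Item) bool { return i.Active }},
}

func main() {
	p := ItemsFields.Count.Between(10, 20)
	fmt.Println(p.Field.Name, p.Low, p.High, p.Match(Item{Count: 20}))
	q := ItemsFields.Active.Eq(true)
	fmt.Println(q.Field.Name, q.Match(Item{Active: false}))
}
```

Because the record type `R` is a type parameter of every node, a combinator typed on one `R` would reject operands built from another master's handles at compile time.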

## Source-Level Relation Queries

The stage operators above are directly usable only at the codegen / runtime level: a Masterbelt source program cannot write `Items.Where(...)` at an arbitrary call site, and the only relation terminal reachable from source is `toList()`. The exception is a master's [scope section](schema.md#scope-section): inside a scope body the relation **stage** operators are available as source-level methods on the master's `Relation<M>`, so a scope can name a reusable query fragment that the rest of the program applies by calling the scope.

`Relation<M>` is the source-level type of the query surface for master `M`. It is a data-less, immutable, copy-on-write value carrying the staged plan described under [Query Plan](#query-plan); a scope receives one as `self`, threads stages onto it, and returns one. The master's surface name is the base relation entrypoint, so `Records` denotes the base `Relation<Records>` from which `Records.adult()` and `Records.gendered(1)` start. `Relation<M>` is distinct from `Record<M>` and from `list<Record<M>>`: a record is one row, a list is a materialised sequence, and a relation is the unresolved query.

### Source-Level Stage Operators

A scope body reaches these stage methods on `Relation<M>`. Each returns a fresh `Relation<M>`; the receiver is never mutated.

| Method | Effect on the plan |
| --- | --- |
| `self.where(fn(row) => <predicate>)` | Appends the predicate. Multiple `where` calls accumulate as a conjunction. |
| `self.orderBy(fn(row) => <ordering>)` | Sets the primary ordering key, replacing any existing ordering list. |
| `self.thenBy(fn(row) => <ordering>)` | Appends a tie-breaker ordering key. |
| `self.skip(n)` | Sets the skip count. |
| `self.take(n)` | Sets the take limit. |

The user-declared scopes of `M` surface as additional methods on `Relation<M>` (see [schema.md#calling-and-chaining](schema.md#scope-section)). The source-level surface deliberately omits the terminal operators (`toList`, `findBy`, `count`, …) and the projection / join switches: those carry effects or change the record type, neither of which a scope may do. `skip` and `take` keep the same runtime/codegen semantics as the corresponding plan stages.

### Query Callbacks

`where`, `orderBy`, and `thenBy` take a callback written with the ordinary `fn(row) => …` surface syntax, but the checker and lowering treat the callback as a **query predicate / ordering DSL**, not as an ordinary function value:

- `row` is a field-handle view of the record `M`, not a record value. `row.<field>` resolves to the per-field handle for that field; a reference to a field the record does not declare is reported as `masterbelt.checker.scope_unknown_field`.
- A `where` callback must produce a predicate; an `orderBy` / `thenBy` callback must produce an ordering.
- The callback may reference scope parameters and module-level user functions / statics / consts. Such references are runtime parameters of the resulting predicate. A referenced function / static / const that carries `failable` / `cancellable` / `asyncable` is reported as `masterbelt.checker.scope_forbidden_effect`; the callback itself never inherits those effects.
- No new callback syntax is introduced — the `fn(...) => ...` spelling is reused.

### Field-Handle Operators

`row.<field>` exposes the same operator vocabulary as the codegen [Field Builder](#field-builder), using the lowercase source spelling:

| Method | Applies to | Builds |
| --- | --- | --- |
| `eq(value)` / `ne(value)` | any handle | equality / inequality predicate |
| `lt` / `le` / `gt` / `ge` `(value)` | ordered handles only | comparison predicate |
| `in(values...)` | any handle | membership predicate |
| `between(low, high)` | ordered handles only | inclusive range predicate |
| `asc()` / `desc()` | ordered handles only | ascending / descending ordering |

A bool handle exposes only `eq`, `ne`, and `in`; an ordering or range method on a bool handle is a compile-time error. The comparison method names are exactly `eq` / `ne` / `lt` / `le` / `gt` / `ge` / `in` / `between` — `gteq` is not used. Predicates compose through the function-form combinators `and(a, b)` / `or(a, b)` / `not(a)`.

```mst
scope youngAdults(maxAge: int) {
  return self
    .where(fn(row) => and(row.age.ge(20), row.age.le(maxAge)))
    .orderBy(fn(row) => row.age.asc())
    .thenBy(fn(row) => row.name.asc())
}
```

### Evaluation Model

A scope is a lazy query-plan builder. A scope call does not scan records: it returns a relation that carries the accumulated plan. The records are touched only when a terminal (such as `toList()`) resolves the relation against the active dataset, and they are always the master's **final, post-filter** records — a scope never runs against source/pre-filter records. A scope chain inlines into a single plan: `self.adult().gendered(g)` is equivalent to applying `adult`'s stages then `gendered`'s stages to one relation. See [evaluation.md](../language/evaluation.md#scope-evaluation).
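The laziness and inlining described above can be modelled in Go roughly as follows. This is a sketch, not how codegen lowers scopes: the `Relation` struct, the thresholds inside `adult` and `gendered`, and the `scans` counter (added only to make dataset reads observable) are hypothetical.

```go
package main

import "fmt"

type Record struct {
	Age    int
	Gender int
}

// Relation carries only staged predicates; it owns no records.
type Relation struct{ preds []func(Record) bool }

// where returns a new relation with one more predicate.
func (r Relation) where(p func(Record) bool) Relation {
	return Relation{preds: append(append([]func(Record) bool(nil), r.preds...), p)}
}

// adult and gendered stand in for user-declared scopes.
func (r Relation) adult() Relation {
	return r.where(func(x Record) bool { return x.Age >= 20 })
}

func (r Relation) gendered(g int) Relation {
	return r.where(func(x Record) bool { return x.Gender == g })
}

// scans counts dataset reads so the laziness is observable.
var scans int

// toList is the only step in this sketch that reads the dataset.
func (r Relation) toList(data []Record) []Record {
	scans++
	var out []Record
	for _, rec := range data {
		keep := true
		for _, p := range r.preds {
			if !p(rec) {
				keep = false
				break
			}
		}
		if keep {
			out = append(out, rec)
		}
	}
	return out
}

func main() {
	data := []Record{{Age: 30, Gender: 1}, {Age: 15, Gender: 1}, {Age: 40, Gender: 2}}
	q := Relation{}.adult().gendered(1) // one plan holding both scopes' stages
	fmt.Println(len(q.preds), scans)    // nothing has been scanned yet
	fmt.Println(q.toList(data), scans)  // the terminal performs the single scan
}
```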

### Validation Interaction

A master's [validation](validation.md) `all` rule binds the post-filter collection to `table` (and its alias `self`). That binding is a relation, so an `all` rule may call the master's scopes on `table` (or `self`). A validation `each` rule binds `row` to a `Record<M>`, which has no relation surface, so scopes are not callable from an `each` rule. Scopes remain unavailable from filter rule bodies; a master static body may call scopes through the relation entrypoint.

## Projections

A `select Name { ... }` section ([schema.md#select-section](schema.md#select-section)) emits a projected relation type alongside the source relation. The chain `Items.Select<Name>()` returns a relation typed on the projected record; `Where`, `OrderBy`, `ThenBy`, `Skip`, `Take` on the projected relation operate on the projected record type.

Projection runs at terminal time after the source-side plan has filtered, sorted, skipped, and taken the source records. The runtime copies the named fields into a fresh projected record for each surviving source record, then applies the projected plan to the resulting list. The relation's underlying record set is never reshaped — projection is a view that materialises on each terminal call.

Projected relations expose the same terminals as source relations except for `FindBy`. They do not redeclare the input record's primary-key columns, so a primary-key lookup is not defined.
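The terminal-time ordering (source plan first, then field copy, then projected plan) can be sketched in Go as below. `ItemLabel` is a hypothetical projected record, and the three helpers are simplified stand-ins for the source-side plan, the projection step, and the projected plan.

```go
package main

import "fmt"

type Item struct {
	ID    int
	Name  string
	Count int
}

// ItemLabel is a hypothetical projected record produced by a select section.
type ItemLabel struct {
	ID   int
	Name string
}

// runSource stands in for the source-side plan: filter, then take.
func runSource(items []Item, minCount, take int) []Item {
	var out []Item
	for _, it := range items {
		if it.Count >= minCount {
			out = append(out, it)
		}
	}
	if take >= 0 && take < len(out) {
		out = out[:take]
	}
	return out
}

// project copies the named fields into a fresh record per surviving row.
func project(src []Item) []ItemLabel {
	out := make([]ItemLabel, 0, len(src))
	for _, it := range src {
		out = append(out, ItemLabel{ID: it.ID, Name: it.Name})
	}
	return out
}

// runProjected stands in for the projected plan, typed on ItemLabel.
func runProjected(rows []ItemLabel, keep func(ItemLabel) bool) []ItemLabel {
	var out []ItemLabel
	for _, r := range rows {
		if keep(r) {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	items := []Item{{1, "axe", 12}, {2, "bow", 3}, {3, "cap", 30}}
	rows := runProjected(
		project(runSource(items, 10, -1)),
		func(r ItemLabel) bool { return r.Name != "axe" },
	)
	fmt.Println(rows)       // [{3 cap}]
	fmt.Println(len(items)) // 3
}
```

The source slice is only read, never rewritten, which matches the rule that projection is a view over an unchanged record set.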

## Joins

A `ref<Target>` field on a master's record emits a joined relation alongside the source relation ([schema.md#join-operator](schema.md#join-operator)). The chain `B.Join<Field>(A)` returns a relation typed on a `(Left, Right)` pair record where `Left` is the source record and `Right` is the target record.

Phase 4 implements **INNER JOIN** semantics: a left record whose ref does not resolve against the right relation is dropped from the pair sequence. The right relation is supplied at the call site as a relation value (typically the package-level relation for the target master).

Source-side stages flow into the join: `B.Where(...).OrderBy(...).JoinARecord(A)` runs the source-side filter / sort first, then for each surviving left record looks up the matched right record through the right relation's primary-key lookup. Pair-level stages added after the `Join<Field>` call apply to the pair sequence.

Joined relations expose the same terminals as source relations except for `FindBy`.
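The inner-join behaviour described above could look like this in Go. The record types, the `ARef` field name, and the `findA` / `innerJoin` helpers are hypothetical; the sketch assumes the left records have already passed the source-side filter and sort.

```go
package main

import "fmt"

// A is the target master's record.
type A struct {
	ID    int
	Label string
}

// B refers to A through its ARef field (a hypothetical field name).
type B struct {
	ID   int
	ARef int
}

// Pair is the joined record.
type Pair struct {
	Left  B
	Right A
}

// findA stands in for the right relation's primary-key lookup.
func findA(rights []A, id int) (A, bool) {
	for _, a := range rights {
		if a.ID == id {
			return a, true
		}
	}
	return A{}, false
}

// innerJoin pairs each left record with its right match and drops left
// records whose ref does not resolve. Left order is preserved.
func innerJoin(lefts []B, rights []A) []Pair {
	var out []Pair
	for _, b := range lefts {
		if a, ok := findA(rights, b.ARef); ok {
			out = append(out, Pair{Left: b, Right: a})
		}
	}
	return out
}

func main() {
	rights := []A{{ID: 1, Label: "one"}, {ID: 2, Label: "two"}}
	lefts := []B{{ID: 10, ARef: 1}, {ID: 11, ARef: 99}, {ID: 12, ARef: 2}}
	for _, p := range innerJoin(lefts, rights) {
		fmt.Println(p.Left.ID, p.Right.Label)
	}
}
```

The left record with `ARef: 99` has no counterpart and therefore produces no pair.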

## Iteration

The Masterbelt surface program iterates master records through the `toList()` method:

```mst
for item in Items.toList() {
  use(item.id, item.name)
}
```

`Items.toList()` is the source-level surface form. Codegen lowers the call onto each target's relation list terminal (Go `Items.ToSlice(ctx)`; C# `Items.ToList(data, cancellationToken)`; TypeScript `items.toArray(data, signal)`). When the iteration appears inside the owning master's own static body, the receiver is the surrounding relation method's `r` parameter so a filter chained onto `r` before reaching the iteration is honoured.

The method carries the `asyncable`, `cancellable`, and `failable` effects per [Effect Inheritance](../language/types.md#effect-inheritance); the source program never has to acknowledge the inheritance. Iteration over a master is observational: a subsequent call returns the same record set as the first call within one run.

The legacy `all()` method name is no longer supported. Source programs migrating from the previous API must rename `Master.all()` to `Master.toList()`.

A master's [validation](validation.md) `all` rule binds the master's post-filter record collection to `table` (and to its alias `self`). The rule iterates it the same way, with `for row in table`, which is equivalent to iterating `table.toList()`; a rule that needs another master's post-filter records reaches them through `Master.toList()`. The only source-level relation **terminal** is `toList()`: `findBy` and the other terminals stay codegen/runtime-level. The relation **stage** operators (`where`, `orderBy`, `thenBy`, `skip`, `take`) are reachable from source only inside a [scope body](#source-level-relation-queries), where they build a relation plan rather than execute it.

## Backend Abstraction

The runtime separates the plan from its executor. Today the only executor is in-memory: it resolves records through `MasterData`, evaluates each predicate's accessor closure against every candidate row, sorts the survivors, and applies skip / take.
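A minimal Go sketch of that in-memory pipeline follows. It assumes predicates and orderings are plain closures and that records arrive as a slice; the `plan` struct and `execute` function are illustrative names, not the runtime's.

```go
package main

import (
	"fmt"
	"sort"
)

type Item struct {
	ID    int
	Count int
}

// plan is a simplified stand-in for the staged query plan.
type plan struct {
	predicates []func(Item) bool
	orderings  []func(a, b Item) int // negative, zero, or positive
	skip, take int                   // a negative take means unlimited
}

// execute filters, sorts, skips, and takes, in that order.
func execute(records []Item, p plan) []Item {
	var out []Item
	for _, r := range records {
		keep := true
		for _, pred := range p.predicates {
			if !pred(r) {
				keep = false
				break
			}
		}
		if keep {
			out = append(out, r)
		}
	}
	// Earlier orderings win; later ones only break ties.
	sort.SliceStable(out, func(i, j int) bool {
		for _, cmp := range p.orderings {
			if c := cmp(out[i], out[j]); c != 0 {
				return c < 0
			}
		}
		return false
	})
	if p.skip >= len(out) {
		return nil
	}
	out = out[p.skip:]
	if p.take >= 0 && p.take < len(out) {
		out = out[:p.take]
	}
	return out
}

func main() {
	records := []Item{{1, 5}, {2, 30}, {3, 20}, {4, 10}}
	p := plan{
		predicates: []func(Item) bool{func(i Item) bool { return i.Count >= 10 }},
		orderings:  []func(a, b Item) int{func(a, b Item) int { return a.Count - b.Count }},
		skip:       1,
		take:       1,
	}
	fmt.Println(execute(records, p)) // [{3 20}]
}
```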

The plan is intentionally structural so a non-memory backend can consume it without executing the in-memory accessor closures. A SQLite backend (a deferred phase) would:

- inspect `QueryPlan.Source.Name` to route to the corresponding table;
- walk the predicate AST and translate each concrete node (`EqPredicate{Field, Value}`, `BetweenPredicate{Field, Low, High}`, `AndPredicate{Operands}`, ...) into SQL;
- walk the ordering AST into `ORDER BY` clauses;
- emit `OFFSET` / `LIMIT` from `Skip` / `Take`;
- execute the SQL and materialise the result into the record type.
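Purely as an illustration of the first four steps, and not as a specification of any backend, the sketch below walks a simplified plan and emits a parameterised SQL string. Everything here is assumed: the node types are non-generic, the `MasterRef` holder and the `toSQL` / `whereSQL` helpers are hypothetical, and the master and field names are used verbatim as table and column names. The `LIMIT -1` form relies on SQLite treating a negative limit as "no upper bound".

```go
package main

import (
	"fmt"
	"strings"
)

type FieldRef struct{ Name string }
type MasterRef struct{ Name string }

// Simplified, non-generic nodes carrying only structural metadata.
type Predicate interface{}
type EqPredicate struct {
	Field FieldRef
	Value any
}
type BetweenPredicate struct {
	Field     FieldRef
	Low, High any
}
type AndPredicate struct{ Operands []Predicate }

type Ordering interface{}
type AscOrdering struct{ Field FieldRef }
type DescOrdering struct{ Field FieldRef }

type QueryPlan struct {
	Source     MasterRef
	Predicates []Predicate
	Orderings  []Ordering
	Skip, Take int
}

// whereSQL renders one predicate node and collects its bind arguments.
func whereSQL(p Predicate, args *[]any) string {
	switch n := p.(type) {
	case EqPredicate:
		*args = append(*args, n.Value)
		return n.Field.Name + " = ?"
	case BetweenPredicate:
		*args = append(*args, n.Low, n.High)
		return n.Field.Name + " BETWEEN ? AND ?"
	case AndPredicate:
		parts := make([]string, 0, len(n.Operands))
		for _, op := range n.Operands {
			parts = append(parts, whereSQL(op, args))
		}
		return "(" + strings.Join(parts, " AND ") + ")"
	}
	panic(fmt.Sprintf("unsupported predicate node %T", p))
}

// toSQL renders the whole plan; top-level predicates form a conjunction.
func toSQL(p QueryPlan) (string, []any) {
	var args []any
	var b strings.Builder
	b.WriteString("SELECT * FROM " + p.Source.Name)
	if len(p.Predicates) > 0 {
		parts := make([]string, 0, len(p.Predicates))
		for _, pred := range p.Predicates {
			parts = append(parts, whereSQL(pred, &args))
		}
		b.WriteString(" WHERE " + strings.Join(parts, " AND "))
	}
	if len(p.Orderings) > 0 {
		keys := make([]string, 0, len(p.Orderings))
		for _, o := range p.Orderings {
			switch n := o.(type) {
			case AscOrdering:
				keys = append(keys, n.Field.Name+" ASC")
			case DescOrdering:
				keys = append(keys, n.Field.Name+" DESC")
			}
		}
		b.WriteString(" ORDER BY " + strings.Join(keys, ", "))
	}
	if p.Take >= 0 || p.Skip > 0 {
		fmt.Fprintf(&b, " LIMIT %d OFFSET %d", p.Take, p.Skip)
	}
	return b.String(), args
}

func main() {
	sql, args := toSQL(QueryPlan{
		Source:     MasterRef{Name: "Items"},
		Predicates: []Predicate{EqPredicate{FieldRef{"kind"}, 1}, BetweenPredicate{FieldRef{"count"}, 10, 20}},
		Orderings:  []Ordering{DescOrdering{FieldRef{"count"}}},
		Skip:       5,
		Take:       10,
	})
	fmt.Println(sql, args)
}
```

Bind arguments are collected separately from the SQL text so operand values never need to be spliced into the statement.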

The public API stays unchanged: a backend swap is invisible to the source program. Generated relation types and terminals do not depend on the executor; they construct plans and invoke the executor through a per-target dispatch seam.

This document does not specify the SQLite backend. The plan shape exists today so the future backend can land without breaking existing source programs.
