# Master Data Validation

A master's `validation` section declares data quality checks that run over the master's records after import and filtering. Unlike a [filter](schema.md#filter-section), a validator never drops a record: it inspects the post-filter dataset and emits diagnostics. Validation runs during [`masterbelt export`](../tooling/cli.md) before any artifact is written; an error-severity validation failure blocks the entire export.

## Surface Form

The validation section is an optional [master body section](schema.md#body-sections). It contains one or more scope groups; each group contains one or more named `validate` rules whose bodies assert conditions:

```mst
master Records {
  record { primary ID: int, Name: string, Value: int }

  validation {
    each {
      validate nameRequired {
        assert row.Name != ""
      }

      validate valuePositive {
        assert row.Value > 0
      }
    }

    all {
      validate checkValueSum {
        let total = 0
        for row in table {
          total = total + row.Value
        }
        assert total < 1000
      }
    }
  }
}
```

- `validate` takes a **stable identifier**, not a message string. The identifier names the validator in project configuration (see [Severity Configuration](#severity-configuration)). It must be unique within a master across both `each` and `all` groups.
- A single `validate` block may contain several `assert` statements. Each failed `assert` produces one diagnostic.
- A validation block needs no `return`. It is a statement block that "passes" when it runs to completion without a failed `assert`. `return` inside a validation block is rejected (`masterbelt.checker.return_in_validation`).
- `assert` is the validation primitive. It is only valid inside a validation block; the checker rejects `assert` elsewhere (`masterbelt.checker.assert_outside_validation`).

## Scopes

### `each`

An `each` group runs every rule once per final (post-filter) record. The current record is bound to two implicit names:

- `row` — the record.
- `self` — an alias for the same record.

Both have the master's record type. A failure on one record does not stop the rule from running on the remaining records, and a failed `assert` inside a rule does not stop later statements or asserts in the same rule.
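The collect-and-continue behavior can be modeled outside masterbelt. The following Python sketch is only an illustration of the semantics described above, not the evaluator's implementation: `Diagnostic`, `run_each_rule`, the `rowSane` identifier, and the use of the primary key's string form as the record description are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Diagnostic:
    code: str
    validator: str
    record: str     # record description (assumed: the primary key as text)
    condition: str  # source text of the failed condition


def run_each_rule(validator_id, asserts, records, key="ID"):
    """Run one `each` rule over every record and collect all failures.

    `asserts` is a list of (condition_text, predicate) pairs standing in
    for the `assert` statements of the rule body (hypothetical encoding).
    """
    diagnostics = []
    for row in records:                  # post-filter record order
        for text, predicate in asserts:  # a failure does not stop later asserts
            if not predicate(row):
                diagnostics.append(Diagnostic(
                    "masterbelt.validation.assert_failed",
                    validator_id, str(row[key]), text))
    return diagnostics


records = [
    {"ID": 1, "Name": "a", "Value": 5},
    {"ID": 2, "Name": "", "Value": 0},
    {"ID": 3, "Name": "c", "Value": 7},
]
checks = [
    ('row.Name != ""', lambda row: row["Name"] != ""),
    ("row.Value > 0", lambda row: row["Value"] > 0),
]
diags = run_each_rule("rowSane", checks, records)
for d in diags:
    print(d.validator, d.record, d.condition)
```

Record 2 fails both conditions, so the model yields two diagnostics for it, and record 3 is still checked afterwards.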

### `all`

An `all` group runs every rule once per master over the whole post-filter record collection. The collection is bound to two implicit names:

- `table` — the post-filter relation.
- `self` — an alias for the same relation.

Both have type `Relation<M>` for the surrounding master `M`. A rule iterates the collection with `for row in table` (or `for row in self`); iterating a relation yields its post-filter records in plan order. A rule may also reach other masters' post-filter records through [`Master.toList()`](query.md#iteration). Because the binding is a relation, an `all` rule may also apply the master's [scopes](schema.md#scope-section) and the relation stage operators to `table` or `self`.

## Execution Semantics

Validation runs in a deterministic order:

1. Every source record is imported.
2. Each master's [filter](schema.md#filter-section) is applied.
3. The final record set for each master is built.
4. Validators run in module and source declaration order. `each` rules run in source order, preserving the post-filter record order; `all` rules run in source order.
5. Diagnostics are returned.

A validation rule body runs in the evaluator, not in generated code: validators are a build-time contract over the data, and no validation code is emitted into any target language.
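As a rough model of the ordering above, the Python sketch below filters first, then validates the final record set, and returns that set unchanged. It is an illustration under assumptions, not masterbelt's code: `export_pipeline` and the predicate encoding are hypothetical, and the choice to run each rule over all records before moving to the next rule, with `each` rules before `all` rules, is one possible reading of step 4.

```python
def export_pipeline(source_records, keep, each_rules, all_rules):
    """Model steps 1-5: import, filter, build final set, validate, report."""
    # Steps 1-3: the filter decides the final record set.
    final = [r for r in source_records if keep(r)]
    diagnostics = []
    # Step 4 (assumed nesting): rules in source order, records in post-filter order.
    for name, predicate in each_rules:
        for row in final:
            if not predicate(row):
                diagnostics.append((name, row["ID"]))
    for name, predicate in all_rules:
        if not predicate(final):
            diagnostics.append((name, "<table>"))
    # Step 5: diagnostics are returned; validators never drop records.
    return final, diagnostics


source = [
    {"ID": 1, "Value": 400},
    {"ID": 2, "Value": -1},   # removed by the filter, never seen by validators
    {"ID": 3, "Value": 700},
]
final, diagnostics = export_pipeline(
    source,
    keep=lambda r: r["Value"] >= 0,
    each_rules=[("valuePositive", lambda row: row["Value"] > 0)],
    all_rules=[("checkValueSum",
                lambda table: sum(r["Value"] for r in table) < 1000)],
)
print([r["ID"] for r in final])
print(diagnostics)
```

The filtered record never reaches `valuePositive`, while `checkValueSum` fails over the two remaining records (400 + 700) without removing either of them.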

## Failure Severity

A failed `assert` produces a `masterbelt.validation.assert_failed` diagnostic. Its severity defaults to **error**. Project configuration can override the severity per `(master, validator)` pair to `warning`; see [Severity Configuration](#severity-configuration).

- An error-severity failure blocks the export: no artifact is written.
- A warning-severity failure is reported but does not block the export.

Records are never removed by validation: the failing record is preserved in the export (when the export proceeds).
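The blocking rule can be summarized in a few lines. This Python sketch assumes a hypothetical in-memory form of the overrides (a nested dictionary keyed by master path, then validator ID); `resolve_severity` and `export_blocked` are illustrative names, not part of masterbelt.

```python
def resolve_severity(overrides, master, validator):
    """Return the configured severity, falling back to the default `error`."""
    return overrides.get(master, {}).get(validator, "error")


def export_blocked(failures, overrides):
    """An export is blocked as soon as one failure resolves to `error`."""
    return any(
        resolve_severity(overrides, master, validator) == "error"
        for master, validator in failures
    )


overrides = {"Records": {"nameRequired": "warning"}}
warn_only = [("Records", "nameRequired")]
mixed = warn_only + [("Records", "valuePositive")]

print(export_blocked(warn_only, overrides))  # False
print(export_blocked(mixed, overrides))      # True
```

A validator with no override keeps the default `error` severity, so a single unconfigured failure is enough to block the export.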

An evaluation error inside a validation rule (an unbound reference, a runtime type error, a division by zero) is a hard error attributed to the rule, surfaced through the underlying evaluator diagnostic or wrapped as `masterbelt.validation.evaluation_failed`.

## Severity Configuration

Project configuration keys severity overrides by the **entrypoint-visible master path** and validator ID. See [tooling/configuration.md](../tooling/configuration.md#validators) for the schema. The path is the master as it is visible from the entry module — an aliased import uses the alias, and a nested master uses its dotted path — never the flattened codegen name.

```yaml
validators:
  Records:
    nameRequired: warning
    valuePositive: error
  U.Friendships:
    uniquePair: warning
```

A master is validated only when it is visible from the entry module under exactly one path:

- A master declared in the entry module, or re-exported from it by a single `pub` import, is **visible** and validated under its entrypoint path.
- A master that the entry module neither declares nor re-exports is **out of scope**: it has no config-visible name, so its validators do not run.
- A master re-exported from the entry under more than one name is **ambiguous**: no single config path identifies it, so its validators do not run and the ambiguity is reported as `masterbelt.validation.ambiguous_master`, which blocks the export. (Two distinct masters re-exported under the *same* name are rejected earlier as `masterbelt.resolver.duplicate_name`.)
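The three cases reduce to counting the distinct names under which the entry module exposes a master. The Python sketch below is a simplified model of that rule; `classify_master`, its return labels, and the `Friends` alias are assumptions for illustration, and the real resolver may track more than a list of names.

```python
def classify_master(entry_paths):
    """Classify a master by the names the entry module exposes it under.

    `entry_paths` lists every entrypoint-visible path of one master
    (its declaration in the entry module or each `pub` re-export).
    """
    distinct = set(entry_paths)
    if not distinct:
        return "out_of_scope"  # no config-visible name: validators do not run
    if len(distinct) == 1:
        return "visible"       # validated under its single entrypoint path
    return "ambiguous"         # masterbelt.validation.ambiguous_master


print(classify_master(["Records"]))
print(classify_master([]))
print(classify_master(["U.Friendships", "Friends"]))
```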

Only `error` and `warning` are accepted. Configuration is validated before validators run, so a typo is reported even when a master imported zero records:

- A master path that matches no master is `masterbelt.validation.config_unknown_master`.
- A validator ID that matches no rule under a known master is `masterbelt.validation.config_unknown_validator`.
- A severity outside `error` / `warning` is `masterbelt.validation.config_invalid_severity`.

Each of these config diagnostics blocks the export.
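The three checks can be sketched as a single pass over the `validators` mapping. This Python model is an assumption-laden illustration: `check_validators_config`, the `known` mapping of master paths to validator IDs, and the decision to report an unknown validator and an invalid severity independently are choices made for the example, not documented behavior.

```python
ALLOWED_SEVERITIES = {"error", "warning"}
PREFIX = "masterbelt.validation."


def check_validators_config(config, known):
    """Return (code, subject) pairs for every problem in a validators config.

    config: {master_path: {validator_id: severity}}
    known:  {master_path: set of validator IDs declared on that master}
    """
    diagnostics = []
    for master, rules in config.items():
        if master not in known:
            diagnostics.append((PREFIX + "config_unknown_master", master))
            continue  # validator IDs cannot be checked without a master
        for validator, severity in rules.items():
            subject = f"{master}.{validator}"
            if validator not in known[master]:
                diagnostics.append((PREFIX + "config_unknown_validator", subject))
            if severity not in ALLOWED_SEVERITIES:
                diagnostics.append((PREFIX + "config_invalid_severity", subject))
    return diagnostics


known = {"Records": {"nameRequired", "valuePositive"}}
config = {
    "Records": {
        "nameRequired": "warning",   # valid
        "valuPositive": "error",     # typo in the validator ID
        "valuePositive": "info",     # severity not accepted
    },
    "Record": {"nameRequired": "warning"},  # typo in the master path
}
problems = check_validators_config(config, known)
for code, subject in problems:
    print(code, subject)
```

Because the check needs only the declared masters and validator IDs, it does not depend on any imported records, which matches the guarantee that typos surface even for an empty master.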

## Diagnostics

| Code | Severity | Meaning |
| --- | --- | --- |
| `masterbelt.validation.assert_failed` | configured (default error) | An `assert` condition evaluated false. The span is the condition expression; args carry the master path, validator ID, scope, record description, and condition source text. |
| `masterbelt.validation.evaluation_failed` | error | A validation rule body raised an evaluation error. |
| `masterbelt.validation.config_unknown_master` | error | A `validators` config key names a master that does not exist. |
| `masterbelt.validation.config_unknown_validator` | error | A `validators` config key names a validator that does not exist on a known master. |
| `masterbelt.validation.config_invalid_severity` | error | A `validators` severity is not `error` or `warning`. |
| `masterbelt.validation.ambiguous_master` | error | A master is re-exported from the entry under more than one name, so no config path identifies it unambiguously. |

For an `each` failure, the record is described by its primary key (the same convention used by [filter exclusion diagnostics](schema.md#filter-section)); for an `all` failure, the record description is `<table>`.

## Future Work

The MVP does not implement PowerAssert-style display of sub-expression values, custom validation messages, the `info` / `hint` severities, or target-language runtime validators. The evaluator boundary already retains each failed assertion's expression and span so PowerAssert-style reporting can be added without a surface change.
