# Master Data Validation

A master's `validation` section declares data quality checks that run over the master's records after import and filtering. Unlike a [filter](schema.md#filter-section), a validator never drops a record: it inspects the post-filter dataset and emits diagnostics. Validation runs during [`masterbelt export`](../tooling/cli.md) before any artifact is written; an error-severity validation failure blocks the entire export.

## Surface Form

The validation section is an optional [master body section](schema.md#body-sections). It contains one or more scope groups; each group contains one or more named `validate` rules whose bodies assert conditions:

```mst
master Records {
    record {
        primary ID: int,
        Name: string,
        Value: int
    }

    validation {
        each {
            validate nameRequired {
                assert row.Name != ""
            }
            validate valuePositive {
                assert row.Value > 0
            }
        }
        all {
            validate checkValueSum {
                let total = 0
                for row in table {
                    total = total + row.Value
                }
                assert total < 1000
            }
        }
    }
}
```

- `validate` takes a **stable identifier**, not a message string. The identifier names the validator in project configuration (see [Severity Configuration](#severity-configuration)). It must be unique within a master across both `each` and `all` groups.
- A single `validate` block may contain several `assert` statements. Each failed `assert` produces one diagnostic.
- A validation block needs no `return`. It is a statement block that "passes" when it runs to completion. `return` inside a validation block is rejected (`masterbelt.checker.return_in_validation`).
- `assert` is the validation primitive. It is only valid inside a validation block; the checker rejects `assert` elsewhere (`masterbelt.checker.assert_outside_validation`).

## Scopes

### `each`

An `each` group runs every rule once per final record. The current record is bound to two implicit names:

- `row`: the record.
- `self`: an alias for the same record.

Both have the master's record type.

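As a rough illustration, the per-record binding can be modeled in Python. This is only a sketch, not masterbelt's evaluator: `run_each_group`, the `check` callback standing in for `assert`, and the dictionary-shaped records are all hypothetical, and the record-major loop nesting is an assumption.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Check = Callable[[bool, str], None]
# Hypothetical rule shape: the same record is passed twice, once as `row`
# and once as `self_`, mirroring the two implicit names of an `each` group.
Rule = Callable[[Record, Record, Check], None]


def run_each_group(
    records: List[Record], rules: Dict[str, Rule]
) -> List[Tuple[str, object, str]]:
    """Run every rule once per record; return (rule ID, primary key, condition)."""
    failures: List[Tuple[str, object, str]] = []
    for record in records:  # assumption: records form the outer loop
        for rule_id, rule in rules.items():  # rules in declaration order
            def check(condition: bool, source: str,
                      _id: str = rule_id, _rec: Record = record) -> None:
                # Record the failure and keep going; nothing is raised.
                if not condition:
                    failures.append((_id, _rec["ID"], source))

            rule(record, record, check)
    return failures


def name_required(row: Record, self_: Record, check: Check) -> None:
    check(row["Name"] != "", 'row.Name != ""')


def value_positive(row: Record, self_: Record, check: Check) -> None:
    check(row["Value"] > 0, "row.Value > 0")


sample_records = [
    {"ID": 1, "Name": "a", "Value": 5},
    {"ID": 2, "Name": "", "Value": 0},
]
sample_rules = {"nameRequired": name_required, "valuePositive": value_positive}
each_failures = run_each_group(sample_records, sample_rules)
```

In this model a failing condition is only recorded, so every rule still sees every record.
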
A rule that fails on one record continues to the next; a failed `assert` inside a rule does not stop later statements or asserts in the same rule.

### `all`

An `all` group runs every rule once per master over the whole post-filter record collection. The collection is bound to two implicit names:

- `table`: the post-filter relation.
- `self`: an alias for the same relation.

Both have type `Relation<M>` for the surrounding master `M`. A rule iterates the collection with `for row in table` (or `for row in self`); iterating a relation yields its post-filter records in plan order. A rule may also reach other masters' post-filter records through [`Master.toList()`](query.md#iteration). Because the binding is a relation, an `all` rule may also apply the master's [scopes](schema.md#scope-section) and the relation stage operators to `table` or `self`.

## Execution Semantics

Validation runs in a deterministic order:

1. Every source record is imported.
2. Each master's [filter](schema.md#filter-section) is applied.
3. The final record set for each master is built.
4. Validators run in module and source declaration order. `each` rules run in source order, preserving the post-filter record order; `all` rules run in source order.
5. Diagnostics are returned.

A validation rule body runs in the evaluator, not in generated code: validators are a build-time contract over the data, and no validation code is emitted into any target language.

## Failure Severity

A failed `assert` produces a `masterbelt.validation.assert_failed` diagnostic. Its severity defaults to **error**. Project configuration can override the severity per `(master, validator)` pair to `warning`; see [Severity Configuration](#severity-configuration).

- An error-severity failure blocks the export: no artifact is written.
- A warning-severity failure is reported but does not block the export.

Records are never removed by validation: the failing record is preserved in the export (when the export proceeds).

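The severity rule above can be summarized in a few lines of Python. This is an illustrative model only: `resolve_severity`, `export_is_blocked`, and the nested-dictionary shape of the overrides are assumptions made for the sketch, not part of masterbelt.

```python
from typing import Dict, List, Tuple

DEFAULT_SEVERITY = "error"  # a failed assert is an error unless overridden

# Hypothetical shape: master path -> validator ID -> severity.
Overrides = Dict[str, Dict[str, str]]


def resolve_severity(overrides: Overrides, master_path: str, validator_id: str) -> str:
    """Look up the override for one (master, validator) pair, else the default."""
    return overrides.get(master_path, {}).get(validator_id, DEFAULT_SEVERITY)


def export_is_blocked(failures: List[Tuple[str, str]], overrides: Overrides) -> bool:
    """An export is blocked as soon as any failure resolves to error severity."""
    return any(
        resolve_severity(overrides, master_path, validator_id) == "error"
        for master_path, validator_id in failures
    )


sample_overrides = {"Records": {"nameRequired": "warning"}}
only_warning = [("Records", "nameRequired")]
with_error = [("Records", "nameRequired"), ("Records", "valuePositive")]
```

Here `only_warning` would let the export proceed, while `with_error` would block it because `valuePositive` has no override and keeps the default severity.
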
An evaluation error inside a validation rule (an unbound reference, a runtime type error, a division by zero) is a hard error attributed to the rule, surfaced through the underlying evaluator diagnostic or wrapped as `masterbelt.validation.evaluation_failed`.

## Severity Configuration

Project configuration keys severity overrides by the **entrypoint-visible master path** and validator ID. See [tooling/configuration.md](../tooling/configuration.md#validators) for the schema. The path is the master as it is visible from the entry module (an aliased import uses the alias, and a nested master uses its dotted path), never the flattened codegen name.

```yaml
validators:
  Records:
    nameRequired: warning
    valuePositive: error
  U.Friendships:
    uniquePair: warning
```

A master is only validated when it is reachable from the entry module:

- A master declared in the entry module, or re-exported from it by a single `pub` import, is **visible** and validated under its entrypoint path.
- A master that the entry module neither declares nor re-exports is **out of scope**: it has no config-visible name, so its validators do not run.
- A master re-exported from the entry under more than one name is **ambiguous**: no single config path identifies it, so its validators do not run and the ambiguity is reported as `masterbelt.validation.ambiguous_master`, which blocks the export. (Two distinct masters re-exported under the *same* name are rejected earlier as `masterbelt.resolver.duplicate_name`.)

Only `error` and `warning` are accepted. Configuration is validated before validators run, so a typo is visible even when a master imported zero records:

- A master path that matches no master is `masterbelt.validation.config_unknown_master`.
- A validator ID that matches no rule under a known master is `masterbelt.validation.config_unknown_validator`.
- A severity outside `error` / `warning` is `masterbelt.validation.config_invalid_severity`.

Each of these config diagnostics blocks the export.

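The three configuration checks can be sketched as a single pass over the `validators` mapping. This is a hedged Python model, not masterbelt's implementation: `check_validator_config`, the `known` lookup table, and the choice to report an unknown validator without also checking its severity are assumptions made for illustration. The diagnostic codes are the ones listed above.

```python
from typing import Dict, List, Set, Tuple

ALLOWED_SEVERITIES = {"error", "warning"}


def check_validator_config(
    config: Dict[str, Dict[str, str]],
    known: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Return (diagnostic code, offending key) for each problem in the config.

    `known` maps each entrypoint-visible master path to its validator IDs
    (a hypothetical stand-in for whatever the resolver provides).
    """
    diagnostics: List[Tuple[str, str]] = []
    for master_path, overrides in config.items():
        if master_path not in known:
            diagnostics.append(
                ("masterbelt.validation.config_unknown_master", master_path)
            )
            continue  # nothing under an unknown master can be checked
        for validator_id, severity in overrides.items():
            key = f"{master_path}.{validator_id}"
            if validator_id not in known[master_path]:
                # Assumption: an unknown validator is reported on its own.
                diagnostics.append(
                    ("masterbelt.validation.config_unknown_validator", key)
                )
            elif severity not in ALLOWED_SEVERITIES:
                diagnostics.append(
                    ("masterbelt.validation.config_invalid_severity", key)
                )
    return diagnostics


known_masters = {
    "Records": {"nameRequired", "valuePositive", "checkValueSum"},
    "U.Friendships": {"uniquePair"},
}
bad_config = {
    "Records": {"nameRequired": "warning", "valuePostive": "error"},  # typo in ID
    "U.Friendships": {"uniquePair": "info"},  # unsupported severity
    "Recrods": {"nameRequired": "warning"},  # typo in master path
}
config_problems = check_validator_config(bad_config, known_masters)
```

Because the check depends only on the declared masters and rules, it needs no imported records, which matches the note that a typo is visible even for an empty master.
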
## Diagnostics

| Code | Severity | Meaning |
| --- | --- | --- |
| `masterbelt.validation.assert_failed` | configured (default error) | An `assert` condition evaluated false. The span is the condition expression; args carry the master path, validator ID, scope, record description, and condition source text. |
| `masterbelt.validation.evaluation_failed` | error | A validation rule body raised an evaluation error. |
| `masterbelt.validation.config_unknown_master` | error | A `validators` config key names a master that does not exist. |
| `masterbelt.validation.config_unknown_validator` | error | A `validators` config key names a validator that does not exist on a known master. |
| `masterbelt.validation.config_invalid_severity` | error | A `validators` severity is not `error` or `warning`. |
| `masterbelt.validation.ambiguous_master` | error | A master is re-exported from the entry under more than one name, so no config path identifies it unambiguously. |

For an `each` failure, the record is described by its primary key (the same convention used by [filter exclusion diagnostics](schema.md#filter-section)); for an `all` failure, the record description is ``.

## Future Work

The MVP does not implement PowerAssert-style display of sub-expression values, custom validation messages, the `info` / `hint` severities, or target-language runtime validators. The evaluator boundary already retains each failed assertion's expression and span so PowerAssert-style reporting can be added without a surface change.