# Masterbelt Full Specification

> Full text export for LLM context. Prefer source URLs for citation. Canonical site: https://masterbelt.dev/

# Specifications

Source: https://masterbelt.dev/spec-src/README.md

This directory defines the user-visible contracts for Masterbelt. Implementation packages under `internal/` must implement these specifications, not replace them.

## Specification Boundaries

- `language/`: Masterbelt lexical structure, syntax, semantics, modules, names, types, built-ins, standard library, evaluation, and diagnostics.
- `masterdata/`: master data schema, keys, relations, validation, query behavior, imports, and exports.
- `codegen/`: generation model, runtime requirements, and target-language generation.
- `tooling/`: project configuration, CLI, syntax highlighting, symbol tags, formatter, linter, LSP, and developer experience.
- `compatibility.md`: compatibility, versioning, and deprecation policy.

`masterdata/query` describes user-visible query behavior over master data collections. The expression language used inside queries belongs to `language/evaluation`.

## Dependency Shape

The graph uses one arrow meaning only: `A --> B` means specification `A` depends on terms or behavior defined by specification `B`. It is not a runtime data-flow graph.

```mermaid
flowchart TD
  Syntax[language/syntax] --> Lexical[language/lexical]
  Semantics[language/semantics] --> Syntax
  Modules[language/modules] --> Semantics
  Names[language/names] --> Modules
  Types[language/types] --> Names
  Types --> Builtins[language/builtins]
  Types --> Std[language/std]
  Evaluation[language/evaluation] --> Types
  Diagnostics[language/diagnostics] --> Syntax
  Diagnostics --> Names
  Diagnostics --> Types
  Schema[masterdata/schema] --> Types
  Keys[masterdata/keys] --> Schema
  Relations[masterdata/relations] --> Schema
  Validation[masterdata/validation] --> Schema
  Validation --> Query
  Validation --> Configuration
  Query[masterdata/query] --> Relations
  Query --> Evaluation
  ImportCSV[masterdata/import-csv] --> Schema
  ImportXLSX[masterdata/import-xlsx] --> Schema
  ExportJSON[masterdata/export-json] --> Schema
  ExportMsgpack[masterdata/export-msgpack] --> Schema
  ExportSQLite[masterdata/export-sqlite] --> Schema
  CodegenModel[codegen/model] --> Types
  CodegenModel --> Evaluation
  CodegenModel --> Validation
  Runtime[codegen/runtime] --> CodegenModel
  Go[codegen/golang] --> Runtime
  TypeScript[codegen/typescript] --> Runtime
  CSharp[codegen/csharp] --> Runtime
  Configuration[tooling/configuration] --> Modules
  CLI[tooling/cli] --> Configuration
  Formatter[tooling/formatter] --> Syntax
  Highlighting[tooling/highlighting] --> Syntax
  Tags[tooling/tags] --> Syntax
  Linter[tooling/linter] --> Diagnostics
  LSP[tooling/lsp] --> Syntax
  LSP --> Names
  LSP --> Types
  LSP --> Formatter
  LSP --> Highlighting
  LSP --> Tags
  LSP --> Linter
```

## Authoring Rules

- Specify behavior before implementing it under `internal/`.
- Keep specifications focused on user-visible behavior and compatibility.
- Do not use implementation package names as specification boundaries unless the name is also a user-visible concept.

# Lexical Structure

Source: https://masterbelt.dev/spec-src/language/lexical.md

This document describes the currently implemented Masterbelt lexical structure.
The lexical structure is intentionally minimal at this stage. Future token additions must extend this document before or together with implementation changes.

Masterbelt source text is UTF-8. Invalid UTF-8 source text is a syntax error.

Lexical EBNF specifications are normative.

## Identifiers

Identifiers name declarations and type references.

```ebnf
identifier = identifier_start { identifier_continue } ;
identifier_start = "A".."Z" | "a".."z" | "_" ;
identifier_continue = identifier_start | "0".."9" ;
```

## Keywords

The implemented keywords are `const`, `pub`, `type`, `use`, `from`, `as`, `readonly`, `writable`, `master`, `record`, `source`, `filter`, `include`, `exclude`, `validation`, `each`, `all`, `validate`, `assert`, `primary`, `static`, and `select`.

All of these keywords, together with the literal words `null`, `true`, and `false`, are reserved and cannot be used as identifiers.

The master [validation section](../masterdata/validation.md) keywords `validation`, `each`, `all`, and `validate` and the `assert` statement keyword are fully reserved words, consistent with the other section keywords (`record`, `source`, `filter`, `static`, `select`). The implicit validation bindings `row` and `table` are **not** reserved: they are ordinary identifiers the validation evaluator binds inside a rule body, so a program may still use those names elsewhere.

Additional keywords reserved by specific declaration or statement forms (`enum`, `fn`, `asyncable`, `failable`, `cancellable`, `return`, `self`, `if`, `else`, `let`, `match`, `for`, `in`, `break`, `continue`, `fail`) are listed in [syntax.md](syntax.md).
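
As a rough illustration of how the identifier grammar and the reservation rules above combine, the following Python sketch classifies a single word. The function name `classify_word` and the category strings it returns are hypothetical and not part of Masterbelt; the word sets are copied from the lists in this section.

```python
import re

# Keywords listed as implemented in this section.
KEYWORDS = {
    "const", "pub", "type", "use", "from", "as", "readonly", "writable",
    "master", "record", "source", "filter", "include", "exclude",
    "validation", "each", "all", "validate", "assert", "primary",
    "static", "select",
}

# Keywords reserved by specific declaration or statement forms.
FORM_KEYWORDS = {
    "enum", "fn", "asyncable", "failable", "cancellable", "return", "self",
    "if", "else", "let", "match", "for", "in", "break", "continue", "fail",
}

# Reserved literal words.
LITERAL_WORDS = {"null", "true", "false"}

# identifier = identifier_start { identifier_continue } (ASCII only).
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def classify_word(word: str) -> str:
    """Return 'invalid', 'literal', 'keyword', or 'identifier' (hypothetical labels)."""
    if not IDENTIFIER.fullmatch(word):
        return "invalid"
    if word in LITERAL_WORDS:
        return "literal"
    if word in KEYWORDS or word in FORM_KEYWORDS:
        return "keyword"
    return "identifier"
```

Under this sketch, `classify_word("row")` and `classify_word("table")` both yield `"identifier"`, matching the rule that the implicit validation bindings are not reserved.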

The master [scope section](../masterdata/schema.md#scope-section) keywords `scope` and `indexed` are **context keywords**: they are matched only at the scope-declaration position inside a master body and remain usable as ordinary identifiers everywhere else. The identifier `self` is fully reserved (it appears in the list above): it is the implicit relation receiver inside a scope body and the implicit record / collection binding inside a validation rule, and a program may not declare it as a binding in any context.

## Operator Tokens

Masterbelt recognises the following operator tokens in expression positions. Their grammar productions are defined in [syntax.md](syntax.md); their evaluation surface is defined in [builtins.md](builtins.md).

```ebnf
unary_operator = "!" | "+" | "-" ;
binary_operator = "+" | "-" | "*" | "/" | "%"
                | "==" | "!=" | "<" | "<=" | ">" | ">="
                | "&" | "|" | "^" | "<<" | ">>" ;
```

Operator tokens never appear inside identifiers. The lexer matches the longest applicable operator token at each position, so `<<` and `<=` are single tokens and not the concatenation of `<` with `<` or `=`.

The reserved literal word `null` may appear as a built-in type name in type expression positions.

The identifiers `list` and `map` are not reserved. They name built-in generic type constructors only when written with type arguments in a type expression position.

## Whitespace

Whitespace is ignored between tokens.

Whitespace is limited to the ASCII characters space, tab, line feed, carriage return, and form feed. Line feed and carriage-return line feed are both accepted as line endings.

## Null Literal

The null literal is written as `null`.

```ebnf
null_literal = "null" ;
```

## Bool Literals

Boolean literals are written as `true` and `false`.

```ebnf
bool_literal = "true" | "false" ;
```

## Integer Literals

Integer literals support decimal, binary, octal, and hexadecimal notation.

```ebnf
integer_literal = decimal_integer_literal
                | binary_integer_literal
                | octal_integer_literal
                | hexadecimal_integer_literal ;
decimal_integer_literal = decimal_digit { digit_separator decimal_digit } ;
binary_integer_literal = "0" ( "b" | "B" ) binary_digit { digit_separator binary_digit } ;
octal_integer_literal = "0" ( "o" | "O" ) octal_digit { digit_separator octal_digit } ;
hexadecimal_integer_literal = "0" ( "x" | "X" ) hexadecimal_digit { digit_separator hexadecimal_digit } ;
digit_separator = { "_" } ;
decimal_digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
binary_digit = "0" | "1" ;
octal_digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" ;
hexadecimal_digit = decimal_digit
                  | "a" | "b" | "c" | "d" | "e" | "f"
                  | "A" | "B" | "C" | "D" | "E" | "F" ;
```

Radix prefixes are case-insensitive for `b`, `o`, and `x`.

Decimal literals may be zero-padded. For example, `000000` is a decimal integer literal.

Digits may be separated by one or more underscores. Consecutive underscores are allowed between digits. For example, `100_000__00` is a decimal integer literal.

Examples:

```mst
0
000000
100_000__00
0b00000
0B0101__10
0o00000
0O0123__70
0x00000
0Xdead__BEEF
```

## String Literals

A string literal is written between double quotes. An empty string literal is valid and evaluates to the empty string. String literals may not contain unescaped line endings.

String literal contents are interpreted as UTF-8. Non-ASCII characters may appear unescaped between the surrounding double quotes.

The supported escape sequences are:

- `\"`: double quote
- `\\`: backslash
- `\n`: line feed
- `\r`: carriage return
- `\t`: tab
- `\0`: null character

```ebnf
string_literal = '"' { string_char } '"' ;
string_char = unescaped_char | escape_sequence ;
unescaped_char = ? any UTF-8 character except '"', '\', LF, and CR ? ;
escape_sequence = '\"' | '\\' | '\n' | '\r' | '\t' | '\0' ;
```

An unterminated string literal is a syntax error.
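
The string literal grammar above can be sketched as a small decoder. This is a minimal Python illustration, not the Masterbelt implementation: `decode_string_literal` and `StringLiteralError` are hypothetical names, and treating an unsupported escape such as `\q` as an error is an inference from the normative grammar (no `string_char` alternative matches it) rather than a rule stated in the prose.

```python
# Escape character (after the backslash) -> decoded character,
# taken from the supported escape sequence list.
ESCAPES = {'"': '"', "\\": "\\", "n": "\n", "r": "\r", "t": "\t", "0": "\0"}


class StringLiteralError(ValueError):
    """Hypothetical error type standing in for a syntax error."""


def decode_string_literal(token: str) -> str:
    """Decode a complete string literal token, including its surrounding quotes."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise StringLiteralError("unterminated string literal")
    body = token[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body):
                # The final quote was consumed by the escape, so nothing closes the literal.
                raise StringLiteralError("unterminated string literal")
            esc = body[i + 1]
            if esc not in ESCAPES:
                # Assumption: escapes outside the grammar are rejected.
                raise StringLiteralError(f"unsupported escape sequence \\{esc}")
            out.append(ESCAPES[esc])
            i += 2
        elif ch in '"\n\r':
            # Unescaped quotes and line endings are excluded by unescaped_char.
            raise StringLiteralError("unescaped quote or line ending in string literal")
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Non-ASCII text passes through unchanged because Python strings are already sequences of Unicode code points; a real lexer would additionally validate the UTF-8 encoding of the source bytes.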

Examples:

```mst
""
"hello"
"quote: \""
"line\nbreak"
"こんにちは"
```

## Comments

Masterbelt has line comments, block comments, and documentation comments. These are distinct lexical categories.

### Line Comments

A line comment starts with `//` and continues until the end of the line.

```mst
// line comment
null
```

Line comments are named syntax nodes. They are ignored as separators but remain available to syntax queries and tooling.

A token starting with `///` is a documentation comment, not a line comment.

### Block Comments

A block comment starts with `/*` and ends with `*/`.

```mst
/* block comment */
true
```

Block comments may span multiple lines.

```mst
/*
block
comment
*/
0xFF
```

Block comments are named syntax nodes. They are ignored as separators but remain available to syntax queries and tooling.

Block comments do not nest. An unterminated block comment is a syntax error.

### Documentation Comments

A documentation comment starts with `///` and continues until the end of the line.

Documentation comments attach to the next declaration or statement. Multiple consecutive documentation comments may attach to the same declaration or statement.

```mst
/// docs for null
null

/// docs for true
/// second doc line
true
```

Documentation comments are part of the syntax tree and are not treated as ordinary ignored comments.

Documentation comments may only appear immediately before the declaration, statement, or grouped const item they document. A documentation comment with no following declaration, statement, or grouped const item is a syntax error. A documentation comment after another token on the same line is a syntax error.

# Syntax Structure

Source: https://masterbelt.dev/spec-src/language/syntax.md

This document describes the currently implemented Masterbelt syntax structure. Lexical tokens are defined in [lexical.md](lexical.md).

The syntax structure is intentionally minimal at this stage.
Future grammar additions must extend this document before or together with implementation changes.

## Source Files

A source file is a sequence of declarations and statements.

```ebnf
source_file = { declaration | statement } ;
declaration = { doc_comment } ( visible_declaration | use_declaration | reexport_declaration ) ;
statement = { doc_comment } expression_statement ;
visible_declaration = [ visibility_modifier ]
                      ( const_declaration
                      | type_declaration
                      | enum_declaration
                      | function_declaration
                      | master_declaration ) ;
visibility_modifier = "pub" ;
```

An empty source file is valid.

Declarations may appear in any order. `use` and re-export declarations are not required to precede other declarations.

Whitespace, line comments, and block comments are ignored between tokens as defined by the lexical structure.

Documentation comments are not ignored tokens. They are valid only as declaration prefixes, statement prefixes, or item prefixes inside grouped const declarations.

## Declarations

The implemented declaration forms are const declarations, type declarations, enum declarations, function declarations, master declarations, `use` declarations, and re-export declarations.

```ebnf
const_declaration = "const" ( const_item | const_group ) ;
const_group = "(" const_group_item { const_group_item } ")" ;
const_group_item = { doc_comment } const_item ;
const_item = identifier [ ":" type_expression ] "=" expression ;

type_declaration = "type" identifier [ type_parameters ] "=" type_expression ;
type_parameters = "<" type_parameter { "," type_parameter } [ "," ] ">" ;
type_parameter = identifier ;

enum_declaration = "enum" identifier [ ":" type_expression ] "{" [ enum_variants ] "}" ;
enum_variants = enum_variant { "," enum_variant } [ "," ] ;
enum_variant = { doc_comment } identifier [ "=" integer_literal ] ;

function_declaration = { effect_modifier } "fn" identifier "(" [ function_parameters ] ")" [ ":" type_expression ] function_body ;
function_body = function_block | function_arrow ;
function_block = "{" { statement } "}" ;
function_arrow = "=>" expression ;

master_declaration = "master" identifier "{" { master_section } "}" ;
master_section = master_record_section
               | master_source_section
               | master_filter_section
               | master_validation_section
               | master_static_section
               | master_select_section
               | master_scope_declaration
               | master_declaration ;
master_record_section = "record" product_type ;
master_source_section = "source" "{" { master_source_entry } "}" ;
master_source_entry = master_source_kind string_literal [ master_source_options ] ;
master_source_kind = identifier ;
master_source_options = "{" [ master_source_option { "," master_source_option } [ "," ] ] "}" ;
master_source_option = identifier ":" expression ;
master_filter_section = "filter" "{" { master_filter_rule } "}" ;
master_filter_rule = ( "include" | "exclude" ) string_literal function_block ;
master_validation_section = "validation" "{" { master_validation_group } "}" ;
master_validation_group = ( "each" | "all" ) "{" { master_validation_rule } "}" ;
master_validation_rule = "validate" identifier function_block ;
master_static_section = "static" "{" { master_static_member } "}" ;
master_static_member = { doc_comment } [ visibility_modifier ] ( const_declaration | function_declaration ) ;
master_select_section = "select" identifier "{" [ master_select_field { "," master_select_field } [ "," ] ] "}" ;
master_select_field = identifier ":" identifier ;
master_scope_declaration = [ visibility_modifier ] [ "indexed" ] "scope" identifier "(" [ function_parameters ] ")" ( function_block | scope_expression_body ) ;
scope_expression_body = "=>" expression ;
```

A function declaration introduces a callable into the value binding space. The optional return type clause may be omitted when the body permits inference; parameter types may also be omitted when a surrounding context supplies them. Both forms are validated in [types.md](types.md).

Statements inside a function block are either expression statements (evaluated for side effects) or return statements (which deliver the function's result and may carry an optional value).

The optional visibility modifier makes the declaration visible outside its file. It is a declaration modifier, not a const-specific keyword.

Documentation comments on a declaration must appear before the visibility modifier, if any.

The const item type annotation may be omitted. In that case, the const type is inferred from the initializer expression.

A const group must contain at least one item. Grouped const declarations share the outer visibility modifier. Documentation comments inside the group attach to the following const item.

## Use and Re-export Declarations

```ebnf
use_declaration = "use" import_specifier "from" string_literal ;
reexport_declaration = "pub" import_specifier "from" string_literal ;
import_specifier = named_import_list | star_import ;
named_import_list = "{" [ named_import { "," named_import } [ "," ] ] "}" ;
named_import = identifier [ "as" identifier ] ;
star_import = "*" ;
```

A `use` declaration brings the named symbols, or every public symbol when written with `*`, from the module identified by its source string into the current file's scope. A `use` declaration does not make the imported names visible outside the current file.

A re-export declaration brings the same set of symbols into the current file's scope and additionally declares each imported name as a public symbol of the current file. A re-export is equivalent to a `use` followed by re-declaring each imported symbol with the public modifier; tooling may collapse this pair into one declaration.

The `pub` modifier in a re-export declaration is the same keyword used for `pub const` and `pub type`. The token immediately following `pub` distinguishes the form: a `{` or `*` starts an import specifier, while `const` or `type` starts a visible declaration.

The optional `as` clause renames an imported symbol within the current file. A renamed `use { Source as Local }` makes `Local` the file-scope identifier and `Source` the foreign module's identifier.

The source string of a `use` or re-export declaration follows the module path rules defined in [modules.md](modules.md).

Documentation comments may appear before a `use` or re-export declaration but do not attach to individual imported names.

## Type Expressions

A type expression appears as a const item type annotation or as the right-hand side of a type declaration.

```ebnf
type_expression = union_type | primary_type ;
union_type = primary_type "|" primary_type { "|" primary_type } ;
primary_type = named_type | generic_type | product_type | function_type ;
named_type = identifier | reserved_type_identifier ;
generic_type = identifier "<" generic_arguments ">" ;
generic_arguments = type_expression { "," type_expression } ;
product_type = "{" [ product_type_fields ] "}" ;
product_type_fields = product_type_field { "," product_type_field } [ "," ] ;
product_type_field = [ field_modifier ] identifier ":" type_expression ;
field_modifier = "readonly" | "writable" | "primary" ;
function_type = { effect_modifier } "fn" "(" [ function_parameters ] ")" ":" type_expression ;
function_parameters = function_parameter { "," function_parameter } [ "," ] ;
function_parameter = [ "*" ] identifier ":" type_expression ;
effect_modifier = "asyncable" | "failable" | "cancellable" ;
reserved_type_identifier = "null" ;
```

A type expression at a single position is one type expression. Parser implementations must produce one type expression node per type expression position, even when the surface form is a single name.

The `|` operator left-associates. A surface form `A | B | C` is one flat union with members `A`, `B`, and `C`.

## Generic Type Declarations

A type declaration may carry a list of type parameters. Type parameters are declaration-level: they belong to the type declaration itself and any type expression on the right-hand side may reference them. There is no special form for "generic product" or "generic union": a parameter introduced on the declaration is visible to every nested type expression regardless of shape.

```mst
type Box<T> = { value: T }
type Pair<A, B> = { first: A, second: B }
type Maybe<T> = T | null
type Items<T> = list<T>
```

Type parameter names are identifiers in the same identifier name space as the alias itself; inside the alias body they shadow any outer type binding of the same name.
Type parameters are not in scope outside the declaration that introduces them.

A type declaration declared with N type parameters must be applied with exactly N type arguments at every use site, written with the same `name<...>` syntax used for built-in generic types. A bare reference to a generic alias without arguments is a use-site error. A type declaration declared without a parameter list is non-generic and cannot be applied with type arguments.

## Statements

The implemented statement forms are expression statements, return statements, if statements, match statements, for statements, break statements, continue statements, fail statements, assert statements, local const/let bindings, and assignments.

```ebnf
statement = expression_statement
          | return_statement
          | if_statement
          | match_statement
          | for_statement
          | break_statement
          | continue_statement
          | fail_statement
          | assert_statement
          | local_const_statement
          | let_statement
          | assignment_statement ;
expression_statement = expression ;
return_statement = "return" [ expression ] ;
if_statement = "if" expression function_block [ "else" ( function_block | if_statement ) ] ;
match_statement = "match" expression "{" { match_arm } "}" ;
for_statement = "for" for_binding [ "," for_binding ] "in" expression function_block ;
for_binding = identifier | "_" ;
break_statement = "break" ;
continue_statement = "continue" ;
fail_statement = "fail" expression ;
assert_statement = "assert" expression ;
local_const_statement = "const" identifier [ ":" type_expression ] "=" expression ;
let_statement = "let" identifier [ ":" type_expression ] "=" expression ;
assignment_statement = identifier "=" expression ;
```

At this stage, statements have no explicit terminator.

Return statements are valid only inside a function body; the type checker reports a use outside a function as `masterbelt.checker.return_outside_function`.

The `assert_statement` is recognized by the grammar everywhere a statement is accepted, but it is only valid inside a [validation rule body](../masterdata/validation.md). The checker rejects an `assert` used elsewhere as `masterbelt.checker.assert_outside_validation`.

### If Statements

An `if` statement chooses one of two branches based on a boolean condition. The condition is an expression that must be assignable to `bool`. A non-bool condition is reported as `masterbelt.checker.if_condition_non_bool`.

The two branches share the surrounding function's scope. A `return` inside either branch terminates the surrounding function the same way a top-level `return` would.

The `else` clause is optional and may take one of two shapes:

- `else function_block`: a final alternative branch.
- `else if_statement`: chained `if`/`else if` decisions. The chain is right-associative; an `else if` matches the innermost preceding `if` that does not already have an `else`.

```mst
fn classify(value: int): string {
    if value < 0 {
        return "negative"
    }
    if value == 0 {
        return "zero"
    }
    return "positive"
}
```

The keywords `if` and `else` are reserved and cannot be used as identifiers in any position.

### Match Statements

A `match` statement chooses one of a fixed list of branches based on the shape of a subject value. Each branch (called an *arm*) carries a pattern, an optional guard expression, and a function block.

```ebnf
match_arm = match_pattern { "|" match_pattern } [ "if" expression ] "=>" function_block ;
match_pattern = type_pattern
              | enum_pattern
              | literal_pattern
              | product_pattern
              | wildcard_pattern ;
type_pattern = type_expression [ "as" identifier ] ;
enum_pattern = identifier "." identifier ;
literal_pattern = null_literal | bool_literal | integer_literal | string_literal ;
product_pattern = ( identifier | generic_type ) "{" [ product_pattern_fields ] "}" [ "as" identifier ] ;
product_pattern_fields = product_pattern_field { "," product_pattern_field } [ "," ] ;
product_pattern_field = identifier [ ":" match_pattern ] ;
wildcard_pattern = "_" ;
```

The arm body is a function block; an expression-position `match` form is not part of this stage of the grammar.

A `match` statement has at least one arm. A `match` statement with no arms is a syntax error.

Each arm's pattern decides whether the arm matches the subject. The first arm whose pattern matches (and whose guard, if any, evaluates to `true`) is the only arm executed; the other arms are not evaluated.

The `match` and `_` keywords are reserved and cannot be used as identifiers in any position.

#### Pattern Forms

- A **type pattern** matches when the subject's runtime type is assignable to the written type. The optional `as name` clause binds the narrowed value to a new local of the pattern's type inside the arm body.
- An **enum pattern** matches when the subject equals the named variant. The pattern's two-segment form is the same surface form used by enum member access expressions.
- A **literal pattern** matches when the subject equals the literal value. Literal patterns are written using the same literal forms as expressions.
- A **product pattern** matches when the subject's runtime type is assignable to the named product type and every field sub-pattern matches the corresponding field. A product pattern always names its type prefix (bare `{ ... }` patterns without a type prefix are a syntax error). Field sub-patterns are written `field: pattern`. The short form `field` (no `: pattern`) is shorthand for `field: field`: it binds the field's value to a new local named after the field.
- A **wildcard pattern** matches every value.

A `|`-separated pattern list matches when any of its alternatives matches. Every alternative in one arm must be the same surface kind; the type checker further restricts what may appear (see [types.md](types.md#match-statements)).

#### Guards and Body Scope

When a pattern includes a binding (the `as name` clause, the short `field` form, or any nested binding), the binding is in scope inside the optional guard expression and inside the arm's function block. Bindings introduced by alternatives of a `|`-separated pattern list must agree in name and type across every alternative.

A guard expression is a boolean expression evaluated after the pattern matches. When the guard evaluates to `false`, the arm is skipped and matching continues with the next arm.

The arm body shares the surrounding function's scope. A `return` inside an arm body terminates the surrounding function the same way a top-level `return` would.

### For Statements

A `for` statement walks the elements of an iterable subject and runs the body once per element. Each iteration introduces a fresh block scope so locals declared inside the body do not leak across iterations.

```ebnf
for_statement = "for" for_binding [ "," for_binding ] "in" expression function_block ;
for_binding = identifier | "_" ;
```

The `for`, `in`, `break`, and `continue` keywords are reserved and cannot be used as identifiers in any position. The wildcard `_` is shared with [match statements](#match-statements) and is reserved at every binding position.

The subject expression's type drives the binding count and element types:

- A `list<T>` subject requires exactly one binding, typed `T`.
- A `map<K, V>` subject requires exactly two bindings, typed `K` and `V` in that order (the map's type-parameter order).
- The standard-library helper `range(start, end)` ([builtins.md](builtins.md)) returns a `list` that the for statement treats specially (see [codegen/golang.md](../codegen/golang.md), [codegen/typescript.md](../codegen/typescript.md), [codegen/csharp.md](../codegen/csharp.md) for the lowering); the binding rules follow the `list` shape.
- Master-collection iteration is reached through the master's `all()` method ([masterdata/schema.md](../masterdata/schema.md#iteration)) which yields `list`.

A subject whose static type is a union containing `null` is not iterable on its own: the value must be narrowed to a list or map first (typically via a [match statement](#match-statements)). Iterating a non-iterable subject is rejected; iterating a possibly-null value is rejected with the same diagnostic.

Bindings introduced by the for binding clause are visible inside the function block. A binding written as `_` skips that position without introducing a name. The two bindings of a map iteration may both be `_`; that form still type-checks but is rarely useful.

Mutating the subject collection while iterating is unspecified behavior; the resulting iteration order and reachable element set are implementation-defined and may differ between target languages.

### Break and Continue

`break` exits the innermost enclosing `for` body; `continue` skips the rest of the current iteration and proceeds with the next one. Both are bare keyword statements with no operand.

```ebnf
break_statement = "break" ;
continue_statement = "continue" ;
```

Using `break` or `continue` outside any `for` body is a type checking error; see [types.md](types.md#for-statements).

### Failable Handling

A function declared with the `failable` effect modifier may complete with either the declared success type `R` or an `Error` value.
The Error path is transparent at the surface: a call to a `failable` function is typed `R` exactly like a non-failable call, and propagation of an Error through the enclosing function is automatic. See [semantics.md](semantics.md#failable-handling) for the user-visible meaning and rationale.

```mst
/// Failure-producing function. Inside the body, `return v`
/// completes successfully and `fail expr` completes with an Error.
failable fn parseInt(s: string): int {
    if s == "" {
        fail "empty input"
    }
    return 0
}

/// Calling a failable function from another failable function:
/// the surface form is an ordinary call. If parseInt fails, the
/// surrounding sumParsed completes with that same Error
/// automatically; the source program never mentions Error.
failable fn sumParsed(a: string, b: string): int {
    let x = parseInt(a)
    let y = parseInt(b)
    return x + y
}
```

A function that calls a `failable` function does not need to declare the `failable` modifier; the effect propagates silently. The same rule applies to every other effect (`asyncable`, `cancellable`); see [types.md](types.md#effect-inheritance).

#### `fail` statement

```ebnf
fail_statement = "fail" expression ;
```

`fail` is valid only inside the body of a `failable` function. The expression must evaluate either to a `string`, which is sugar for an `Error` whose `message` is that string, or to an `Error` value directly. The statement completes the enclosing function with the produced Error.

Using `fail` outside a `failable` function is reported as `masterbelt.checker.fail_outside_failable`; an argument that is neither a `string` nor an `Error` is reported as `masterbelt.checker.fail_unsupported_argument`.

The `fail` keyword is reserved and cannot be used as an identifier in any position.

### Local Bindings and Assignment

A function body may introduce local bindings with `const` and `let` statements:

- `const x = expr` declares an immutable local. Reassigning it through an assignment statement is a type-checking error (`masterbelt.checker.assignment_to_const`).
- `let x = expr` declares a mutable local that can be reassigned via the assignment statement form `x = expr`.

Both forms accept an optional type annotation after the name. When omitted, the local's type is inferred from the initializer. When written, the initializer must be assignable to the annotated type (using the same rule as a `const` declaration's annotation).

Locals are block-scoped. A binding declared inside a function block or an `if`/`else` branch is visible from its declaration until the end of the surrounding block. A local that shadows a binding from an enclosing scope is rejected (`masterbelt.checker.local_redeclaration`); reusing a name in a sibling (non-nested) block is allowed.

Assignment looks the binding up through the enclosing block chain. Assigning to a name that is not visible in any enclosing scope is reported as `masterbelt.checker.assignment_to_unknown`. Assigning a value whose type is not assignable to the binding's declared type is reported as `masterbelt.checker.assignment_type_mismatch`.

The local `const` statement is a distinct form from the top-level `const` declaration: the local form does not accept the parenthesised group, does not accept a visibility modifier, and does not expose its name outside the surrounding function.

The keyword `let` is reserved and cannot be used as an identifier in any position. The `const` keyword is already reserved by top-level declarations.

```mst
fn boundedAbs(x: int, limit: int): int {
    let value: int = x
    if value < 0 {
        value = -value
    }
    if value > limit {
        value = limit
    }
    return value
}
```

## Expressions

The implemented expression forms are literal expressions, collection literals, and identifier references.
```ebnf expression = binary_expression ; binary_expression = unary_expression { binary_operator unary_expression } ; unary_expression = { unary_operator } primary_expression ; primary_expression = literal | collection_literal | identifier_expression | product_literal | typed_product_literal | cast_expression | member_access_expression | call_expression | function_expression ; literal = null_literal | bool_literal | integer_literal | string_literal ; collection_literal = "[" [ collection_items ] "]" ; collection_items = collection_item { "," collection_item } [ "," ] ; collection_item = expression [ ":" expression ] ; identifier_expression = identifier ; product_literal = "{" [ product_literal_fields ] "}" ; product_literal_fields = product_literal_field { "," product_literal_field } [ "," ] ; product_literal_field = identifier ":" expression ; typed_product_literal = ( identifier | generic_type ) product_literal ; cast_expression = ( identifier | generic_type ) "(" expression ")" ; member_access_expression = ( identifier | member_access_expression | call_expression ) "." identifier ; call_expression = expression "(" [ call_arguments ] ")" ; call_arguments = expression { "," expression } [ "," ] ; function_expression = { effect_modifier } "fn" "(" [ function_parameters ] ")" [ ":" type_expression ] function_body ; ``` The cast_expression form converts the inner value to the type written before the parentheses. The grammar permits either a bare identifier or a generic type application as the cast target; the type checker further restricts the target to a numeric type (or an alias whose body resolves to a numeric type). An identifier expression in value position refers to an existing value binding. The reserved literal words `null`, `true`, and `false` are matched as literal expressions, not as identifier expressions. The reserved keywords `const`, `pub`, `type`, `use`, `from`, and `as` are not valid identifier expressions. 
A collection literal is one of three forms decided by item shape:

- All items written as a single expression: a list literal.
- All items written as `expression ":" expression`: a map literal.
- No items: an empty collection literal. The empty form has no syntactic discriminator; the type checker resolves it to a list or a map using the surrounding type annotation.

Mixing item shapes is a syntax error. A trailing comma is allowed after the last item.

Map entries with identical keys are not a syntax error. Their runtime semantics is last-wins.

### Unary and Binary Expressions

A unary expression applies a prefix operator token to a primary expression. A binary expression applies an infix operator token between two operands. Both forms desugar to a method call on the operand (unary) or the left-hand operand (binary). The operator-to-method mapping and the intrinsic methods that the language provides for each built-in type are defined in [builtins.md](builtins.md).

Operator precedence, highest to lowest:

| Level | Operators          | Associativity |
|-------|--------------------|---------------|
| 1     | unary `!` `+` `-`  | n/a (prefix)  |
| 2     | `*` `/` `%`        | left          |
| 3     | `+` `-` (binary)   | left          |
| 4     | `<<` `>>`          | left          |
| 5     | `<` `<=` `>` `>=`  | left          |
| 6     | `==` `!=`          | left          |
| 7     | `&`                | left          |
| 8     | `^`                | left          |
| 9     | `\|`               | left          |

Parentheses around an expression are not currently part of the grammar; if multiple operators are mixed, precedence as listed above determines the parse.

The `|` binary operator and the `|` between type expressions are lexically the same token. They do not conflict because a binary expression appears only in an expression position, while the `|` of a union type appears only in a type expression position.
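The precedence table above can be read as a precedence-climbing parse. The following Go sketch is only an illustration under stated assumptions: `Group`, `parser`, and the whitespace-separated token input are hypothetical helpers, not part of Masterbelt, and unary operators (level 1) are left out. The parentheses in the result only display the grouping; they are not Masterbelt syntax.

```go
package main

import "strings"

// precedence maps each binary operator to its level from the table
// (a lower number binds tighter). Level 1 (unary) is omitted in this sketch.
var precedence = map[string]int{
	"*": 2, "/": 2, "%": 2,
	"+": 3, "-": 3,
	"<<": 4, ">>": 4,
	"<": 5, "<=": 5, ">": 5, ">=": 5,
	"==": 6, "!=": 6,
	"&": 7,
	"^": 8,
	"|": 9,
}

// parser walks a pre-split token slice. It is a hypothetical helper and
// assumes well-formed input of the shape: operand { operator operand }.
type parser struct {
	tokens []string
	pos    int
}

// parseLevel parses a run of operators at the given level, with operands
// built from the tighter levels. Looping at one level yields left associativity.
func (p *parser) parseLevel(level int) string {
	if level < 2 {
		// Below level 2 only plain operands remain in this sketch.
		tok := p.tokens[p.pos]
		p.pos++
		return tok
	}
	left := p.parseLevel(level - 1)
	for p.pos < len(p.tokens) && precedence[p.tokens[p.pos]] == level {
		op := p.tokens[p.pos]
		p.pos++
		right := p.parseLevel(level - 1)
		left = "(" + left + " " + op + " " + right + ")"
	}
	return left
}

// Group returns src with the grouping implied by the table made explicit.
func Group(src string) string {
	p := &parser{tokens: strings.Fields(src)}
	return p.parseLevel(9)
}
```

Under these assumptions, `Group("1 + 2 * 3")` returns `(1 + (2 * 3))`, `Group("a - b - c")` returns `((a - b) - c)`, and `Group("1 << 2 + 3")` returns `(1 << (2 + 3))` because binary `+` sits at a tighter level than `<<`.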
The `<` and `>` binary operators are similarly disambiguated from the `<` `>` of generic type applications and the `Type(value)` cast prefix: a generic type or cast prefix is only matched when the token sequence following the identifier is a parseable type-argument list closed by `>` and followed by `(` (cast) or `{` (typed product literal). In every other context, `<` and `>` are binary comparison operators. ## Product Types and Literals A product type denotes a record whose values are tuples of named fields. Field names within one product type must be unique; duplicates are a syntax error reported on the later occurrence. Each field may carry an optional modifier. The `readonly` modifier declares that the field is immutable after construction. The `writable` modifier declares the field as mutable; it is the explicit form of the default and exists so source can spell out the field's intent. A field with no modifier behaves the same as a `writable` field. The `primary` modifier is meaningful only inside a master declaration's record section; its semantics is defined in [../masterdata/keys.md](../masterdata/keys.md). Field modifiers are surface-level metadata: they do not affect type identity or assignability, but they influence the shape of generated code (see the codegen specifications for each target). The keywords `readonly`, `writable`, and `primary` are reserved and cannot be used as identifiers in any position. A product literal constructs a product value. The bare form `{ field: value, ... }` requires a surrounding annotation (such as a `const` item type annotation or a list/map element type) that supplies the expected product type. The typed-prefix form `TypeName { field: value, ... }` names the product type directly and does not require an outer annotation. A product literal must list every field declared in the resolved product type exactly once. 
Field order in the literal does not need to match the type's declaration order; structural equality of product types is independent of field order. ### Product Type Methods A product type may declare methods alongside its fields, separated by the same comma as fields. A method's surface form mirrors a top-level function declaration: `[effect ...] fn name(params)[: R] body`. Methods do not affect the product type's structural identity (see [types.md](types.md)); two product types whose field sets agree but whose method sets differ are still the same type structurally. A product type may declare multiple methods with the same name as long as their signatures differ — that is, parameter types differ at least in one position. Overload resolution is described in [types.md](types.md). ## Function Types A function type denotes the signature of a callable: an ordered parameter list, a return type, and an optional set of effect modifiers. Function types appear as type expressions and currently take no value-level form; they are used to annotate callable shapes for downstream consumers. ```mst type BinaryOp = fn(left: int, right: int): int type Mapper = fn(value: T): U type Sum = fn(initial: int, *values: int): int type Lookup = asyncable cancellable fn(key: string): string type Parser = failable fn(input: string): int ``` A function type's parameter list is comma-separated. Each parameter is written `name: T` and may be marked variadic by prefixing the parameter name with `*`. A trailing comma after the last parameter is allowed. An empty parameter list is allowed. A return type is mandatory; the colon and the return type expression must follow the closing parenthesis. Effect modifiers appear before the `fn` keyword and are separated by whitespace. The effect modifiers are `asyncable`, `failable`, and `cancellable`; their semantics are defined in [codegen/model.md](../codegen/model.md) and their target mappings are defined in each target's specification. 
Effects are a set: source order is not significant and duplicates collapse. The keywords `fn`, `asyncable`, `failable`, and `cancellable` are reserved and cannot be used as identifiers in any position. ### Variadic Parameters A parameter prefixed with `*` is variadic: the function accepts zero or more arguments of the declared element type at that position. Only the last parameter of a function type may be variadic; a `*`-prefixed parameter in any earlier position is a syntax error. Variadic parameters expand to a homogeneous sequence of the declared element type at every target language's call site. ## Enum Declarations An enum declaration introduces a nominal type whose values are a closed, ordered list of named variants. Each variant is backed by a numeric integer value fixed at compile time. ```mst pub enum Status { Active, Inactive } pub enum Priority: int32 { Low = 1, Normal = 5, High = 10 } enum Color { Red, Green = 5, Blue, } ``` The optional storage clause `: type_expression` names the integer type that backs every variant value. The storage type is a type expression so it accepts type declarations whose target is a built-in numeric type as well as the numerics directly; it must resolve to a numeric type after declared-type resolution. When the storage clause is omitted, the storage type is `int8`. An enum declaration must list at least one variant; declaring an enum with an empty body is a type checking error reported as `masterbelt.checker.enum_empty`. Variant names within one enum declaration must be unique; declaring two variants with the same name is a syntax error reported on the later occurrence. A variant value, if written, must be an integer literal. Any other expression is a syntax error. The keyword `enum` is reserved and cannot be used as an identifier in any position. ### Variant Value Assignment Variant values are assigned in source order: - A variant with an explicit `= integer_literal` takes that literal as its value. 
- A variant without an explicit value at the start of the declaration takes the value `0`. - A variant without an explicit value after at least one previous variant takes the previous variant's value plus one. Variant values are not required to be unique or monotonically increasing; if a variant with an explicit value collides with an earlier variant's value, both retain their declared values and no error is reported. Tooling consumers (linters, target indexes) may flag such collisions independently. Every assigned variant value must fit the resolved storage type's value range. A value outside the range is a lowering-time error reported with the standard integer-out-of-range diagnostic for the storage type. ## Master Declarations A master declaration introduces a named master data collection. The body is a brace-delimited list of sections; each section begins with a section-kind keyword (`record` or `source`) and carries the section's payload. ```mst master Items { record { primary id: int, name: string, } source { csv "data/items.csv" { separator: ",", } } } ``` The record section's payload is a product type expression: the same surface form that follows `=` in a `type Foo = { ... }` declaration. The `primary` field modifier is valid only inside a master record section; the language defines the modifier as a field-modifier keyword here, and [../masterdata/keys.md](../masterdata/keys.md) defines its semantics. The source section's payload is a brace-delimited list of source entries. Each entry begins with a source-kind identifier (for example `csv`), followed by a path expressed as a string literal, optionally followed by an option list. Source kinds are not reserved keywords: they are matched only at the source-kind position. The set of recognised source kinds is defined by the importer specifications under [../masterdata/](../masterdata/). The keywords `master`, `record`, `source`, and `primary` are reserved and cannot be used as identifiers in any position. 
The semantics of master declarations — primary keys, record element type, validation, and import semantics — are defined in [../masterdata/schema.md](../masterdata/schema.md). ## Member Access Expressions A member access expression names a member of a target identifier with the form `Target.Member`. ```mst const default: Status = Status.Active const aliceName: string = Alice.name ``` The target identifier resolves either to a value (then the access selects an instance member of the value's type) or to a type (then the access selects a static member of that type). The semantics of member lookup, including which types expose which members, are defined in [types.md](types.md). The target may itself be a member access (`Parent.Child.Member`) or a call expression (`target().member`). The call-target form supports [scope chaining](../masterdata/schema.md#scope-section), where a member is read off the relation returned by a scope call (`self.adult().gendered(g)`). A scope declaration's `=> expression` body and a function arrow body share the `"=>" expression` shape but are distinct grammar productions: a scope arrow body sits in the master-scope position and its expression must produce a `Relation` (see [schema.md#scope-section](../masterdata/schema.md#scope-section)). # Modules Source: https://masterbelt.dev/spec-src/language/modules.md # Modules Masterbelt source graphs have a single entrypoint file. - The entrypoint file is selected by project configuration. - The entrypoint file is the root used to discover files referenced by the project. - File identity is resolved internally as a full filesystem path. - User-facing paths are displayed relative to the project root. - Hash inputs and other stable user-facing identities use project-root-relative paths. - Source spans use project-root-relative paths. ## Imports A `use` or re-export declaration's `from` clause carries the module specifier of the foreign module. 
Import source strings are resolved as follows: - `from "file"` resolves `file` relative to the importing file and appends `.mst`. - `from "file.mst"` resolves `file.mst` relative to the importing file. - `from "file.ext"` is invalid when `ext` is not `.mst`. - Absolute file import paths are invalid. - `from "std:mod"` resolves the standard module `std.mod`. Standard modules are reserved for future use; until the standard library is implemented, a `std:` import reports a diagnostic. - The module name after `std:` must match `[a-z][a-z0-9_]*`. File imports are internally identified by their full filesystem path. The resolved full filesystem path is used internally for graph operations such as cycle detection. ## Module Graph The module graph is the set of files reachable from the entrypoint by following `use` and re-export declarations transitively. - The entrypoint is loaded first. Every module specifier appearing in a loaded file is then resolved and the target file is loaded, recursively. - A module specifier that does not resolve to an existing file is a project loading error reported at the importing site. - A standard-library specifier is rejected at this stage; future iterations will load standard modules from a built-in registry. - A module that imports itself transitively, including a direct self-import, forms a cycle and is rejected at loading time. The diagnostic is attached to the import that closed the cycle. The module graph is finite when no cycles exist. The order in which loaded modules become available to the type checker is a topological order: every module's imported modules are checked before the module itself. ## Cross-Module Resolution After all modules in the graph have been loaded and resolved, the type checker fills in the binding types for the import bindings introduced by `use` and re-export declarations. 
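The loading rules from the Module Graph section (dependencies checked first, cycles rejected) can be sketched as a depth-first traversal. This is a minimal Go illustration, not the loader's actual implementation: `LoadOrder` and the in-memory `imports` map are hypothetical stand-ins for resolved `use` and re-export declarations, and the error text is invented for the example.

```go
package main

import "fmt"

// LoadOrder returns the modules reachable from entry, ordered so that every
// module appears after the modules it imports. The imports map is a
// hypothetical stand-in for already-resolved module specifiers.
func LoadOrder(entry string, imports map[string][]string) ([]string, error) {
	const (
		unvisited = iota // zero value: module not seen yet
		loading          // module is on the current import chain
		done             // module and all of its imports are ordered
	)
	state := map[string]int{}
	var order []string

	var visit func(module string) error
	visit = func(module string) error {
		switch state[module] {
		case done:
			return nil
		case loading:
			// Reaching a module that is still loading closes a cycle;
			// this also covers a direct self-import.
			return fmt.Errorf("import cycle closed at %q", module)
		}
		state[module] = loading
		for _, dep := range imports[module] {
			if err := visit(dep); err != nil {
				return err
			}
		}
		state[module] = done
		order = append(order, module)
		return nil
	}

	if err := visit(entry); err != nil {
		return nil, err
	}
	return order, nil
}
```

With `main.mst` importing `a.mst` and `b.mst`, and `a.mst` importing `b.mst`, the sketch yields the order `b.mst`, `a.mst`, `main.mst`. Modules not reachable from the entrypoint never appear, which matches the definition of the module graph above.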
For a named import `use { X } from "..."`: - If the foreign module declares `pub const X` or re-exports `X` as public, the local value-space binding takes the foreign symbol's checked type. - If the foreign module declares `pub type X` or re-exports `X` as public, the local type-space binding takes the foreign symbol's resolved target type. - If neither space declares `X` as public in the foreign module, the import is a cross-module resolution error reported at the import site. - A named import that resolves in both spaces is valid; the value-space and the type-space bindings both carry types, and references in either position use the corresponding side. For a wildcard import `use * from "..."` or `pub * from "..."`: - The cross-module phase expands the wildcard into one binding per public symbol the foreign module exposes, under the foreign symbol's name. - Wildcard expansion happens before checker work for the importing file, so a reference to any expanded name resolves through the normal lookup path. - A name introduced by wildcard expansion may not collide with another binding in the importing file. The diagnostic is reported on the wildcard declaration. A re-exporting `pub` form makes the imported binding part of the current module's public surface. Cross-module lookups against the current module find the re-exported name. ## Cross-File Identifier Uniqueness Code generation targets may impose project-wide identifier uniqueness as a separate requirement. The language itself only enforces per-file uniqueness in [names.md](names.md); target-language documents specify when the wider constraint applies and how it is reported. # Names Source: https://masterbelt.dev/spec-src/language/names.md # Names This document defines the currently implemented Masterbelt name binding, scope, visibility, and reference resolution behavior. The name model is intentionally minimal at this stage. 
Future namespace, module, import, and reference additions must extend this document before or together with implementation changes. ## Scopes A scope is a region that can contain bindings. A source file creates one file scope. At this stage, file scopes do not have parent scopes. ## Binding Spaces A binding space is a category of names that are checked for uniqueness together within a scope. The implemented binding spaces are the value binding space and the type binding space. A value binding space contains names that refer to runtime values such as constants and, in the future, functions. A type binding space contains names that refer to types such as type declarations. Built-in type names are not bindings; they are language-level reservations that any type expression position may reference. A name may be bound in the value binding space and the type binding space of the same scope without conflict. ## Bindings A binding associates a name with a declaration in a binding space and scope. The implemented binding forms are const item bindings, type declaration bindings, and import bindings. Each const item introduces one binding in the value binding space of the containing file scope. Each type declaration introduces one binding in the type binding space of the containing file scope. Each named entry of a `use` or re-export declaration introduces one binding in both the value and the type binding space of the containing file scope under the entry's local name. The binding records the originating module source string and the foreign symbol name so later phases can resolve the reference once the foreign module is available. Whether the symbol exists in the value space, the type space, or both is determined at cross-module resolution; the binding's presence in both spaces lets a reference reach it from either position without re-parsing the import. A re-exporting declaration adds the public flag to each introduced import binding. 
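The two binding spaces can be pictured as two independent name tables per file scope. The Go sketch below is an assumption-laden model for illustration only: `FileScope`, `Bind`, and `BindImport` are hypothetical names, and the error message is not a Masterbelt diagnostic.

```go
package main

import "fmt"

// Space identifies one of the two binding spaces named in this document.
type Space int

const (
	ValueSpace Space = iota
	TypeSpace
)

// FileScope is a hypothetical model of one file scope: one name table per space.
type FileScope struct {
	names map[Space]map[string]bool
}

// NewFileScope returns a scope with empty value and type spaces.
func NewFileScope() *FileScope {
	return &FileScope{names: map[Space]map[string]bool{
		ValueSpace: {},
		TypeSpace:  {},
	}}
}

// Bind records name in the given space. It reports a conflict only when the
// same space of this scope already holds that name.
func (s *FileScope) Bind(space Space, name string) error {
	if s.names[space][name] {
		return fmt.Errorf("duplicate binding %q", name)
	}
	s.names[space][name] = true
	return nil
}

// BindImport models a named `use` entry, which introduces its local name
// in both the value space and the type space.
func (s *FileScope) BindImport(localName string) error {
	if err := s.Bind(ValueSpace, localName); err != nil {
		return err
	}
	return s.Bind(TypeSpace, localName)
}
```

In this model, binding `ID` once as a value and once as a type succeeds, while a second type binding named `ID` fails. A name introduced through `BindImport` occupies both spaces, so a later const item with the same name conflicts.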
A `use *` or re-export `*` declaration does not introduce per-name bindings; it records a wildcard request that later phases expand once the foreign module is loaded. Duplicate-name conflicts caused by wildcard expansion are detected at the cross-module resolution phase, not at file resolution. ```mst const A = 1 const ( B = 2 C = 3 ) type ID = int use { D, E as F } from "./other.mst" pub * from "./shared.mst" ``` This source file introduces the value bindings `A`, `B`, `C`, `D`, and `F` and the type bindings `ID`, `D`, and `F` in the file scope, plus a wildcard re-export from `./shared.mst` whose contents are expanded by the cross-module loader. ## Duplicate Names A scope cannot contain more than one binding with the same name in the same binding space. Duplicate bindings are name resolution errors. The diagnostic is reported on the later binding name. When duplicate bindings occur in a scope, the first binding is canonical and later duplicates are not exposed for reference resolution. ## Visibility A binding introduced by a `pub` declaration is public. Without `pub`, the binding is private to its file. For grouped const declarations, the outer visibility applies to every const item binding in the group. A type declaration may be `pub`. Public type bindings follow the same visibility model as public value bindings. A re-export declaration (`pub { ... } from "..."` or `pub * from "..."`) is public. The imported names become public bindings of the current file scope, available for cross-module lookup under their local names. A plain `use` declaration is not public. The imported names are private to the current file even when the foreign symbols are themselves public. ## References An identifier reference in expression position resolves to a binding in the value binding space of the enclosing scope. A reference must resolve to a binding that appears earlier in source order than the reference itself. 
Forward references, including a binding referencing itself, are name resolution errors. This restriction keeps lowering single-pass at the current stage and may be relaxed once topological dependency analysis is implemented.

A reference that does not resolve to any value binding is a name resolution error.

Type-position references to type declarations are resolved separately in the type binding space. The value and type binding spaces are independent.

# Built-Ins

Source: https://masterbelt.dev/spec-src/language/builtins.md

# Built-Ins

This document defines the currently implemented Masterbelt built-in types, values, functions, and operators.

The built-in set is intentionally minimal at this stage. Future built-in additions must extend this document before or together with implementation changes.

## Built-In Types

The implemented built-in primitive types are:

- `null`: the type of the null literal.
- `bool`: the type of boolean literals.
- `string`: the type of string literals.

The implemented built-in numeric types are:

| Masterbelt | Signedness | Width                                  |
|------------|------------|----------------------------------------|
| `int`      | signed     | native (host language's natural width) |
| `int8`     | signed     | 8 bits                                 |
| `int16`    | signed     | 16 bits                                |
| `int32`    | signed     | 32 bits                                |
| `int64`    | signed     | 64 bits                                |
| `uint`     | unsigned   | native (host language's natural width) |
| `uint8`    | unsigned   | 8 bits                                 |
| `uint16`   | unsigned   | 16 bits                                |
| `uint32`   | unsigned   | 32 bits                                |
| `uint64`   | unsigned   | 64 bits                                |

The default type of an integer literal that has no type annotation, pushdown target, or cast prefix is `int`.

Built-in type names are written in lowercase.

### Numeric Type Classes

The language distinguishes three internal numeric type classes. These classes are not types in their own right: they cannot appear in a type expression position and they do not name a binding. They exist to let the checker describe operations that require a numeric operand or target.
- `numeric`: the union of every built-in numeric type listed above.
- `snumeric`: the union of every signed built-in numeric type (`int`, `int8`, `int16`, `int32`, `int64`).
- `unumeric`: the union of every unsigned built-in numeric type (`uint`, `uint8`, `uint16`, `uint32`, `uint64`).

An alias whose target resolves to a numeric type belongs to the same class as the resolved target.

The implemented built-in generic type constructors are:

- `list<T>`: a homogeneous sequence of values of element type `T`.
- `map<K, V>`: an associative mapping from key type `K` to value type `V`.

Generic type constructors take their type arguments between `<` and `>` separated by `,`. Each type argument is itself a type expression. The number of type arguments must match the constructor:

- `list` takes exactly one type argument.
- `map` takes exactly two type arguments.

Built-in type names and generic type constructor names are available only in type expression positions. They do not introduce value bindings.

## Built-In Product Types

`Error` is the built-in product type used to represent a recoverable failure produced by a `failable` function.

```mst
type Error = { message: string }
```

The `Error` name is reserved at the type binding space: declaring a type, enum, or master named `Error` is a type checking error reported through the standard reserved-name diagnostic. The only surface use of `Error` is as the argument to `fail` (`fail Error { message: "..." }`); it does not appear in function return types or in user-visible unions. See [syntax.md](syntax.md#failable-handling) and [semantics.md](semantics.md#failable-handling) for the integration with `fail`.

## Built-In Values

The implemented built-in values are the literal words `null`, `true`, and `false`.

## Built-In Functions and Operators

The implemented built-in free functions are:

- `fn range(start: int, end: int): list<int>`: yields the integers `start, start + 1, ..., end - 1` in ascending order. When `start >= end` the result is empty.
The function is total: every input produces a value of `list<int>` without effects.

`range` is exposed as a value binding in the same module-scope name space as user-declared functions. The name `range` cannot be redeclared (a user declaration of `const range = ...` or `fn range(...)` is rejected through the [reserved built-in names](#reserved-built-in-names) rule extended to value-position built-ins).

Although `range(start, end)` returns a `list<int>` at the type level, every supported codegen target recognises a `for i in range(a, b) { ... }` source form and emits a counted loop instead of materializing the list (see [codegen/golang.md](../codegen/golang.md), [codegen/typescript.md](../codegen/typescript.md), and [codegen/csharp.md](../codegen/csharp.md)). The optimization is a target-emission concern; the language semantics treat `range` as returning a regular list.

### Operator Methods

Every unary or binary operator in [syntax.md](syntax.md) is defined as a method on the operand (for unary) or the left operand (for binary). The surface form `a OP b` is equivalent to the method call `a.METHOD(b)`; the unary form `OP a` is equivalent to `a.METHOD()`.

The complete operator-to-method mapping is:

| Surface  | Method   | Arity  |
|----------|----------|--------|
| `!a`     | `not`    | unary  |
| `+a`     | `plus`   | unary  |
| `-a`     | `minus`  | unary  |
| `a + b`  | `add`    | binary |
| `a - b`  | `sub`    | binary |
| `a * b`  | `mul`    | binary |
| `a / b`  | `div`    | binary |
| `a % b`  | `mod`    | binary |
| `a == b` | `eql`    | binary |
| `a != b` | `neq`    | binary |
| `a < b`  | `lt`     | binary |
| `a <= b` | `lteq`   | binary |
| `a > b`  | `gt`     | binary |
| `a >= b` | `gteq`   | binary |
| `a & b`  | `and`    | binary |
| `a \| b` | `or`     | binary |
| `a ^ b`  | `xor`    | binary |
| `a << b` | `lshift` | binary |
| `a >> b` | `rshift` | binary |

The method must exist on the operand's type. Type checking of an operator expression follows the same overload resolution rules used for any other method call (see [types.md](types.md)).
A user product type may declare any subset of these methods to opt that operator in; declaring a method whose name is not in this table is permitted but has no operator interpretation. ### Built-In Fields on Primitive and Generic Types A few built-in types expose `readonly` fields that record a side-effect-free property of the value. The fields are not method calls — they participate in the regular `value.field` member-access form defined in [syntax.md](syntax.md) — so the surface form has no parentheses. - `string.length: int` — the number of Unicode codepoints in the string (rune count, not byte count). The value matches Go's `utf8.RuneCountInString`; each codepoint counts as one regardless of its UTF-8 byte size. - `list.size: int` — the number of elements in the list. - `map.size: int` — the number of entries in the map. The fields are not visible as free identifiers and cannot be redefined; they exist on every value of the corresponding type. An alias whose target resolves to one of these types inherits the same fields. ### Built-In Operator Methods on Primitive Types For every built-in primitive type listed earlier in this document, the language exposes a fixed set of operator methods as if the type carried them in its method list. These methods are not visible as free identifiers and cannot be redefined; they exist only to make operator expressions on primitive operands type-check. 
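As a concrete check of the `string.length` rule stated earlier (codepoints, not bytes), the Go sketch below calls `utf8.RuneCountInString`, the function this document names as the reference behavior. `StringLength` is a hypothetical wrapper introduced only for this example.

```go
package main

import "unicode/utf8"

// StringLength mirrors the documented `string.length` rule: it counts
// Unicode codepoints, so each codepoint counts as one regardless of how
// many UTF-8 bytes encode it. The wrapper name is hypothetical.
func StringLength(s string) int {
	return utf8.RuneCountInString(s)
}
```

For `"h\u00e9llo"` the sketch returns 5 while Go's byte-oriented `len` returns 6, and for `"\u65e5\u672c\u8a9e"` it returns 3 while `len` returns 9. Combining sequences still count one per codepoint, since the rule is defined on codepoints rather than on user-perceived characters.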
The intrinsic operator methods provided for each primitive type are:

- Every numeric type `N` (`int`, `int8`, ..., `uint64`):
  - `fn add(other: N): N`, `fn sub(other: N): N`, `fn mul(other: N): N`, `fn div(other: N): N`, `fn mod(other: N): N`
  - `fn eql(other: N): bool`, `fn neq(other: N): bool`, `fn lt(other: N): bool`, `fn lteq(other: N): bool`, `fn gt(other: N): bool`, `fn gteq(other: N): bool`
  - `fn and(other: N): N`, `fn or(other: N): N`, `fn xor(other: N): N`
  - `fn lshift(other: N): N`, `fn rshift(other: N): N`
  - `fn plus(): N` (unary `+`)
  - `fn minus(): N` (unary `-`; provided on signed numeric types only)
- `bool`:
  - `fn eql(other: bool): bool`, `fn neq(other: bool): bool`
  - `fn and(other: bool): bool`, `fn or(other: bool): bool`, `fn xor(other: bool): bool`
  - `fn not(): bool` (unary `!`)
- `string`:
  - `fn eql(other: string): bool`, `fn neq(other: string): bool`
  - `fn lt(other: string): bool`, `fn lteq(other: string): bool`, `fn gt(other: string): bool`, `fn gteq(other: string): bool`
  - `fn add(other: string): string` (concatenation)
- `null`:
  - `fn eql(other: null): bool`, `fn neq(other: null): bool`

The built-in generic constructors expose operator methods as well:

- `list<T>`:
  - `fn add(other: list<T>): list<T>`: element-wise concatenation; the result lists the receiver's elements followed by the argument's elements.
- `map<K, V>`:
  - `fn add(other: map<K, V>): map<K, V>`: last-wins merge; the result contains every entry of the receiver and every entry of the argument, with a key present in both taking the argument's value.

An alias whose target resolves to a primitive type or a built-in generic application inherits the corresponding intrinsic operator method set.

Operator method calls on primitive operands are emitted by every supported target as that target's native operator.
Operator method calls on built-in generic operands (list, map) are emitted as a call into the target's runtime helper file; see each target's specification for the surface translation and runtime file emission. # Types Source: https://masterbelt.dev/spec-src/language/types.md # Types This document defines the currently implemented Masterbelt type checking, inference, assignability, and overload behavior. The type system is intentionally minimal at this stage. Future type additions must extend this document before or together with implementation changes. Built-in types are defined in [builtins.md](builtins.md). ## Type Expressions A type expression denotes a type. Type expressions appear in const item type annotations and in type declarations. ```ebnf type_expression = union_type | primary_type ; union_type = primary_type "|" primary_type { "|" primary_type } ; primary_type = named_type | generic_type ; named_type = identifier | reserved_type_identifier ; generic_type = identifier "<" type_expression { "," type_expression } ">" ; ``` A named type references a built-in type or a user-declared type declaration by name. A generic type applies type arguments to a built-in generic type constructor. User-declared generic type constructors are not implemented at this stage. A union type denotes a value that has any one of its member types. Union types are flat: nested unions are flattened. Union members are deduplicated by type identity, and duplicate members are not an error. A union type with fewer than two distinct members is invalid syntax. A type expression with only one type denotation must be written without `|`. ## Type Identity Two types are the same type when they are structurally identical. - Two named types are identical when they have the same name. - Two generic types are identical when they have the same constructor and identical type arguments in order. - Two union types are identical when they have the same set of member types. 
- Two function types are identical when they have the same parameter types in order, the same variadic flag on the last parameter, the same return type, and the same effect set. - Two enum types are identical when they originate from the same enum declaration. Enum types are nominal: an enum type is not identical to any other enum type even when their storage types and variant lists coincide. Declared-type resolution applies before identity is determined. A declared type is identical to its resolved target type. An enum type is not unwrapped by declared-type resolution; it is its own type. Parameter names are not part of a function type's identity: `fn(a: int): int` and `fn(b: int): int` denote the same type. ## Literal Types Literal expressions have the following types: - `null` has type `null`. - `true` and `false` have type `bool`. - String literals have type `string`. Integer literals are typed contextually: - With a numeric target type pushed down from an annotation or another contextual position (such as a list element type), the literal adopts that target's type. Aliases of numeric types are preserved at the use site so a literal annotated with `type ID = int32` types as `ID`. - Without a numeric target, the literal defaults to `int`. A literal whose magnitude does not fit the resolved numeric type's value range is rejected at lowering time with `masterbelt.lowering.integer_out_of_range`. ## Cast Expressions A cast expression `T(value)` converts the inner value to the named target type. The target type T must resolve to a numeric type (a built-in numeric or an alias whose resolved body is numeric); a non-numeric target is a type checking error reported as `masterbelt.checker.cast_non_numeric_target`. The inner value must itself resolve to a numeric type; a non-numeric value is reported as `masterbelt.checker.cast_non_numeric_value`. 
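
The two cast checks above can be sketched as follows. This is a minimal Python model, not the Masterbelt checker: the `NUMERIC` set is only an illustrative subset of the built-in numeric types, the alias table and the names `resolve` and `check_cast` are hypothetical, and reporting the target diagnostic first when both checks fail is an assumption.

```python
# Illustrative subset only; the full numeric list is defined in builtins.md.
NUMERIC = {"int", "int8", "int32", "int64", "uint64"}


def resolve(name, aliases):
    """Chase alias declarations to the final target (assumes no cycles)."""
    while name in aliases:
        name = aliases[name]
    return name


def check_cast(target, value_type, aliases):
    """Return the diagnostic code for `target(value)`, or None when it checks."""
    if resolve(target, aliases) not in NUMERIC:
        return "masterbelt.checker.cast_non_numeric_target"
    if resolve(value_type, aliases) not in NUMERIC:
        return "masterbelt.checker.cast_non_numeric_value"
    return None


# Hypothetical declarations: type Level = int64, type Label = string
aliases = {"Level": "int64", "Label": "string"}
print(check_cast("Level", "int", aliases))     # None
print(check_cast("Label", "int", aliases))     # masterbelt.checker.cast_non_numeric_target
print(check_cast("int32", "string", aliases))  # masterbelt.checker.cast_non_numeric_value
```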
The cast expression's result type is the cast target with its surface form preserved: `int32(42)` types as `int32` and `Level(10)` where `type Level = int64` types as `Level`. The inner expression is type-checked against the cast's target so integer literals inside the cast adopt the target's numeric kind through the normal pushdown mechanism. Cast expressions cannot widen the value beyond the target's range: the same `integer_out_of_range` rule that applies to annotated literals applies to a literal inside a cast.

## Collection Literal Types

Collection literals are type-checked contextually. When a target type is known from a type annotation, the checker pushes the target down into the literal and checks each element. When no target is known, the checker infers a type from the elements.

### List Literals

For a list literal `[e1, e2, ..., eN]`:

- With target type `list<U>`: each element type must be assignable to `U`. The result type is `list<U>`.
- Without a target type: the result type is `list<E>`, where `E` is `Union(type(e1), ..., type(eN))`.

### Map Literals

For a map literal `[k1: v1, k2: v2, ..., kN: vN]`:

- With target type `map<K, V>`: each `ki` must be assignable to `K` and each `vi` must be assignable to `V`. `K` must be a [Comparable](#comparable-types) type. The result type is `map<K, V>`.
- Without a target type: the result key type is `Union(type(k1), ..., type(kN))` and the result value type is `Union(type(v1), ..., type(vN))`. The key union must be Comparable. The result type is `map` applied to that key type and value type.

### Empty Collection Literals

A literal `[]` is an empty collection literal.

- With target type `list<U>`: the result type is `list<U>`.
- With target type `map<K, V>`: `K` must be Comparable. The result type is `map<K, V>`.
- With any other target type, or without a target type: the empty collection literal is a type checking error.

### Pushdown and Generic Variance

Generic types remain invariant for general assignability.
Pushdown into a literal is a contextual rule on the literal expression itself: when an element of `[1, 2]` is checked against `int | string` because the surrounding annotation is `list<int | string>`, each element succeeds by assignability to `int | string`. The resulting type takes the annotation, not the narrower element union.

Pushdown applies only when the target type is the same generic shape as the literal (`list<U>` for a list literal, `map<K, V>` for a map literal). When the target type is a union or otherwise does not directly match the literal shape, the checker falls back to inferring the literal type independently and then checking assignability through union membership.

## Comparable Types

A type is Comparable when it is one of:

- The primitive types `null`, `bool`, or `string`.
- Any built-in numeric type listed in [builtins.md](builtins.md) (signed or unsigned, native or fixed width).
- A union type whose every member is Comparable.
- A type declaration that resolves to a Comparable type.

Generic types are not Comparable at this stage. Map key types must be Comparable.

## Type Annotations

A type annotation names the expected type of the annotated declaration item. The annotation form is a type expression.

An unknown named type is a type checking error. A built-in primitive type with type arguments is a type checking error. A generic type with the wrong number of type arguments is a type checking error. A non-generic named type with type arguments is a type checking error.

## Reserved Built-In Names

Built-in type names cannot be used as type declaration names. A type declaration whose name matches a built-in primitive or generic constructor is a type checking error. After such an error, references to the name continue to resolve to the built-in.

## Source Order Preservation

The type checker stores union member types in a canonical order.
Tooling that renders type expressions back to source, such as the formatter, preserves the order written in source instead of the canonical order.

## Type Declarations

A type declaration introduces a new type binding for an existing type expression.

```mst
type ID = int
type Number = int | string
type Names = list<string>
```

A type declaration name is a single identifier. Type declaration names live in the type binding space defined in [names.md](names.md).

A type declaration's right-hand side may reference other type declarations. The checker resolves declared names by chasing through the chain to the final non-declaration target.

A reference to a declared type retains the declaration's surface name at the reference site. Assignability and equality unwrap the surface wrapper so a value typed by the declared name is interchangeable with a value typed by its resolved target. Downstream consumers (code generation, indexes) use the surface name to render output that prefers the declared form over its resolved target.

A type declaration that directly or indirectly references itself is a type checking error.

A type declaration may be declared `pub`. Public type declarations follow the same visibility rules as `pub` const declarations.

When a union type lists declared-type members, the canonical union form strips the surface wrappers and keeps only the underlying targets. Surface names survive only outside union construction. The union `int | ID` where `type ID = int` canonicalizes to the single type `int`.

### Generic Type Declarations

A type declaration may declare a list of type parameters using the `<T>` syntax described in [syntax.md](syntax.md). Type parameters belong to the type declaration itself rather than to any specific body shape; the declared body may reference them through any nested type expression: product, union, generic, or any combination thereof.
```mst
type Box<T> = { value: T }
type Pair<A, B> = { first: A, second: B }
type Maybe<T> = T | null
type Items<T> = list<T>
```

A type parameter introduces an identifier in the type binding space that is visible only inside the body of the declaration that introduced it. Inside the body, a parameter name resolves to a distinct type variable; the variable is not a primitive, not equal to any user-declared type with the same surface name, and not assignable to any other type. Outside the declaration, the parameter is not in scope.

A generic declaration with N parameters must be applied with exactly N type arguments at every use site, written `Name<A1, ..., AN>` with the same syntax used by built-in generic constructors such as `list<T>`. A bare reference to a generic declaration without arguments is a type checking error. A non-generic declaration applied with type arguments is a type checking error.

Applying a generic declaration substitutes each parameter with the corresponding argument throughout the declared body and yields the substituted target type. Equality and assignability of the resulting type follow the structural rules above on the substituted form; the declaration's surface name and argument list are preserved at the use site so downstream consumers (code generation, indexes) can render the applied surface form, such as `Box<int>`, instead of the substituted body.

A generic declaration may reference itself only by way of substitution at a use site. A directly or indirectly self-referential body (without an intervening type argument applied) is a type checking error in the same way as for non-generic declarations.

## Function Types

A function type denotes a callable signature. Its surface form is described in [syntax.md](syntax.md); this section describes its semantics under the type system.

A function type carries:

- An ordered list of parameter types.
- A boolean flag on the last parameter indicating whether it is variadic.
- A return type.
- A set of effect modifiers.
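
As a rough illustration of how these four components determine identity, the sketch below models a function type as an immutable Python value. The `FnType` class and `fn_type` helper are hypothetical names introduced for this example; type names are plain strings, and parameter names are dropped on construction because they do not take part in identity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FnType:
    params: tuple        # parameter types in order; names are deliberately absent
    variadic: bool       # whether the last parameter is variadic
    ret: str             # return type
    effects: frozenset   # set semantics: order and duplicates do not matter


def fn_type(params, ret, variadic=False, effects=()):
    """Build a FnType from (name, type) pairs as they would appear in source."""
    return FnType(tuple(t for _, t in params), variadic, ret, frozenset(effects))


a = fn_type([("a", "int")], "int")
b = fn_type([("b", "int")], "int")
c = fn_type([("a", "int")], "int", effects=["failable", "asyncable"])
d = fn_type([("a", "int")], "int", effects=["asyncable", "failable", "failable"])
e = fn_type([("a", "int")], "int", variadic=True)

print(a == b)  # True: parameter names are ignored
print(c == d)  # True: effect order and duplicates are ignored
print(a == c)  # False: effect sets differ
print(a == e)  # False: variadic flags differ
```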
Parameter names appear in source for documentation and tooling purposes only. They participate in scope and uniqueness checks at the declaration site (see below) but do not affect type identity or assignability. ### Parameter Name Uniqueness Within a single function type, every parameter name must be distinct. Declaring two parameters with the same name is a type checking error reported as `masterbelt.checker.duplicate_function_parameter`. ### Variadic Position A function type may carry at most one variadic parameter, and it must be the last parameter in the list. A `*`-prefixed parameter in any earlier position is a type checking error reported as `masterbelt.checker.variadic_not_last`. The element type of a variadic parameter is the type expression written after the colon; the function accepts zero or more arguments of that element type at that position. ### Effects The effect set on a function type is canonical: source order is not significant, and duplicates collapse. Two function types whose effect sets contain the same members are identical for the purposes of [Type Identity](#type-identity). The semantics of each effect are defined in [codegen/model.md](../codegen/model.md) and their target mappings are defined in each target's specification. ### Methods on Product Types A product type may carry an ordered list of methods alongside its fields. A method is a named callable bound to the product type; the method's signature is a function type (parameters, return type, effects). Methods do not contribute to the product type's structural identity: two product types whose fields agree but whose method sets differ are still the same type for the purposes of [Type Identity](#type-identity) and [Assignability](#assignability). A product type may declare multiple methods with the same name; the methods must differ in their parameter shapes. 
Two methods with the same name and identical parameter signatures are a type checking error reported as `masterbelt.checker.method_duplicate_signature`. A method is accessed as an instance member through `value.method` and invoked as `value.method(args)`. Overload resolution selects an applicable method: 1. Every applicable overload's parameter types must accept the call's argument types under [Assignability](#assignability). 2. An exact-type match (every argument's type equals the corresponding parameter type) is preferred over an assignable-but-not-exact match. 3. When no overload applies, the call is reported as `masterbelt.checker.overload_no_match`. 4. When multiple overloads apply and no exact match disambiguates, the call is reported as `masterbelt.checker.overload_ambiguous`. The result type of a method call is the chosen overload's return type. ### Implicit Receiver `self` Inside a method body the identifier `self` is in scope and refers to the implicit receiver value. Its checked type is the owning product type, so `self.field` resolves to one of the type's fields and `self.method(args)` invokes a method on the same type. Each target language renders the receiver according to its idiom: Go and TypeScript expose the receiver under the name `self`; C# rewrites `self` to its native `this` keyword. The identifier `self` is reserved at every declaration position. It is allowed in value-position identifier references and as the target of a member access expression solely so the implicit receiver remains writable inside a method body. A use of `self` outside a method body is a type checking error reported through the usual unknown-name diagnostic because no scope binds the name. ### Functions as Values A function declaration introduces a value of its function type into the value binding space. 
Calls against a function-typed value follow the same parameter-by-parameter assignability rule used for method calls, with no overload set: a function declaration produces exactly one signature.

A function literal `fn(params)[: R] body` produces a function-typed value with no name. Parameter and return types may be inferred from a surrounding contextual type (the const annotation, an argument position, and so on). Without a contextual type the parameter and return types must be written explicitly; otherwise the checker reports `masterbelt.checker.function_parameter_missing_type` or `masterbelt.checker.function_missing_return_type`.

### Generic Function Types

A function type may appear on the right-hand side of a [generic type declaration](#generic-type-declarations). Type parameters introduced by the declaration are in scope within every part of the function type's signature: parameter types, the variadic element type, and the return type may all reference them. Applying the declaration substitutes each parameter throughout the function type's signature.

```mst
type Mapper<T, U> = fn(value: T): U
```

## Enum Types

An enum type denotes a finite, ordered set of named variants. Each variant is backed by an integer value fixed at compile time. The surface form of an enum declaration is described in [syntax.md](syntax.md); this section describes its semantics under the type system.

An enum type carries:

- A declaration-site name.
- A storage type that is a built-in numeric type.
- An ordered list of variants. Each variant has a name and an integer value.

### Storage Type

The storage type is the type expression that follows the enum declaration's `:` clause. The type expression is resolved through declared-type resolution; the resolved target must be a built-in numeric type. A non-numeric resolved target is a type checking error reported as `masterbelt.checker.enum_non_numeric_storage`. When the storage clause is omitted, the storage type is `int8`.
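
The storage-type rule can be pictured with a short sketch. It is a Python model under stated assumptions, not the checker itself: `NUMERIC` is an illustrative subset of the built-in numeric types, the alias table stands in for declared-type resolution, and `enum_storage_type` is a hypothetical helper name.

```python
# Illustrative subset only; the full numeric list is defined in builtins.md.
NUMERIC = {"int", "int8", "int32", "int64", "uint64"}


def enum_storage_type(clause, aliases):
    """Return (storage_type, diagnostic) for an enum's optional `:` clause."""
    if clause is None:
        return "int8", None               # omitted clause defaults to int8
    resolved = clause
    while resolved in aliases:            # declared-type resolution (assumes no cycles)
        resolved = aliases[resolved]
    if resolved not in NUMERIC:
        return None, "masterbelt.checker.enum_non_numeric_storage"
    return resolved, None


# Hypothetical declaration: type Level = int64
aliases = {"Level": "int64"}
print(enum_storage_type(None, aliases))      # ('int8', None)
print(enum_storage_type("Level", aliases))   # ('int64', None)
print(enum_storage_type("string", aliases))  # (None, 'masterbelt.checker.enum_non_numeric_storage')
```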
### Variant Values Variant values are assigned in source declaration order by the rule in [syntax.md](syntax.md). Every assigned value must fit the storage type's value range. A value outside the range is a lowering-time error reported with the standard integer-out-of-range diagnostic for the storage type. Variant values are recorded as part of the enum type and exposed to downstream consumers (code generation, indexes). Two enum types with the same variant names but different values are distinct types under nominal identity. ### Type Identity Enum types are nominal: an enum type is identical only to itself (the type produced by the same enum declaration). It is not identical to its storage type and not identical to any other enum type, including ones with structurally identical variant lists. The nominal identity rule lifts to [Assignability](#assignability) below. ### Assignability An enum type is not assignable to any other type, including its storage type. To produce a value of the storage type from a variant, a [cast expression](#cast-expressions) `Storage(EnumName.Variant)` is required. Conversely, a value of the storage type is not assignable to an enum type. A program that needs to construct an enum value from an integer value writes the variant by name; integer-to-enum conversion is not provided. ## Member Access Every type exposes a (possibly empty) set of named members. A member belongs to one of two kinds: - An **instance member** is accessed through a value of the owning type: `value.member`. Instance members have a member type that is the type of the result. - A **static member** is accessed through the type itself: `Type.member`. Static members similarly carry the type of the resulting value. The members exposed by each type at this stage are: - A product type exposes one instance member per declared field. The member's name is the field name; the member's type is the field type. - An enum type exposes one static member per declared variant. 
The member's name is the variant name; the member's type is the enum type itself (a variant value is itself a value of the enum). - Every other built-in or user-declared type exposes no members at this stage. A type declaration's resolved body provides the member set: an [alias](#type-declarations) chases through to its target for instance-member lookup, so a value typed by an alias of a product type accesses fields through the alias surface name. Enum types are nominal and are not unwrapped; their static members are accessed directly through the enum's declaration name. ### Member Access Expression Resolution The expression `Target.Member` is resolved by: 1. Looking up `Target` first in the value binding space. If found, the access is an **instance member access** against the binding's checked type. 2. Otherwise, looking up `Target` in the type binding space. If found, the access is a **static member access** against the named type. 3. If `Target` resolves in neither space, the reference is reported as an unknown name. The checker then looks up `Member` on the resolved type. A member name with no matching declaration is reported as `masterbelt.checker.unknown_member`. Accessing an instance member through a type or a static member through a value is also reported as `masterbelt.checker.unknown_member`: a type's instance members do not exist as static members and vice versa. The result type of a member access expression is the member's declared type. ## Identifier Reference Types An identifier reference expression has the checked type of its resolved value binding. References are checked in source order. A reference whose binding has not yet been checked is a type checking error; this corresponds to the forward reference restriction defined in [names.md](names.md). ## Const Type Checking A const item initializer expression determines the initializer type. If a const item has no type annotation, the const item type is inferred from the initializer type. 
If a const item has a type annotation, the initializer type must be assignable to the annotated type.

```mst
const A = true              // A has type bool
const B: bool = true        // valid
const C: string = 1         // type error
const D: bool | int = 1     // valid; int is assignable to bool | int

type ID = int
const E: ID = 1             // valid; ID resolves to int
```

Grouped const declarations type check each item independently.

## Failable Handling

A function declared with the `failable` effect produces values of its declared success type `R` at the type level. `Error` is not visible at the type level of a `failable` call site; calls to a `failable` function are typed `R`, identical to calls to a non-failable function with the same success type. See [semantics.md](semantics.md#failable-handling) for the user-visible meaning.

`Error` is the built-in product type defined in [builtins.md](builtins.md#built-in-product-types). It appears in the type system only as a permitted argument to the `fail` statement.

### `fail` Statement

`fail` is valid only inside a function whose declared effect set contains `failable`. Outside, the checker reports `masterbelt.checker.fail_outside_failable`.

The argument must be assignable to one of:

- `string`: the runtime constructs an `Error` whose `message` is that string.
- `Error`: the value is used directly.

Other argument types are reported as `masterbelt.checker.fail_unsupported_argument`. The argument is type-checked once with the union of permitted types as the contextual target.

### Effect Inheritance

Every effect (`failable`, `asyncable`, `cancellable`) is inherited silently at call sites. A call to a function whose effect set is non-empty is always valid regardless of the surrounding function's declared effect set. The type checker emits no diagnostic for missing effect declarations; each codegen target observes the call graph and lifts the obligation into the rendered signature.
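
One way to picture how a target could lift inherited effects from the call graph is a fixed-point pass over callers and callees. The specification does not prescribe an algorithm, so the sketch below is only an assumption about how this might be done; the function names in the example graph and the helper `effective_effects` are hypothetical.

```python
def effective_effects(declared, calls):
    """Union each function's declared effects with those of everything it calls."""
    effects = {name: set(declared.get(name, ())) for name in calls}
    changed = True
    while changed:  # iterate to a fixed point; recursion in the graph still terminates
        changed = False
        for caller, callees in calls.items():
            for callee in callees:
                missing = effects[callee] - effects[caller]
                if missing:
                    effects[caller] |= missing
                    changed = True
    return effects


# Hypothetical call graph: foo calls load, load calls fetch.
calls = {"foo": ["load"], "load": ["fetch"], "fetch": [], "pure": []}
declared = {"fetch": {"asyncable"}, "load": {"failable"}}

result = effective_effects(declared, calls)
print(sorted(result["foo"]))   # ['asyncable', 'failable']
print(sorted(result["pure"]))  # []
```

Here `foo` declares nothing, yet ends up with both effects because they flow up from `fetch` and `load`.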
The surface program may still declare effect modifiers on a function; they participate in type identity and in code generation the same way an inherited effect does. The declaration is a hint, not an obligation: a planner who writes `pub fn foo()` and calls an `asyncable` function inside gets the same rendered target code as if `pub asyncable fn foo()` had been written.

### Return Type

A `failable` function's body is checked against its declared return type `R`: `return value` requires `value` to be assignable to `R`, and the body type-checks as if there were no failure path. The `fail` statement is independent of `R`; it always completes the function with an `Error` and does not contribute to the return-position type check.

## For Statements

The `for` statement is type checked against the static type of its subject expression. Type checking decides the binding count and element types, validates the loop body under a fresh scope, and tracks `break`/`continue` reachability.

### Subject Type

After declared-type resolution the subject must satisfy one of:

- A `list<T>` shape: one binding, typed `T`.
- A `map<K, V>` shape: two bindings, typed `K` and `V` in source order.
- A `Relation<M>` shape: one binding, typed `M`'s record type. Iterating a relation yields its post-filter records in plan order, equivalent to iterating the relation's `toList()`. This branch is reached by the `table` / `self` binding of a validation `all` rule (see [Scope Bodies](#scope-bodies)).

A subject whose resolved type is none of these shapes is reported as `masterbelt.checker.for_subject_not_iterable`. The diagnostic also fires for union subjects: a value typed `list<T> | null` must be narrowed before iteration.

The standard-library helper `range(start, end)` returns a list and therefore reaches the `list<T>` branch automatically. Master-collection iteration relies on the master's `all()` method, which the checker resolves through its declared signature.
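
The subject classification above can be sketched as a small dispatch on the resolved type's shape. This is a Python model under assumptions: types are encoded as tuples such as `("list", T)`, the `("record", M)` encoding and the master name `Quest` are hypothetical, and the subject is assumed to have already gone through declared-type resolution.

```python
NOT_ITERABLE = "masterbelt.checker.for_subject_not_iterable"


def loop_binding_types(subject):
    """Return (binding_types, diagnostic) for a resolved `for` subject type."""
    kind = subject[0]
    if kind == "list":                    # ("list", T): one binding
        return [subject[1]], None
    if kind == "map":                     # ("map", K, V): two bindings
        return [subject[1], subject[2]], None
    if kind == "relation":                # ("relation", M): one record binding
        return [("record", subject[1])], None
    return None, NOT_ITERABLE             # anything else, including unions


int_t = ("named", "int")
str_t = ("named", "string")
nullable_list = ("union", ("list", int_t), ("named", "null"))

print(loop_binding_types(("list", int_t)))        # ([('named', 'int')], None)
print(loop_binding_types(("map", str_t, int_t)))  # ([('named', 'string'), ('named', 'int')], None)
print(loop_binding_types(("relation", "Quest")))  # ([('record', 'Quest')], None)
print(loop_binding_types(nullable_list))          # (None, 'masterbelt.checker.for_subject_not_iterable')
```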
### Binding Count and Type The number of declared bindings must match the subject's shape: a `list` requires exactly one, a `map` requires exactly two. Any other count is reported as `masterbelt.checker.for_binding_count_mismatch` carrying the expected and actual counts. Each binding is added to the loop's fresh scope with the matching element type. A binding written as `_` is omitted from the scope but still consumes its position for counting purposes. Reusing a binding name that shadows an enclosing scope follows the regular [local redeclaration](#match-statements) rule. ### Body Scope The loop body is checked under a fresh scope that contains the bindings introduced above. Statements inside the body share the surrounding function's effect set and return type; a `return` inside the body terminates the surrounding function as usual. ### Break and Continue `break` and `continue` are valid only inside a `for` body. Using either outside any loop is reported as `masterbelt.checker.break_outside_loop` or `masterbelt.checker.continue_outside_loop`. Nested loops are handled by stacking the loop context: each statement resolves to the innermost enclosing `for`. ## Match Statements The `match` statement is type checked against the static type of its subject expression. Type checking ensures that every arm's pattern can match a value of the subject's type, that no two arms match the same value without a guard intervening, and that the arm list covers every value of the subject's type. ### Subject Type The subject expression's checked type is computed by the standard expression type checking rule. After declared-type resolution, the subject type is treated as a (possibly singleton) union: a non-union subject type is equivalent to a one-member union for the purposes of the rules below. ### Pattern Checking Each arm's pattern is checked against the subject type. 
- A **type pattern** `T` (optionally `T as name`) is valid when `T` is identical to one or more members of the subject's union after declared-type resolution. A `T` that is unrelated to every union member is reported as `masterbelt.checker.match_pattern_unrelated`. The `as name` clause introduces a new local binding of type `T` visible inside the arm; reusing a name that shadows an outer binding is reported as `masterbelt.checker.local_redeclaration`. - An **enum pattern** `E.V` is valid when the subject type contains the enum type `E` and `V` is one of `E`'s declared variants. Otherwise the pattern is reported as `masterbelt.checker.match_pattern_unrelated`. - A **literal pattern** is valid when the literal's type is assignable to the subject type under the regular [Assignability](#assignability) rule, and the subject type is one of `null`, `bool`, `string`, or a numeric type (or a union of those). A literal pattern against any other subject type is reported as `masterbelt.checker.match_pattern_unrelated`. - A **product pattern** `T { field: pattern, ... }` (optionally `... as name`) is valid when `T` is identical to one or more members of the subject's union after declared-type resolution and resolves to a product type. Every field name listed in the pattern must be declared by that product type; an unknown field is reported as `masterbelt.checker.match_pattern_unknown_field`. Each field sub-pattern is checked against the field's declared type. A short field form `field` is shorthand for `field: field`: it introduces a binding named after the field with the field's declared type. The optional `as name` clause introduces an additional binding of the product type `T`. - A **wildcard pattern** is valid against any subject type. 
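
The relatedness checks for type patterns and enum patterns can be sketched by treating the subject as a set of member types. This is a simplified Python model, not the checker: type names are plain strings, a union is a `frozenset`, the `Color` enum is hypothetical, and bindings, literal patterns, and product patterns are left out.

```python
UNRELATED = "masterbelt.checker.match_pattern_unrelated"


def members(subject):
    """Treat a non-union subject as a one-member union."""
    return set(subject) if isinstance(subject, frozenset) else {subject}


def check_type_pattern(subject, type_name):
    """A type pattern is valid when it names a member of the subject's union."""
    return None if type_name in members(subject) else UNRELATED


def check_enum_pattern(subject, enum_name, variant, enums):
    """An enum pattern needs the enum in the subject and a declared variant."""
    if enum_name in members(subject) and variant in enums.get(enum_name, ()):
        return None
    return UNRELATED


enums = {"Color": ["Red", "Green"]}     # hypothetical enum declaration
subject = frozenset({"int", "Color"})   # subject typed int | Color

print(check_type_pattern(subject, "int"))                   # None
print(check_type_pattern("string", "bool"))                 # masterbelt.checker.match_pattern_unrelated
print(check_enum_pattern(subject, "Color", "Red", enums))   # None
print(check_enum_pattern(subject, "Color", "Blue", enums))  # masterbelt.checker.match_pattern_unrelated
```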
A `|`-separated alternative list at one arm position is valid when every alternative is valid in isolation, every alternative is the same surface pattern kind (type, enum, literal, product, or wildcard), and the set of bindings introduced by each alternative agrees in name and type. A mismatched binding set is reported as `masterbelt.checker.match_alternative_bindings_mismatch`. ### Narrowed Types A matched pattern narrows the subject type for the duration of the arm's guard and body: - A type pattern `T` narrows the subject type to `T`. - A `|`-separated list of type patterns narrows the subject type to the union of the listed types. - An enum pattern narrows the subject type to the enum type that the variant belongs to. - A product pattern narrows the subject type to its prefix type `T`. - A literal pattern narrows the subject type to the literal's static type (`null`, `bool`, `string`, or the contextual numeric type). - A wildcard pattern does not narrow the subject type. When the subject expression is a plain identifier and the pattern introduces no explicit binding for that identifier, the identifier is rebound inside the arm to its narrowed type. The original binding is restored on exit from the arm. When the subject expression is not an identifier, an `as name` (or product `{ ... } as name`) clause is required to make the narrowed value reachable; without a clause the pattern still matches, but no narrowed binding is introduced. ### Guards A guard expression is type checked against the narrowed scope of its arm. The guard expression must be assignable to `bool`. A non-bool guard is reported as `masterbelt.checker.match_guard_non_bool`. An arm with a guard does not contribute to exhaustiveness: the checker assumes a guard may evaluate to `false`, so a guarded arm cannot, on its own, prove that a subject-type member is covered. ### Reachability An arm is **unreachable** when every value its pattern could match is already covered by an earlier unguarded arm. 
An unreachable arm is reported as `masterbelt.checker.match_unreachable_arm`. Two arms whose patterns match exactly the same set of values are unreachable past the first; the second occurrence is reported. ### Exhaustiveness The set of arm patterns must cover every value of the subject's static type. The checker computes the set of subject-type members not covered by any unguarded arm and reports remaining members through `masterbelt.checker.match_non_exhaustive`. The diagnostic carries the comma-separated list of the uncovered member types or enum variants. A wildcard pattern (`_`) covers every value of the subject type and therefore makes the match exhaustive. The checker reports `masterbelt.checker.match_wildcard_loosens` at warning severity when a wildcard pattern appears, since the wildcard widens the set of accepted values beyond what the static type can guarantee. The warning is informational and does not prevent the source from being accepted. A pattern alternative list (`P1 | P2 | ...`) covers the union of the values its alternatives cover. Literal alternatives in a single arm must all share the same static literal kind (every alternative is a bool literal, or every alternative is a string literal, or every alternative is a numeric literal); mixed-kind alternatives are reported as `masterbelt.checker.match_alternative_mixed_kinds`. ## Validation Blocks A master's `validation` section is type checked against the master's record type. The implicit bindings introduced inside a rule body depend on the rule's scope; the surface form is defined in [masterdata/validation.md](../masterdata/validation.md). ### Implicit Bindings - In an `each` rule, `row` and `self` both have the master's record type (a product type). `row.Field` resolves to a record field through the normal [Member Access](#member-access) rules. - In an `all` rule, `table` and `self` both have type `Relation` for the surrounding master `M` (see [Scope Bodies](#scope-bodies)). 
A `for row in table` statement iterates the relation's post-filter records in plan order (equivalent to iterating `table.toList()`) and binds `row` to the master's record type. Because the binding is a relation, an `all` rule may also call the master's scopes and stage operators on `table` (or its alias `self`).

The bindings `row`, `table`, and `self` are immutable: they name the record or collection supplied by the validation evaluator and are not reassignable. `self` is bound to the same value as `row` (in `each`) or `table` (in `all`).

### `assert` Statement

An `assert` statement's condition must be assignable to `bool`. A non-bool condition is reported as `masterbelt.checker.assert_condition_non_bool`.

`assert` is valid only inside a validation rule body; outside, the checker reports `masterbelt.checker.assert_outside_validation`.

### `return`

`return` inside a validation block is a type error reported as `masterbelt.checker.return_in_validation`. A validation block has no return value; it reports failures through `assert`.

## Scope Bodies

A master's [scope section](../masterdata/schema.md#scope-section) declares a named relation query. Each scope is type checked against the declaring master `M`.

### `Relation<M>`

`Relation<M>` is the type of the master's query surface. It is reached at the source level only through scopes and the validation `all` binding; it is distinct from the master's record type and from `list`. The master's surface name denotes the base `Relation<M>` entrypoint.

`Relation<M>` exposes the stage methods `where`, `orderBy`, `thenBy`, `skip`, `take` and the master's user-declared scopes; each returns a fresh `Relation<M>`. It does not expose terminal operators or projection / join switches at the source level. The full operator vocabulary and the field-handle callback DSL are defined in [query.md](../masterdata/query.md#source-level-relation-queries).

### Implicit `self`

Inside a scope body `self` is the receiver `Relation<M>`.
`self` is a reserved identifier in every context (see [lexical.md](lexical.md#keywords)); declaring or assigning it is rejected by the lexical reserved-identifier rule (`masterbelt.parser.reserved_identifier`). When a scope is reached through a scope chain, `self` is the relation produced by the earlier stages. ### Return Type A scope returns `Relation` for the declaring master. A block body must `return` such a value; a missing return is reported as `masterbelt.checker.scope_missing_return`. An arrow body's expression must produce such a value. A returned expression of any other type — `list`, a nullable, a projected relation, a joined relation, or an unrelated master's relation — is reported as `masterbelt.checker.scope_return_type_mismatch`. A scope never declares an explicit return type and never carries type parameters. ### Effects A scope body is effect-free: it builds a plan and must not inherit `failable`, `cancellable`, or `asyncable`. A scope body — or a query callback inside it — that calls a user function, static, or const carrying one of those effects is reported as `masterbelt.checker.scope_forbidden_effect`. A scope is therefore never failable / cancellable / asyncable, and terminal relation operators (which carry those effects) cannot be reached from a scope body. The effect rule extends [Effect Inheritance](#effect-inheritance) by treating any inherited forbidden effect inside a scope as an error rather than propagating it. ### Calls and Cycles A scope call resolves against the declaring master's `Relation` receiver only; a call on a receiver that does not expose the scope (another master's relation, a record, or a list) is reported as `masterbelt.checker.unknown_member`. Argument count and types are checked like a regular call and a mismatch is reported through the general call diagnostics. Scope resolution does not depend on source order, so a scope may chain a scope declared later in the same master. 
A scope that references itself directly or transitively is reported as `masterbelt.checker.cyclic_scope`; recursive scopes are forbidden. ## Assignability A source type is assignable to a target type when: - The source type and the target type are identical. - The target type is a union type and the source type is assignable to one of the target's member types. Generic types are invariant in their type arguments at this stage. A `list` with one element type is therefore not assignable to a `list` with a different element type, even when the first element type is assignable to the second. Assignability between declared types is determined after declared-type resolution. # Semantics Source: https://masterbelt.dev/spec-src/language/semantics.md # Semantics This document defines the currently implemented user-visible structural meaning of Masterbelt programs after syntax is parsed. The semantic model is intentionally minimal at this stage. Future semantic additions must extend this document before or together with implementation changes. ## Source Files A source file evaluates to ordered declarations and statements. Line comments and block comments have no semantic value. They remain available to syntax tooling but do not appear in the semantic program structure. ## Analysis Analysis parses a source file, resolves names, and type checks the resolved file. Analysis returns diagnostics from each implemented phase in phase order: syntax parsing, name resolution, then type checking. At this stage, analysis may continue after an earlier phase reports diagnostics when that phase produced a partial result. Later diagnostics are still reported, but callers must treat any error diagnostic as making the source unacceptable for operations that require a trustworthy checked program. ## Documentation Comments Documentation comments attach to the declaration or statement that immediately follows them in the syntax tree. Multiple documentation comments attached to the same declaration or statement preserve source order.
Documentation comments inside grouped const declarations attach to the const item that immediately follows them. The text of a documentation comment is the source text after the leading `///`. When a documentation comment line ends with carriage-return line feed, the carriage return is not part of the documentation text. ## Visibility Modifiers A `pub` declaration is visible outside its source file. Without `pub`, a declaration is private to its file. ## Const Declarations A const declaration binds one or more immutable names to initializer expressions. The optional type annotation records the declared type for the const item. If no type annotation is written, the item type is inferred from the initializer expression. Grouped const declarations apply the outer visibility to every item in the group. Documentation comments inside the group are item documentation and preserve source order. ## Type Declarations A type declaration introduces a new name for a type expression. The declared type may carry zero or more type parameters; the body may be any nested type expression — product, union, generic, or any combination — and the declared name resolves through to that body at every use site. The declared name and the target type expression are both stored in the resolved declaration. A `pub` type declaration is visible outside its file. A type declaration does not introduce a value binding and does not evaluate to a runtime value. ## Failable Handling Masterbelt is a DSL whose primary surface users are planners, not engineers. Exception-style control flow is therefore not part of the surface language: a `failable` call from a `failable` body looks and types like an ordinary call, and the failure path is plumbed transparently by the implementation. A `failable` function may produce one of two outcomes per call: the declared success type `R`, or an `Error` value. 
The surface program never expresses the union directly: - A `failable` function's declared return type is exactly its success type `R`. Call expressions of a `failable` function are typed `R`. The `Error` outcome is not visible at the type level. - A `fail expression` statement completes the surrounding `failable` function with the Error outcome. When the expression is a `string`, the runtime constructs `Error { message: }`; when it is an `Error` value, that value is used directly. - A `return value` statement completes the function with the success outcome, carrying `value` (which must be assignable to `R`). - When a `failable` call inside a `failable` body produces an `Error`, the surrounding function completes with that same `Error` automatically. No surface syntax marks this propagation, and no surface syntax can intercept it. Every effect — `failable`, `asyncable`, `cancellable` — is inherited silently at call sites: a function that calls a callable carrying any effect behaves as if it carried the same effect, whether or not its own declaration mentions it. The surface program never has to express the obligation; each codegen target observes the call graph and lifts the obligation into the rendered signature (see [types.md](types.md#effect-inheritance)). A `match` statement does not receive an `Error` arm for a `failable` call subject, because the call's surface type is `R`. The built-in `Error` product type remains available only as the argument to `fail`. The evaluator and every codegen target observe these semantics through their native error-passing idioms (see [codegen/golang.md](../codegen/golang.md#failable-handling), [codegen/typescript.md](../codegen/typescript.md#failable-handling), and [codegen/csharp.md](../codegen/csharp.md#failable-handling)). Those idioms are an implementation concern; they must not leak into the surface program. ## For Statements A `for` statement evaluates its subject expression exactly once. 
The result is a sequence of elements decided by the subject's type: - A `list` value yields its elements in stored order. - A `map` value yields its entries in insertion order. Each entry contributes two values, a key and a value, in that order. - A `range(start, end)` call yields the integers `start, start + 1, ..., end - 1` in ascending order. When `start >= end`, no element is yielded. - A `master.toList()` call yields the master's records in import order. Records dropped by the master's filter section are not yielded. For each element, the bindings declared by the `for` clause are bound to the element's components in the order described above, then the function block is evaluated under a fresh scope. After the block completes, the next element is drawn; iteration ends when the subject is exhausted. Bindings written as `_` discard the corresponding component. The body cannot observe a `_`-bound value. A `break` statement terminates the iteration immediately and resumes after the `for` statement. A `continue` statement skips the remainder of the current iteration; the next element is drawn as usual. A `return` inside the body terminates the surrounding function the same way a top-level `return` would. Mutation of the subject collection from within the body is unspecified behavior. Targets may evaluate the iteration over a live or snapshot view; programs must not rely on either choice. ## Match Statements A `match` statement evaluates its subject expression exactly once, then selects at most one arm by walking the arms in source order: 1. The subject is tested against the arm's pattern. 2. When the pattern matches, the arm's guard expression (if any) is evaluated. 3. When the guard evaluates to `true` (or no guard is written), the arm's function block is evaluated. No subsequent arm is considered. 4. When the pattern does not match, or the guard evaluates to `false`, evaluation continues with the next arm. 
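The four arm-selection steps above can be modelled with a short Python sketch. It is an illustration only, not Masterbelt's evaluator: `Arm`, `run_match`, and the use of plain predicates as patterns and guards are hypothetical names and simplifications introduced here.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class Arm:
    """Hypothetical stand-in for a match arm; not a Masterbelt API."""
    pattern: Callable[[Any], bool]                  # True when the pattern matches
    body: Callable[[Any], Any]                      # runs only for the selected arm
    guard: Optional[Callable[[Any], bool]] = None   # optional guard expression


def run_match(subject_expr: Callable[[], Any], arms: List[Arm]) -> Any:
    """Select at most one arm by walking the arms in source order."""
    subject = subject_expr()  # the subject expression is evaluated exactly once
    for arm in arms:
        # Step 1: test the subject against the arm's pattern.
        if not arm.pattern(subject):
            continue  # step 4: pattern did not match, try the next arm
        # Step 2: the guard is evaluated only after the pattern matched.
        if arm.guard is not None and not arm.guard(subject):
            continue  # step 4: guard was false, try the next arm
        # Step 3: run this arm's body; no later arm is considered.
        return arm.body(subject)
    return None  # no arm was selected


# Example arms: a guarded integer arm, an unguarded integer arm, a catch-all.
ARMS = [
    Arm(pattern=lambda v: isinstance(v, int), guard=lambda v: v > 10,
        body=lambda v: "big int"),
    Arm(pattern=lambda v: isinstance(v, int), body=lambda v: "int"),
    Arm(pattern=lambda v: True, body=lambda v: "other"),
]
```

With these arms, a subject of `42` selects the guarded arm, `3` fails the guard and is picked up by the second arm, and a string reaches the catch-all arm.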
Pattern matching is value-based: a type pattern observes the subject's runtime type; an enum pattern, literal pattern, or product pattern compares the subject to a value or structural shape. Bindings introduced by a matched pattern (`as name`, the short product-field form, or a nested binding) take effect before the guard expression is evaluated and remain in scope for the entire arm body. A guard expression is an ordinary boolean expression; it may have side effects. The guard is evaluated at most once per arm and only when its arm's pattern matches. The set of arms is exhaustive over the subject's static type: type checking guarantees that at least one arm matches every value of the subject's type, except when a guard restricts the matched set (see [types.md](types.md#match-statements)). When every arm's guard evaluates to `false` and no later unguarded arm covers the value, evaluation falls through the `match` statement without executing any arm body. ## Validation Blocks A master's `validation` section declares named rules that run over the master's post-filter records during [`masterbelt export`](../tooling/cli.md), after import and filtering and before any artifact is written. A validation rule never drops a record; it inspects the data and emits diagnostics. The complete surface form, scoping, and severity model are defined in [masterdata/validation.md](../masterdata/validation.md). A validation rule body is a statement block that "passes" when it runs to completion. It must not contain a `return`: a validation block has no value and the checker rejects `return` inside it. The block reports failures through `assert` instead. An `all` rule binds the master's post-filter relation to `table` (and to its alias `self`). A `for row in table` statement iterates the relation's post-filter records the same way [For Statements](#for-statements) iterate a relation; because `table` is a relation, an `all` rule may also apply the master's scopes to it. 
### Assert Statements An `assert` statement evaluates its condition expression, which must be a boolean. `assert` is the validation primitive and is valid only inside a validation rule body. A failing `assert` (a condition that evaluates to `false`) records one diagnostic and evaluation continues with the next statement. Unlike `fail` or `return`, an `assert` does not abort the surrounding block, so one rule can report several failures from several `assert` statements in a single pass. ## Scope Sections A master's `scope` section declares a named, parameterisable relation query that surfaces as a method on the master's relation. A scope body builds and returns a `Relation` without scanning records; it is effect-free and runs no terminal. The surface form, the `self` receiver, the relation query DSL, scope chaining, visibility, and the `indexed` modifier are defined in [masterdata/schema.md](../masterdata/schema.md#scope-section) and [masterdata/query.md](../masterdata/query.md#source-level-relation-queries); the type rules are defined in [types.md](types.md#scope-bodies); the evaluation model is defined in [evaluation.md](evaluation.md#scope-evaluation). ## Expression Statements An expression statement evaluates its expression. At this stage, expression statements do not bind names and do not produce declarations. ## Literals Literal expressions evaluate to their literal values. ### Null Literal `null` evaluates to the null value. ### Bool Literals `true` evaluates to the boolean true value. `false` evaluates to the boolean false value. ### Integer Literals Integer literals evaluate to integer values. Digit separators are ignored when determining the value. The representable range and type assignment of integer literal values are defined by the type system. Radix prefixes determine the base: - `0b` and `0B` use base 2. - `0o` and `0O` use base 8. - `0x` and `0X` use base 16. 
- Integer literals without a radix prefix use base 10, including zero-padded decimal literals. ### String Literals String literals evaluate to string values. Escape sequences are decoded when determining the string value. ## Collection Literals A list literal evaluates to an ordered sequence of element values. A map literal evaluates to a mapping from key values to value values. When two map entries have equal keys, the entry that appears later in the source replaces the earlier one. This rule applies to both literal and computed keys. An empty collection literal evaluates to an empty list or an empty map depending on the resolved type as determined by [Collection Literal Types](types.md#collection-literal-types). # Evaluation Source: https://masterbelt.dev/spec-src/language/evaluation.md # Evaluation This document will define Masterbelt expression evaluation and compile-time evaluation behavior. ## Validation Rules A master's `validation` rules run in the evaluator at [`masterbelt export`](../tooling/cli.md) time, over the post-filter records. No validation code is generated into any target language; validators are a build-time contract over the data. The surface form, scoping, and severity model are defined in [masterdata/validation.md](../masterdata/validation.md). The validation evaluator executes each rule body as a statement block: - An `assert` whose condition evaluates to `false` produces one diagnostic (`masterbelt.validation.assert_failed`) and block evaluation continues. A single rule body may therefore produce several diagnostics, one per failed `assert`. - An `assert` whose condition evaluates to `true` produces no diagnostic. - An evaluation error inside a rule body — an unbound reference, a runtime type error, a division by zero — is a hard error attributed to the rule, surfaced through the underlying evaluator diagnostic or wrapped as `masterbelt.validation.evaluation_failed`. Unlike a failed `assert`, a hard error stops the rule. 
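The difference between a failed `assert` and a hard error can be sketched in Python. This is a simplified model under stated assumptions, not the Masterbelt evaluator: `run_rule` is a hypothetical helper, a rule body is reduced to an ordered list of `assert` conditions, records are plain dictionaries, and every hard error is wrapped as `masterbelt.validation.evaluation_failed` (the specification also allows surfacing the underlying evaluator diagnostic instead).

```python
from typing import Any, Callable, Dict, List

ASSERT_FAILED = "masterbelt.validation.assert_failed"
EVALUATION_FAILED = "masterbelt.validation.evaluation_failed"

Condition = Callable[[Dict[str, Any]], bool]


def run_rule(conditions: List[Condition], row: Dict[str, Any]) -> List[str]:
    """Run one rule body over one record and collect diagnostic codes.

    Hypothetical model: each condition plays the role of one `assert`.
    """
    diagnostics: List[str] = []
    for condition in conditions:
        try:
            ok = condition(row)
        except (ZeroDivisionError, KeyError, TypeError):
            # Hard error: attribute it to the rule and stop the rule.
            diagnostics.append(EVALUATION_FAILED)
            break
        if not ok:
            # Failed assert: record one diagnostic and keep going.
            diagnostics.append(ASSERT_FAILED)
    return diagnostics


# Example rule with four asserts; the third divides by the record's level.
RULE = [
    lambda r: r["hp"] > 0,
    lambda r: r["atk"] >= 1,
    lambda r: 100 // r["level"] > 0,
    lambda r: r["name"] != "",
]
```

For a record such as `{"hp": 0, "atk": 0, "level": 0, "name": ""}`, the first two asserts each record a failure, the third raises a division by zero that stops the rule, and the fourth assert is never evaluated.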
An `each` rule body runs once per post-filter record; an `all` rule body runs once over the whole post-filter record collection. An `all` rule binds `table` (and its alias `self`) to the master's post-filter [relation](#scope-evaluation); iterating it with `for row in table` yields the post-filter records, and the rule may apply the master's scopes to it. ## Scope Evaluation A master's [scope](../masterdata/schema.md#scope-section) is a lazy query-plan builder, not an eager record scan. Evaluating a scope call constructs a relation value carrying the accumulated [query plan](ir.md#master-scopes) — predicates, orderings, skip, take — and returns it; no records are touched at the call. A scope chain (`self.adult().gendered(g)`) threads the relation through each stage, producing one combined plan. Records are observed only when a terminal resolves the relation against the active dataset (in the build-time evaluator this is iteration of an `all` rule's `table`, or a `toList()` reach), and the observed records are always the master's **final, post-filter** records — a scope never runs against source/pre-filter data. The relation is immutable and copy-on-write: a base relation can seed independent chains without aliasing. The in-memory evaluator and the generated SQLite runtime share the same scope semantics; a scope builds a backend-independent relation plan, and the executor (in-memory closures or translated SQL) consumes it. # Intermediate Representation Source: https://masterbelt.dev/spec-src/language/ir.md # Intermediate Representation This document defines the currently implemented Masterbelt intermediate representation (IR). The IR is intentionally minimal at this stage. Future IR additions must extend this document before or together with implementation changes. ## Purpose The IR is the boundary between the type checker and downstream consumers. 
Code generators, evaluators, and indexes consume the IR rather than the AST so they share a single normalized program model. ## Normalization An IR module reflects the following normalizations performed by lowering: - Every type expression is resolved through type declarations. Aliases do not appear in the IR; uses are replaced with their resolved target type. - Every literal value is decoded. Integer radix prefixes and digit separators are interpreted. String escape sequences are decoded. Source text is not retained. - Collection literal values record items in source order. Map literal duplicate keys are deduplicated using last-wins semantics: the first occurrence's position is kept and its value is replaced by the last occurrence's value. - Doc comments attached to declarations are preserved as text lines with the leading `///` removed. - Other comments (`//`, `/* */`) are not present in the IR. ## Projects An IR project is the lowered form of an entire Masterbelt program rooted at one entrypoint. - A project records the entrypoint's canonical path, every loaded module keyed by canonical path, and the module path order topologically sorted from no-dependency leaves to the entrypoint. - Codegen targets walk the project's order to emit one output per module. ## Modules An IR module is the lowered form of one Masterbelt source file. - A module has a name carrying the source file's canonical path, a source span covering the whole file, an ordered list of constants, an ordered list of type declarations, and an ordered list of re-exports. - Constants appear in the same order as their declarations in source. Grouped const declarations contribute one constant per item, in source order. - Type declarations appear in the same order as their declarations in source. Each declaration carries its source identifier, its public flag, the resolved target type after substitution of any chain of declared types, and an ordered list of declared type parameters. 
For a non-generic declaration the parameter list is empty and the target is the final resolved type; for a generic declaration the parameter list names the declared parameters in source order and the target is the template body whose type variables refer to those parameters. - Re-exports represent `pub { ... } from "..."` declarations: each entry carries the local name visible in this module, the canonical path of the foreign module, and the foreign symbol name. Re-exports do not introduce new values or types; they make the foreign symbol available under the listed local name in this module's public surface. - `use` declarations without `pub` do not appear in the IR; their effect is to rewrite identifier references inside this module's expressions into cross-module references during lowering. ## Constants A constant carries: - A name string matching the source identifier. - A public flag set by the `pub` declaration modifier. - An ordered list of doc comment text lines preserved from the source. - A checked type after declared-type resolution. - A value matching that type. - A source span covering the const item. The constant's type and value are linked by construction: lowering is responsible for ensuring the value matches the declared type, including for union and generic types. ## Expressions Every IR expression belongs to one of the following forms. The form is decided by the expression itself; consumers use a type switch to dispatch. - Null expression. The single null literal. - Bool expression. A decoded `true` or `false`. - Integer expression. A decoded integer literal. The internal representation is a signed 64-bit integer at this stage. The Masterbelt `int` type currently maps to the host language's natural integer type when generating code. Future integer width types will extend this representation. - String expression. A decoded string literal with escape sequences resolved. - List expression. An ordered sequence of nested expressions. 
The element types are determined by the surrounding constant's type. - Map expression. An ordered sequence of key/value entries. After last-wins deduplication, no two entries have equal keys. - Product expression. An ordered sequence of named field initializers. Field initializers preserve the source order of the literal. The field name strings match the field names of the surrounding constant's product type; every field declared by that type appears exactly once. Field types are determined by the surrounding constant's type. - Reference expression. A reference to another constant. The expression carries the referent's module and name; when the module field is empty the reference targets a constant declared earlier in the same module, and when it names a foreign module the reference targets a public symbol of that module. Same-module forward references are rejected during checking; cross-module references resolve through the import system. Expressions do not carry their own type. Consumers derive types from the containing constant's declared type by walking it together with the expression tree, and from referenced constants when the expression is a reference. An expression's source span identifies the source it was lowered from. This makes diagnostics emitted by downstream phases attributable to user source. ## Lowering Lowering is the phase that produces an IR module from a checked source file. Its input is a checker result for one file; its output is one IR module and a list of diagnostics introduced by lowering itself. Lowering operates only on a checker result whose declarations have already been type checked. The set of diagnostics emitted by previous phases is not re-emitted by lowering; lowering appends only diagnostics that are first detected at this phase. ### Module Identity The lowered module's name is the source file's name. Its source span is the source file's span. The constants slice mirrors source order of declarations. 
### Const Declarations Each const item from each const declaration contributes one constant to the IR module, in source order. The constant carries: - The source identifier name. - The public flag of the enclosing const declaration. All items in a `pub const ( ... )` group are public. - The checked type recorded by the checker for the item's binding. Type declarations have been resolved to their target type. - The doc comment lines attached to the const item, in source order, with the leading `///` removed. When a const declaration has a leading doc comment and items also have their own doc comments, the declaration's lines precede each item's lines. - The lowered expression value. - The source span of the const item. A const item whose checker-recorded type is the invalid type is not contributed to the IR. Its diagnostics have already been emitted by previous phases. ### Type Declarations A type declaration contributes one entry to the module's TypeDeclarations list. The entry preserves the declaration's source identifier, public flag, doc comments, declared type parameters, and the target type produced by checker-time substitution. The declared name is retained so downstream consumers can re-introduce it in target output even though the constant lowerings carry the resolved target type rather than the declared surface name. #### Anonymous Product Hoisting Anonymous product types written inline inside a declared body (for example `type Monster = { skills: list<{...}> }`) are normalized during lowering by hoisting each one to its own TypeDeclaration. The synthetic declaration's name is derived from the path that reaches the anonymous product: the owner's declared name followed by the PascalCase form of each field, with `Key` and `Value` suffixes for map argument positions. Anonymous products nested inside an already-hoisted declaration recurse with the synthesized name as the new owner so the path stays scoped to a single declaration tree. 
Synthesized names that would collide with an existing top-level name receive a numeric suffix. A synthesized declaration inherits the public flag of the owner so callers across modules can reach it, and carries the subset of the owner's type parameters that the hoisted body references. The original use site is rewritten to apply those parameters back, keeping the generic shape intact. After hoisting, the IR contains only named product types. Codegen targets see the synthesized declarations as ordinary TypeDeclarations and emit them through their normal product-type code path. ### Literal Expressions - `null` lowers to a Null expression. - `true` and `false` lower to Bool expressions carrying the decoded value. - An integer literal lowers to an Int expression carrying the decoded signed 64-bit value. The literal's recorded base (2, 8, 10, or 16) and digit text drive the decode. A literal whose magnitude does not fit in a signed 64-bit integer is reported as integer out of range and the item is not contributed to the IR. - A string literal lowers to a String expression carrying its already decoded value. ### Collection Expressions - A list literal lowers to a List expression whose items are the lowered element expressions in source order. - A map literal lowers to a Map expression whose entries are the lowered key/value pairs in source order, after applying last-wins deduplication: for each duplicate key, the position of the first occurrence is preserved and the value of the last occurrence replaces all earlier values for that key. Key equality is structural over already-lowered key expressions. - An empty collection literal lowers to an empty List or empty Map according to the surrounding constant's checked type after declared-type resolution. - A product literal lowers to a Product expression whose field initializers are the lowered values in source order. 
The product type used to type-check each field value is taken from the literal's typed prefix when present and otherwise from the surrounding constant's annotation after declared-type resolution. Field name uniqueness, missing fields, and unknown fields have already been validated by the checker. ### Identifier References An identifier reference in expression position lowers to a Reference whose name matches the referent's source identifier. The checker has already established that the referent is a const item declared earlier in source; the IR guarantee that references point only backward in the constants slice follows from that. The IR does not eagerly inline references. Consumers that need a reference's value walk to the referent constant. ## For Statements A for statement reaches the IR as a control-flow node nested inside a function body. The IR distinguishes three subject shapes so codegen can pick the matching native loop without re-deriving them. A for statement carries: - A subject shape — `list`, `map`, or `range`. - For `list` and `map` shapes, the lowered subject expression and the subject's checked element type(s) after declared-type resolution. - For the `range` shape, the lowered start and end expressions (the IR records the recognized counted form so codegen emits a counted loop directly). - One or two bindings depending on the shape; each binding carries a name (an empty name signals a `_` skip), the binding's checked type, and a source span. Map subjects always carry two bindings in `(key, value)` order. - A lowered function block for the loop body. - A source span covering the whole statement. Break and continue statements lower as bare nodes with only a source span. The IR does not resolve which loop they target; consumers walk the surrounding statement tree to attach them to the innermost loop the same way the checker did. 
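Because break and continue nodes carry only a span, a consumer has to recover their target loop from the tree. The following Python sketch shows one way to do that walk. It assumes a drastically reduced statement tree: the `For`, `Break`, and `Continue` classes, the `attach_jumps` helper, and the use of strings as spans are all hypothetical and are not the IR's actual node definitions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Break:
    span: str  # stand-in for a source span


@dataclass
class Continue:
    span: str


@dataclass
class For:
    span: str
    body: List["Stmt"] = field(default_factory=list)


Stmt = Union[Break, Continue, For]


def attach_jumps(
    stmts: List[Stmt],
    enclosing: Optional[For] = None,
    out: Optional[Dict[str, Optional[str]]] = None,
) -> Dict[str, Optional[str]]:
    """Map each break/continue span to the span of its innermost loop."""
    if out is None:
        out = {}
    for stmt in stmts:
        if isinstance(stmt, For):
            # Entering a loop: it becomes the innermost loop for its body.
            attach_jumps(stmt.body, stmt, out)
        else:
            # A bare jump node: attach it to the current innermost loop.
            out[stmt.span] = enclosing.span if enclosing is not None else None
    return out


# Example: a nested loop with jumps at both levels.
PROGRAM = [
    For("outer", [
        Break("b1"),
        For("inner", [Continue("c1"), Break("b2")]),
        Continue("c2"),
    ]),
]
```

Here `b1` and `c2` resolve to the outer loop while `c1` and `b2` resolve to the inner one. A real consumer would also descend through blocks, match arms, and other nested statements, which this sketch omits.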
The `range(start, end)` recognition is performed during lowering and is purely a representation choice: the IR's `list` element type is unchanged, so consumers that ignore the counted-form flag can still treat the subject as a list and remain correct. ## Match Statements A match statement reaches the IR as a control-flow node nested inside a function body. The IR preserves the surface ordering of arms and the bindings introduced by each pattern. A match statement carries: - The lowered subject expression. - The subject's checked type after declared-type resolution. - An ordered list of arms, in source order. - A source span covering the whole statement. A match arm carries: - An ordered list of one or more lowered patterns (the `|`-separated alternatives at one arm position). - An optional lowered guard expression. - An ordered list of bindings introduced by the pattern. Each binding carries a name, the binding's checked type, and the binding's source span. - A lowered function block for the arm body. - A source span covering the whole arm. A match pattern is one of: - A **type pattern** carrying the matched type after declared-type resolution and an optional binding entry referenced by name. - An **enum pattern** carrying the enum type and the variant name. - A **literal pattern** carrying the lowered literal expression. - A **product pattern** carrying the matched product type, an ordered list of field sub-patterns (each with the field name and a nested pattern), and an optional whole-value binding entry referenced by name. - A **wildcard pattern** carrying no payload. The implicit identifier narrowing described in [types.md](types.md#match-statements) is normalized during lowering: a subject expression that is a plain identifier and a pattern without an explicit binding produce an implicit binding entry whose name is the identifier's name and whose type is the narrowed type. 
Downstream consumers therefore see explicit bindings on every arm where a narrowed local would be reachable. Exhaustiveness, reachability, and binding-set agreement across alternatives are enforced by the checker; the IR records the arms verbatim and does not synthesize a catch-all arm. ## Master Validation A master's `validation` section reaches the IR as a `MasterValidation` value. Its surface form, scoping, and execution model are defined in [masterdata/validation.md](../masterdata/validation.md); the IR records the lowered rules so the validation evaluator can run them at export time. A `MasterValidation` carries: - `Master` — the flattened codegen name of the owning master (the name used by code-generation output, for example `UserFriendships`). - `VisiblePath` — the module-local dotted source path of the master (for example `User.Friendships`). The export driver rewrites the top segment of this path through the entry module's re-export aliases to obtain the entrypoint-visible path that validator configuration keys against (so an aliased re-export `pub { User as U }` yields `U.Friendships`). - `Rules` — an ordered list of `MasterValidationRule` values in source order. A `MasterValidationRule` carries: - `Scope` — the rule's scope, one of `MasterValidationEach` or `MasterValidationAll`. - `Name` — the validator's stable identifier, unique within the master across both scopes. - `Body` — the lowered statement block. The implicit bindings the evaluator supplies inside a rule body are determined by the scope rather than recorded as explicit IR bindings: - For an `each` rule, `row` and `self` are bound to one post-filter record; both have the master's record type. - For an `all` rule, `table` and `self` are bound to the master's post-filter relation; both have type `Relation`, and iterating it yields the post-filter records in plan order. A rule may apply the master's scopes and stage operators to that relation. 
- `self` is bound to the same value as `row` (in `each`) or `table` (in `all`). ## Master Scopes A master's [scope](../masterdata/schema.md#scope-section) declarations reach the IR as an ordered list of `MasterScope` values on the owning master. The surface form is defined in [masterdata/schema.md](../masterdata/schema.md#scope-section); the IR records each scope so code generation can emit a relation method and the SQLite exporter can infer secondary indexes. A `MasterScope` carries: - `Master` — the flattened codegen name of the owning master. - `Name` — the scope name, unique within the master. - `Public` — the `pub` flag controlling generated-API exposure. - `Indexed` — the `indexed` flag enabling SQLite index inference. - `Parameters` — the lowered parameter list, in source order, with the same shape a lowered function uses. - `Body` — the lowered scope body (a statement block or an arrow expression) whose result is the master's relation. ### Query Plan Shape To support both code generation and index inference, the lowered body exposes the scope's relation **plan**: the ordered stages applied to `self`. Each stage is one of: - `Where(predicate)` — a predicate tree whose leaves carry a `FieldRef` (the record field's source name), the operator (`eq`, `ne`, `lt`, `le`, `gt`, `ge`, `in`, `between`), and the operand (a literal, a scope-parameter reference, or a referenced const/static). Interior nodes are `and` / `or` / `not`. - `OrderBy(ordering)` / `ThenBy(ordering)` — a `FieldRef` plus a direction (`asc` / `desc`). - `Skip(n)` / `Take(n)` — a count operand. A scope call inside a body inlines the callee scope's stages into the caller's plan (with the callee's parameters bound to the call arguments), so the plan is self-contained and free of scope-call nodes. The plan order preserves source order across `where` / `orderBy` / `thenBy` / `skip` / `take` and across inlined chains. 
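The inlining rule can be illustrated with a small Go sketch, modelled on the `adult` / `gendered` / `genderedAdult` scopes from the schema document. Every name here (`Stage`, `Scope`, `inline`, `resolve`, `exampleScopes`) is hypothetical and not taken from the Masterbelt implementation. Predicates are reduced to a single leaf, and a `"$name"` operand stands in for a scope-parameter reference.

```go
package main

import "fmt"

// Stage is a simplified plan stage for this sketch. A real plan carries full
// predicate trees; here a Where stage holds a single leaf.
type Stage struct {
	Kind    string   // "Where", "OrderBy", "ThenBy", "Skip", "Take", or "Call"
	Field   string   // FieldRef for Where / OrderBy / ThenBy
	Op      string   // eq, ge, ... for Where; asc / desc for orderings
	Operand string   // literal text, or "$name" for a scope-parameter reference
	Callee  string   // callee scope name when Kind == "Call"
	Args    []string // call arguments when Kind == "Call"
}

// Scope is a hypothetical stand-in for a lowered scope before inlining.
type Scope struct {
	Params []string
	Body   []Stage
}

// resolve replaces a "$name" operand with its bound argument, if any.
func resolve(operand string, env map[string]string) string {
	if len(operand) > 1 && operand[0] == '$' {
		if v, ok := env[operand[1:]]; ok {
			return v
		}
	}
	return operand
}

// inline expands every Call stage in place, binding the callee's parameters
// to the call arguments. It assumes the checker has already rejected cyclic
// scopes, so the recursion terminates.
func inline(scopes map[string]Scope, body []Stage, env map[string]string) []Stage {
	var plan []Stage
	for _, st := range body {
		if st.Kind != "Call" {
			st.Operand = resolve(st.Operand, env)
			plan = append(plan, st)
			continue
		}
		callee := scopes[st.Callee]
		bound := make(map[string]string, len(callee.Params))
		for i, p := range callee.Params {
			bound[p] = resolve(st.Args[i], env)
		}
		plan = append(plan, inline(scopes, callee.Body, bound)...)
	}
	return plan
}

// exampleScopes mirrors the adult / gendered / genderedAdult scopes.
func exampleScopes() map[string]Scope {
	return map[string]Scope{
		"adult": {Body: []Stage{
			{Kind: "Where", Field: "age", Op: "ge", Operand: "20"},
		}},
		"gendered": {Params: []string{"gender"}, Body: []Stage{
			{Kind: "Where", Field: "gender", Op: "eq", Operand: "$gender"},
		}},
		"genderedAdult": {Params: []string{"gender"}, Body: []Stage{
			{Kind: "Call", Callee: "adult"},
			{Kind: "Call", Callee: "gendered", Args: []string{"$gender"}},
		}},
	}
}

func main() {
	scopes := exampleScopes()
	for _, st := range inline(scopes, scopes["genderedAdult"].Body, nil) {
		fmt.Println(st.Kind, st.Field, st.Op, st.Operand)
	}
}
```

In this sketch the plan for `genderedAdult` contains two `Where` stages in source order and no `Call` stage, which is the self-contained shape the plan is required to have.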
This is the representation the [SQLite index inference](../masterdata/export-sqlite.md#secondary-indexes-from-indexed-scopes) reads; index inference is performed on the lowered plan, not on source syntax. ### Assert Statement An `assert` statement lowers to an `Assert` node carrying: - `Condition` — the lowered condition expression. - `ExprText` — the condition's source text. The failed assertion's expression and source span are retained rather than collapsed to a boolean result. This lets a future PowerAssert-style reporter display sub-expression values without a surface or IR change; the MVP evaluator uses only the boolean outcome. ## Symbols and Visibility A top-level IR declaration is called a symbol. Every symbol has a name and a public flag derived from the source program's `pub` modifier. The set of symbol kinds is open: constants exist today; future kinds such as records, methods, callables, and type declarations will extend the same abstraction. Every symbol declares which other symbols it references. References originate in: - Identifier expressions inside value-position trees. - Future: type references inside type-position trees (such as a field type that names a user-declared record). - Future: call-site references inside callable bodies. The IR provides a single reachability operation that walks symbols starting from the public roots and follows the declared references transitively. Downstream consumers use this operation to identify the set of symbols a public surface depends on. The current consumers are code generation targets, but any consumer that needs the same notion (linker, indexer, future tree-shaking analyses) shares it. The reachability rule is consumer-neutral. The IR does not delete unreachable symbols; consumers choose how to act on the result. ## Effects The IR carries an open set of effect tags that callable symbols may declare. The currently defined effects are `cancellable`, `failable`, and `asyncable`. 
Each is defined and documented in `codegen/model`. Effect sets are ordered, deduplicated, and combined by union when a caller inherits a callee's effects. Downstream consumers may rely on the IR's effect set being canonical: equal effect sets compare equal positionally, with no duplicate members. A function type carries its effect set as part of its identity (see [language/types.md](types.md)). A type declaration whose target is a function type therefore exposes the effect set through its target type. The IR has no callable value-level nodes yet, so an effect set today reaches targets only through type expressions; the EffectSet type is also reserved for future callable nodes that will record effects directly at IR construction time. ## Stability The IR is an internal contract between checker and downstream consumers. Adding new value forms, new fields, or new normalization rules is a contract change and requires updating this document before or together with the implementation. # Diagnostics Source: https://masterbelt.dev/spec-src/language/diagnostics.md # Diagnostics ## Shape - A diagnostic has a code, severity, optional source span, and typed arguments. - Source spans identify a file name, start position, and end position. - Positions are zero-based and include byte offset, line, and column. ## Message Localization - Every user-visible diagnostic message must be defined in a locale catalog. - Implementation code must not embed user-visible diagnostic message strings. - Diagnostic codes, constants, severities, and argument schemas are defined in a diagnostic registry. - Locale catalogs contain only diagnostic code and localized message template columns. - The English locale is required during development. - Other locales, including Japanese, must be addable without changing diagnostic call sites. - Diagnostics store code, severity, span, and typed arguments. - Diagnostic reporters render localized messages from the locale catalog. 
- Diagnostics that are not tied to source text may omit the source span. - Catalog templates use `{name}` placeholders for arguments. - All placeholders used by a localized message must be declared by that diagnostic code's argument schema. - All diagnostic codes must have an English message. - Missing messages, unknown message codes, or mismatched placeholders are errors in generation or tests. - Diagnostic code catalogs are stable user-visible contracts. - Adding, removing, or renaming a diagnostic code is a user-visible behavior change. - At this stage, diagnostic argument schemas only define string arguments. ## Reporters - Text reporters write one localized diagnostic message per line. - JSON reporters write an object with a `diagnostics` array. - Each JSON diagnostic entry contains `code`, `severity`, `message`, optional `span`, and optional `args`. - When there are no diagnostics, JSON reporters write an empty `diagnostics` array. ## Severities - `error`: the source cannot be accepted for the requested operation. - `warning`: the source can be accepted, but the behavior should be reviewed. - `info`: informational output. - `hint`: editor-oriented guidance. ## Master Validation Diagnostics The master [validation](../masterdata/validation.md) feature registers the following diagnostic codes. ### Validator Evaluation | Code | Severity | Meaning | Arguments | | --- | --- | --- | --- | | `masterbelt.validation.assert_failed` | configured (default `error`) | An `assert` condition evaluated `false`. The span is the condition expression. | `master`, `validator`, `scope` (`each`/`all`), `record`, `expr` | | `masterbelt.validation.evaluation_failed` | error | A validation rule body raised a hard evaluation error. 
| `master`, `validator`, `scope`, `detail` | `masterbelt.validation.assert_failed` is emitted at the severity resolved from project configuration: the default is `error`, and a `(master, validator)` override to `warning` is reflected in the emitted diagnostic. The `master` argument is the entrypoint-visible master path; the `record` argument is the failing record's primary-key description for an `each` failure and `` for an `all` failure; `expr` is the condition's source text. ### Validator Configuration | Code | Severity | Meaning | Arguments | | --- | --- | --- | --- | | `masterbelt.validation.config_unknown_master` | error | A `validators` config key names a master that does not exist. | `master` | | `masterbelt.validation.config_unknown_validator` | error | A `validators` config key names a validator that does not exist on a known master. | `master`, `validator` | | `masterbelt.validation.config_invalid_severity` | error | A `validators` severity is not `error` or `warning`. | `master`, `validator`, `severity` | | `masterbelt.validation.ambiguous_master` | error | A master is re-exported from the entry module under more than one name, so no config path identifies it unambiguously; its validators are skipped. | `master`, `aliases` | A master that the entry module neither declares nor re-exports has no entrypoint-visible config path, so its validators are out of scope and run no diagnostics. ### Checker | Code | Severity | Meaning | Arguments | | --- | --- | --- | --- | | `masterbelt.checker.validator_duplicate` | error | A duplicate validator id within a master. | `master`, `name` | | `masterbelt.checker.assert_outside_validation` | error | `assert` used outside a validation block. | — | | `masterbelt.checker.return_in_validation` | error | `return` used inside a validation block. | — | | `masterbelt.checker.assert_condition_non_bool` | error | An `assert` condition is not `bool`. 
| `actual` | ### Parser | Code | Severity | Meaning | | --- | --- | --- | | `masterbelt.parser.assert_missing_condition` | error | An `assert` statement has no condition expression. | | `masterbelt.parser.master_validation_group_missing_scope` | error | A validation group is missing its `each` / `all` scope. | | `masterbelt.parser.master_validation_rule_missing_name` | error | A `validate` rule is missing its identifier. | | `masterbelt.parser.master_validation_rule_missing_body` | error | A `validate` rule is missing its body block. | | `masterbelt.parser.unexpected_master_validation_node` | error | An unexpected node appeared inside a `validation` section. | A duplicate `validation` section within a master reuses the existing `masterbelt.parser.master_section_duplicate` code. ## Scope Diagnostics The master [scope section](../masterdata/schema.md#scope-section) and the SQLite indexed-scope inference contribute the following diagnostics. ### Parser | Code | Severity | Meaning | | --- | --- | --- | | `masterbelt.parser.master_scope_missing_name` | error | A `scope` declaration has no name. | | `masterbelt.parser.master_scope_missing_body` | error | A `scope` declaration has no body. | ### Checker | Code | Severity | Meaning | Arguments | | --- | --- | --- | --- | | `masterbelt.checker.duplicate_scope` | error | A duplicate scope name within one master. | `master`, `name` | | `masterbelt.checker.scope_name_conflict` | error | A scope name collides with a relation method, nested master, static, function, or const. | `master`, `name` | | `masterbelt.checker.scope_return_type_mismatch` | error | A scope body returns something other than the declaring master's `Relation`. | `master`, `actual` | | `masterbelt.checker.scope_missing_return` | error | A block-body scope has no `return`. | `master`, `name` | | `masterbelt.checker.scope_forbidden_effect` | error | A scope body or query callback inherits `failable` / `cancellable` / `asyncable`. 
| `master`, `name`, `effect` | | `masterbelt.checker.cyclic_scope` | error | A scope references itself directly or transitively. The span is the cycle-closing call's callee. | `master`, `name` | | `masterbelt.checker.scope_unknown_field` | error | A query callback references a field the record does not declare. | `master`, `field` | Some scope misuse reuses existing diagnostics rather than a scope-specific code: declaring or assigning `self` is reported as `masterbelt.parser.reserved_identifier` because `self` is lexically reserved (see [lexical.md](lexical.md#keywords)); a scope call with the wrong argument count or types is reported through the general call diagnostics (`masterbelt.checker.call_argument_count_mismatch`, `masterbelt.checker.call_argument_type_mismatch`); and a scope call on a receiver that does not expose the scope is reported as `masterbelt.checker.unknown_member`. ### SQLite Index Inference | Code | Severity | Meaning | Arguments | | --- | --- | --- | --- | | `masterbelt.scope.index_inference_failed` | warning | An `indexed scope` could not be fully turned into a secondary index; any inferable part is still generated. | `master`, `scope` | | `masterbelt.scope.index_generated` | info | A secondary index was generated from an `indexed scope`. | `master`, `scope`, `index` | # Standard Library Source: https://masterbelt.dev/spec-src/language/std.md # Standard Library This document will define the Masterbelt standard library surface. # Master Data Schema Source: https://masterbelt.dev/spec-src/masterdata/schema.md # Master Data Schema A master is a user-declared, named, externally-populated collection of records. The master declaration is a top-level declaration of the language. Its surface form is defined together with the other declaration forms in [../language/syntax.md](../language/syntax.md); this document defines its semantics. 
## Master Declarations ```mst master Records { record { primary id: int, name: string, } source { csv "path/of/file.csv" { separator: ",", } csv "another/file.csv" } } ``` A master declaration introduces a nominal type named by its identifier. The name lives in the same value-and-type identifier name space as a [type declaration](../language/syntax.md#type-expressions), so a master declaration and a type declaration with the same name within one file collide and are reported by [language/names](../language/names.md). A master declaration is a [visible declaration](../language/syntax.md#declarations). The optional `pub` modifier makes the master visible outside its declaring file. A master declaration may carry documentation comments. They attach to the master declaration as a whole, not to any section inside the body. ## Body Sections The body of a master declaration is a brace-delimited list of sections. Seven section kinds are defined: - `record { ... }` — the record section. Declares the element record type. Required. - `source { ... }` — the source section. Declares the data sources used to populate the master at import time. Optional. - `filter { ... }` — the filter section. Declares import-time row filters. Optional. - `validation { ... }` — the validation section. Declares record- and collection-level validators run after filtering at export time. Optional. See [Validation Section](#validation-section). - `static { ... }` — the static section. Declares constants and methods reachable through the master's surface name (`Master.X`). Optional. - `select Name { ... }` — a select section. Declares a named projection that maps a subset of the record's fields onto a derived record type. Optional, repeatable. See [Select Section](#select-section). - `master Name { ... }` — a nested master declaration. Reachable as `Parent.Name`. See [Nested Masters](#nested-masters). 
The five single-kind sections (`record`, `source`, `filter`, `validation`, `static`) may each appear at most once. A second occurrence of the same kind is a syntax error reported on the later occurrence as `masterbelt.parser.master_section_duplicate`. Select sections and nested master declarations may appear any number of times. A master declaration without a record section is a syntax error reported on the declaration as `masterbelt.parser.master_record_missing`. Sections may appear in any order. Tooling-level conventions (formatter) may impose a canonical order; the language itself does not. ## Record Section The record section declares the master's element record type. Its body is a [product type](../language/syntax.md#product-types-and-literals) body: a comma-separated list of fields and methods. Field modifiers `readonly` and `writable` carry the same meaning as on any product type. The additional field modifier `primary` is described in [keys.md](keys.md). Field names within the record section must be unique; duplicates are reported by the parser exactly as for any product type. Methods declared inside the record section have the same semantics as product type methods. Method overload resolution and method dispatch are defined in [../language/types.md](../language/types.md). The record section's body must declare at least one field; a record body that declares only methods, or that is empty, is a checker error. ## Source Section The source section declares external data sources used to populate the master at import time. The body is a sequence of source entries. ```ebnf source_entry = source_kind string_literal [ source_options ] ; source_kind = identifier ; source_options = "{" [ source_option { "," source_option } [ "," ] ] "}" ; source_option = identifier ":" expression ; ``` A source entry begins with a source-kind identifier. The supported source kinds are: - `csv` — see [import-csv.md](import-csv.md). 
Additional kinds (for example `xlsx`) will be defined by extending this list. Until a kind appears in this list it is rejected as `masterbelt.checker.master_unknown_source_kind`. The path string following the source kind names the resource to import. Resolution of this path against the project's working directory is defined by the importer specification and by [../tooling/configuration.md](../tooling/configuration.md). An option list is optional. When written, it is a brace-delimited, comma-separated list of `name: value` pairs. A trailing comma after the last option is allowed. An empty option list `{}` is legal and equivalent to omitting the option list entirely. Each option key is validated against the option schema declared by the source kind's importer specification: - An option name not declared by the importer is reported as `masterbelt.checker.master_source_option_unknown`. - An option value whose type does not match the importer's declared option type is reported as `masterbelt.checker.master_source_option_type_mismatch`. Duplicate option names within one entry are reported by the parser on the later occurrence as `masterbelt.parser.master_source_option_duplicate`. A source section with zero entries is legal at the language level; the driver treats such a master as having no automatic data source. ## Filter Section The filter section declares row-level filters applied after a record has been read from a source. Each rule has a kind keyword (`include` or `exclude`), a reason string, and a brace-delimited body that evaluates a boolean expression against the candidate record: ```mst master A { record { value: int } filter { include "non-negative" { return self.value >= 0 } exclude "outlier" { return self.value > 100 } } } ``` ### Body Semantics A filter rule body is a brace-delimited block. The block must `return` a value of type `bool`. 
The receiver of the body is the candidate record, available as the local `self` typed as the surrounding master's record type. ### Rule Application Rules apply in source declaration order against each candidate record. The first rule that fails drops the record from the master; no later rule on the same record is evaluated. - An `include` rule drops the record when its body returns `false`. - An `exclude` rule drops the record when its body returns `true`. Every dropped record produces a `masterbelt.importer.filter_excluded` diagnostic at hint severity. The diagnostic carries the rule's reason string so tooling can surface which rule caused the drop. ### Diagnostics - A filter body whose return type is not assignable to `bool` is reported through the standard return-type-mismatch diagnostic. - A filter section that contains no rules is legal at the language level. ## Validation Section The validation section declares record- and collection-level data quality checks. It is distinct from the filter section: - A `filter` rule **drops** a record from the master. - A `validation` rule **keeps** every record and emits a diagnostic. The two also run at different times: filters run during import while the record set is being assembled; validators run over the final, post-filter dataset. The full order is import → filter → validation → export-write, and validation runs at [`masterbelt export`](../tooling/cli.md) time before any artifact is written. The section groups named validators by scope (`each` per record, `all` per collection) and asserts conditions over the data. The complete surface form, scoping, execution model, severity, and diagnostics are defined in [validation.md](validation.md). ## Static Section The static section declares user-defined constants and methods that hang off the master's surface name as static members. 
The section is a brace-delimited list of [const declarations](../language/syntax.md#declarations) and [function declarations](../language/syntax.md#declarations); both forms accept the same modifiers and syntax they take at the top level (visibility, doc comments, effects, generics). ```mst master Items { record { primary id: int, value: int } source { csv "data/items.csv" } static { pub const MaxId: int = 9999 pub fn total(): int { let acc: int = 0 for item in Items.toList() { acc = acc + item.value } return acc } } } ``` The static section's members appear under the master's surface name as **static members**: a constant is reached as `Items.MaxId`; a method is reached as `Items.total()`. Visibility follows the regular `pub` rule and is independent from the master's own visibility — a `pub master` may expose private static members and vice versa. ### Scope and Resolution A static member's body resolves names through the regular module scope: top-level constants, top-level functions, other masters' static members, and the master's own static members are all in scope. Module imports apply the same way they do for any other declaration. The identifier `self` is **not** in scope inside a static section. The static section is detached from any record instance, so `self` would have no meaningful binding; references resolve through the master's surface name instead (`Items.toList()`, `Items.MaxId`). A use of `self` inside a static body is reported through the regular unbound-name diagnostic. The master's own record fields are not exposed inside the static section either. A static body that needs to inspect records must iterate through `Master.toList()` (see [Iteration](#iteration)). ### Filter Interaction A `filter` rule body MUST NOT call a master static method, because filter rules run during import while the master's record set is still being assembled. The checker rejects such calls with `masterbelt.checker.static_call_from_filter`. 
Static *constants* referenced from a filter body are permitted because they do not consult import state. ### Built-in Members The static section's user-declared members coexist with the built-in static surface every master exposes (currently the `toList()` method described under [Iteration](#iteration)). Declaring a user member whose name collides with a built-in is reported through the regular duplicate-member diagnostic. ### Diagnostics - A duplicate member name within one master's static section is reported as `masterbelt.checker.static_member_duplicate`. - A member name that collides with a built-in master member is reported as `masterbelt.checker.static_member_reserved`. - A static call from inside a filter rule body is reported as `masterbelt.checker.static_call_from_filter`. ## Select Section A select section declares a named projection over the master's record. Each select section introduces one projected record type derived from a subset of the master's record fields, together with a query surface that filters, orders, and consumes records through the projected shape. ```mst master Items { record { primary id: int, name: string, count: int, } select Summary { id: id, name: name, } } ``` ### Body Form The body of a select section is a brace-delimited, comma-separated list of `target: source` field mappings. A trailing comma after the last mapping is allowed. The explicit `target: source` spelling is required even when the names match; the form is forward-compatible with future revisions that may permit renaming and limited type conversions. ```ebnf master_select_section = "select" identifier "{" [ select_field { "," select_field } [ "," ] ] "}" ; select_field = identifier ":" identifier ; ``` ### Constraints The current revision allows only same-name same-type extraction. Each mapping must: - Use the same identifier on both sides (`target == source`). A renaming mapping is reported as `masterbelt.checker.master_select_field_rename_unsupported`. 
- Reference a field that exists on the master's record. An unknown source is reported as `masterbelt.checker.master_select_source_unknown`.
- Refer to a field with a primitive type (`bool`, numeric, `string`). A non-primitive field (a `ref<>`, `list<>`, `map<>`, or nested product) is rejected as `masterbelt.checker.master_select_unsupported_field_type`.
- Use a target identifier that is unique within the same projection. A duplicate target is reported as `masterbelt.checker.master_select_duplicate_field`.

Two select sections under the same master must have distinct names. A duplicate is reported as `masterbelt.checker.master_select_duplicate_name`. Cross-master name overlap is fine: the codegen identifier carries the master's name as a prefix.

A select section's body with zero mappings is legal at the language level but rejected by the checker through `masterbelt.checker.master_select_duplicate_field` when an empty body would otherwise be meaningless.

### Generated Names

Each `select Name { ... }` on `master Master { ... }` lowers to the following codegen-side identifiers:

| Codegen artifact | Spelling |
| --- | --- |
| Projected record type | `<Master><Name>Record` |
| Projected relation type | `<Master><Name>Relation` |
| Projected field builder | `<Master><Name>Fields` (Go / C#); per-target convention for TypeScript |
| Source-relation projection accessor | `Select<Name>()` (Go / C#) / `select<Name>()` (TypeScript) |

For `master Items { select Summary { id: id, name: name } }`, the Go target emits `ItemsSummaryRecord`, `ItemsSummaryRelation`, `ItemsSummaryFields`, and a `SelectSummary()` accessor on the source `ItemsRelation`.

### Query Surface

A projected relation exposes the same stage and terminal operations as the source relation (defined in [query.md](query.md)), parametrised on the projected record type. The terminals carry the same asyncable + cancellable + failable triplet the source terminals carry.
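The per-mapping rules listed under Constraints can be sketched as a small checker. This is an illustration only: `Mapping` and `checkSelect` are hypothetical names, the set of primitive type names (`int`, `float`) is an assumption beyond the `bool` / numeric / `string` wording, and the order in which the checks run is also assumed, since the text does not fix which diagnostic wins when several apply.

```go
package main

import "fmt"

// Mapping is one `target: source` entry of a select body.
type Mapping struct{ Target, Source string }

// checkSelect returns the diagnostic codes for one select body.
// fields maps record field names to a type name.
func checkSelect(fields map[string]string, mappings []Mapping) []string {
	// Assumed spelling of the primitive types for this sketch.
	primitive := map[string]bool{"bool": true, "int": true, "float": true, "string": true}
	seen := map[string]bool{}
	var diags []string
	for _, m := range mappings {
		if seen[m.Target] {
			diags = append(diags, "masterbelt.checker.master_select_duplicate_field")
			continue
		}
		seen[m.Target] = true
		if m.Target != m.Source {
			diags = append(diags, "masterbelt.checker.master_select_field_rename_unsupported")
			continue
		}
		typ, ok := fields[m.Source]
		if !ok {
			diags = append(diags, "masterbelt.checker.master_select_source_unknown")
			continue
		}
		if !primitive[typ] {
			diags = append(diags, "masterbelt.checker.master_select_unsupported_field_type")
		}
	}
	return diags
}

func main() {
	fields := map[string]string{"id": "int", "name": "string", "tags": "list"}
	fmt.Println(checkSelect(fields, []Mapping{{"id", "id"}, {"name", "name"}}))
	fmt.Println(checkSelect(fields, []Mapping{{"title", "name"}, {"tags", "tags"}}))
}
```

The first call models the `Summary` projection and reports nothing; the second reports one rename and one unsupported field type.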
The accumulated source-side state (predicates, orderings, skip, take) flows from the source relation into the projected relation at `Select()` time, so the projection inherits any filters / orderings the user already chained. Predicates added through the projected relation are typed on the projected record, so authoring `Where(ItemsSummaryFields.Name.Eq("alpha"))` against an `ItemsSummaryRelation` is a compile-time match; passing a source-record predicate to the projected relation is a compile-time error. ### Semantics A projection runs at terminal time, after the source records have been filtered, ordered, skipped, and taken according to the accumulated state. For each surviving source record the runtime copies the named fields into a fresh target record. The relation's underlying record set is never reshaped — projection is a view that materialises on each terminal call. A projected relation does not re-import data or mutate the source relation. Two terminal calls observe the same record set under the same source-side state, matching the observational behaviour of [Iteration](#iteration) and the [Query API](#query-api). ## Scope Section A scope section declares a named, parameterisable relation query that hangs off the master's [relation](#runtime-model) surface. Unlike the codegen-only `Where` / `OrderBy` operators, a scope is authored in Masterbelt source: it names a reusable query fragment that the source program — and, for `pub` scopes, the generated target API — can apply to a relation. ```mst master Records { record { primary id: int, name: string, age: int, gender: int, } scope adult() { return self.where(fn(row) => row.age.ge(20)) } scope gendered(gender: int) { return self.where(fn(row) => row.gender.eq(gender)) } pub scope genderedAdult(gender: int) { return self.adult().gendered(gender) } } ``` A scope is declared only inside a master body; there is no top-level or module-level scope. 
A [nested master](#nested-masters) may declare its own scopes, which surface on that nested master's relation. A scope always returns the declaring master's `Relation`; it never returns a `list`, a nullable, a projected relation, or a joined relation. ### Surface Form ```ebnf master_scope_declaration = [ visibility_modifier ] [ "indexed" ] "scope" identifier "(" [ function_parameters ] ")" ( function_block | scope_expression_body ) ; scope_expression_body = "=>" expression ; ``` A scope declaration carries an optional `pub` visibility modifier in the same position as every other declaration's visibility, an optional `indexed` modifier (see [Indexed Scopes](#indexed-scopes)), the `scope` keyword, a name, a parameter list, and a body. The `pub indexed scope` order is the only accepted spelling of the two modifiers; `indexed pub scope` and `scope indexed` are syntax errors. Both `scope` and `indexed` are context keywords (they remain usable as ordinary identifiers elsewhere); see [lexical.md](../language/lexical.md#keywords). A scope never declares an explicit return type — the return type is always the declaring master's `Relation`, so a return-type annotation is a syntax error — and a scope never carries type parameters. ### Parameters Scope parameters use the same surface form as regular [function parameters](../language/syntax.md#declarations): the permitted parameter types, default values, optional / nullable parameters, and the call-site omission rules are identical to those of a regular function. Scope parameters are referenced from the body, including from inside query callbacks. ### Body A scope body is either a function block or an arrow expression: - A **block body** must `return` a `Relation`. `let`, `if`, `for`, assignment, `break`, `continue`, and multiple / conditional `return`s are all allowed; a block body with no `return` is reported as `masterbelt.checker.scope_missing_return`. 
- An **arrow body** (`=> expression`) requires its expression to evaluate to a `Relation`. In both forms a body expression whose type is not the declaring master's `Relation` is reported as `masterbelt.checker.scope_return_type_mismatch`. The body is **effect-free**: it builds and returns a relation plan and must not inherit `failable`, `cancellable`, or `asyncable`. A scope body — or a query callback inside it — that calls a user function, static, or const carrying any of those effects is reported as `masterbelt.checker.scope_forbidden_effect`. A scope is therefore never itself failable / cancellable / asyncable. Terminal relation operations (`toList`, `findBy`, …) carry those effects and so cannot be reached from a scope body; the constraint is enforced through the effect rule rather than by enumerating forbidden methods. ### `self` Inside a scope body, `self` is the **receiver relation**: the `Relation` the scope is applied to. `self` is a reserved implicit identifier in every context (see [lexical.md](../language/lexical.md#keywords)); a binding that tries to shadow it — as a function or scope parameter, a `let` / `const`, or a `for` / `match` binding — is rejected, and assigning to `self` is rejected. When a scope is reached through a [scope chain](#scope-chaining), its `self` denotes the relation produced by the stages applied earlier in the chain. ### The Relation Query DSL A scope body reaches the relation query operators that are otherwise codegen-only. The operator vocabulary, the field-handle DSL (`row.age.ge(20)`, `row.name.asc()`), and the `and` / `or` / `not` combinators are defined in [query.md](query.md#source-level-relation-queries). A scope body uses `where`, `orderBy`, `thenBy`, `skip`, and `take`; it never introduces a new query surface such as `self.fields.*`. 
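The effect-free rule for scope bodies can be sketched as a union over the effect sets of everything the body calls, followed by an emptiness check. The `Effect` type and the `union` / `checkScopeBody` helpers are assumptions made for this example. The IR document only states that effect sets are ordered, deduplicated, and combined by union; the alphabetical ordering and the one-diagnostic-per-effect behaviour below are choices of the sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// Effect is a hypothetical representation of an effect tag.
type Effect string

const (
	Asyncable   Effect = "asyncable"
	Cancellable Effect = "cancellable"
	Failable    Effect = "failable"
)

// union merges effect sets into a sorted, duplicate-free slice, so that equal
// sets compare equal position by position.
func union(sets ...[]Effect) []Effect {
	seen := map[Effect]bool{}
	var out []Effect
	for _, s := range sets {
		for _, e := range s {
			if !seen[e] {
				seen[e] = true
				out = append(out, e)
			}
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

// checkScopeBody returns one diagnostic string per effect the body would
// inherit from its callees. An empty result means the body is effect-free.
func checkScopeBody(calleeEffects ...[]Effect) []string {
	var diags []string
	for _, e := range union(calleeEffects...) {
		diags = append(diags, "masterbelt.checker.scope_forbidden_effect: "+string(e))
	}
	return diags
}

func main() {
	pureHelper := []Effect(nil)
	terminal := []Effect{Failable, Cancellable, Asyncable} // e.g. a toList call
	fmt.Println(len(checkScopeBody(pureHelper)))
	fmt.Println(checkScopeBody(pureHelper, terminal))
}
```

Because terminals such as `toList` carry all three effects, a body that reaches one fails this check without the checker needing a list of forbidden method names.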
### Calling and Chaining A scope surfaces as a method on its declaring master's `Relation` and **only** on that relation; it cannot be called on another master's relation, on a `Record`, or on a `list`. A scope is reached from a relation receiver: - The master's surface name is the base relation entrypoint, so `Records.gendered(1)` applies `gendered` to the master's base relation. An imported master's scopes are reachable through the import alias the same way. - Within a scope body, `self.adult()` applies `adult` to the receiver relation. - Scopes chain: `self.adult().gendered(gender)` and `Records.adult().gendered(1)` apply each scope in turn, threading the relation produced by one stage into the next scope's `self`. Scope chaining does not depend on source order; a scope may chain a scope declared later in the same master. A scope that references itself directly or transitively is reported as `masterbelt.checker.cyclic_scope` — recursive scopes are forbidden. A scope call whose receiver does not expose the scope is reported as `masterbelt.checker.unknown_member`, and an argument-count / type mismatch through the general call diagnostics. A query callback referencing a field the record does not declare is reported as `masterbelt.checker.scope_unknown_field`. ### Visibility A scope's `pub` modifier controls **generated-target API exposure only**. A `pub` scope is emitted as a method on the generated relation type in every codegen target; a non-`pub` scope is not emitted into the target API. Visibility does not restrict source-level use: a non-`pub` scope is still callable from Masterbelt source across module imports, exactly like a non-`pub` scope inside the declaring module. Generated method casing follows each target's relation-API convention — source `genderedAdult` is `GenderedAdult` on Go / C# and `genderedAdult` on TypeScript. 
The full per-target shape is defined in [codegen/golang.md](../codegen/golang.md), [codegen/typescript.md](../codegen/typescript.md), and [codegen/csharp.md](../codegen/csharp.md). ### Indexed Scopes The `indexed` modifier marks a scope as a source for SQLite secondary-index inference. It carries no source-level semantics and does not change the generated non-SQLite API; non-SQLite targets ignore it without a diagnostic. When the SQLite backend is in use, the export pipeline infers secondary indexes from the `where` and order-by stages an indexed scope can produce. The inference rules, the generated DDL, deduplication, naming, and the partial-success / failure diagnostics are defined in [export-sqlite.md](export-sqlite.md#indexed-scope-secondary-indexes). `indexed` and `pub` are independent: `pub scope` is exposed but builds no index, `indexed scope` builds an index but is not exposed, and `pub indexed scope` does both. ### Naming and Resolution A scope name is unique within one master; a duplicate is reported as `masterbelt.checker.duplicate_scope`, and scope overloading is not allowed. A scope name lives in the relation's method namespace: it must not collide with a built-in relation method (`where`, `orderBy`, `thenBy`, `skip`, `take`, terminals) or with the master's nested-master / static / function / const names, but it does **not** collide with a record field of the same name, because `Record` and `Relation` are distinct types with distinct namespaces. A collision is reported as `masterbelt.checker.scope_name_conflict`. ### Diagnostics - A duplicate scope name within one master is reported as `masterbelt.checker.duplicate_scope`. - A scope name that collides with a relation method, nested master, static, function, or const is reported as `masterbelt.checker.scope_name_conflict`. - Declaring or assigning `self` is reported as `masterbelt.parser.reserved_identifier`, because `self` is lexically reserved (see [lexical.md](../language/lexical.md#keywords)). 
- A scope body that does not return the declaring master's `Relation` is reported as `masterbelt.checker.scope_return_type_mismatch`; a block body with no `return` as `masterbelt.checker.scope_missing_return`. - A scope body or query callback that inherits a forbidden effect is reported as `masterbelt.checker.scope_forbidden_effect`. - A direct or transitive self-reference is reported as `masterbelt.checker.cyclic_scope`. - A scope call on a receiver that does not expose the scope is reported as `masterbelt.checker.unknown_member`; an argument mismatch through the general call diagnostics; an unknown field reference inside a callback as `masterbelt.checker.scope_unknown_field`. ## Nested Masters A master declaration may contain other master declarations in its body. Each nested master is a full master declaration: it carries its own record, source, filter, static, and further-nested-master sections, and follows the same well-formedness rules as a top-level master. ```mst master User { record { primary id: int, name: string } source { csv "data/users.csv" } master Friendships { record { primary id: int, owner: int, friend: int } source { csv "data/friendships.csv" } } } ``` A nested master is reached through the parent's surface name with a dot: `User.Friendships`, `User.Friendships.toList()`, `User.Friendships.SomeStatic`. The dotted form behaves identically to a top-level master reference; nested masters are nominal types in the same name space as their parent's static members. ### Naming Two nested masters under the same parent must differ in name. A duplicate is reported as `masterbelt.checker.nested_master_duplicate`. A nested master's name must not collide with any of the parent's own static members (constants, methods, or other nested masters). A collision is reported as `masterbelt.checker.static_member_duplicate`. ### Visibility A nested master may carry the `pub` modifier. 
Visibility is independent from the parent's: a `pub master` may contain a private nested master, and a private parent may contain a `pub` nested one. Cross-module references resolve through the dotted form (`OtherModule.Parent.Child`). ### Codegen Each codegen target emits nested masters as siblings of top-level masters under a flattened identifier built by concatenating the parent's surface name and the nested master's surface name (`UserFriendships` for `master User { master Friendships { ... } }`). When the nesting depth is greater than one, the concatenation extends through every ancestor in declaration order (`UserFriendshipsArchive` for two levels of nesting). The flattened identifier is the only name visible at the target level. Record and relation types follow the same naming policy as top-level masters — `UserFriendshipsRecord`, `UserFriendshipsRelation` — and the MasterData accessor for a nested master is flat (`data.UserFriendships`), not a chained property. Cross-references in the generated source see the same flattened identifier; the dotted Masterbelt-side path does not survive into the target. ### Limits A nested master inherits no behavior from its parent: it carries its own record, sources, filters, statics, and iteration semantics. Records from the parent and records from a nested master are independent collections. The body of a nested master may declare further-nested masters, with no fixed depth limit at the language level. ## Runtime Model The Masterbelt surface form `master Foo { ... }` does not introduce a runtime singleton. Each master decomposes into two distinct runtime concepts: - **Record** — the pure value type for one row, derived from the master's `record { ... }` section. - **Relation** — the typed query / access surface over the master's records. Methods like iteration, primary-key lookup, and (in later revisions) richer queries hang off the relation. 
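For a nested master, the Record and Relation names above derive from the flattened identifier described under Codegen. The following sketch shows that derivation under stated assumptions: `flatName` and `lowerFirst` are hypothetical helpers, and lowercasing only the first rune is a rough stand-in for whatever camelCase rule a target actually applies.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// flatName concatenates the surface names of every ancestor and the nested
// master itself, outermost first, as the Codegen section describes.
func flatName(path []string) string {
	return strings.Join(path, "")
}

// lowerFirst lowercases the first rune only. Treating this as the camelCase
// accessor rule is an assumption made for illustration.
func lowerFirst(s string) string {
	r, size := utf8.DecodeRuneInString(s)
	if size == 0 {
		return s
	}
	return string(unicode.ToLower(r)) + s[size:]
}

func main() {
	name := flatName([]string{"User", "Friendships"})
	fmt.Println(name)              // UserFriendships
	fmt.Println(name + "Record")   // UserFriendshipsRecord
	fmt.Println(name + "Relation") // UserFriendshipsRelation
	fmt.Println(lowerFirst(name))  // userFriendships
}
```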
A program reaches every relation through a **MasterData** value: the dataset entry that holds one materialised import. MasterData is not a singleton. A host application that needs to swap datasets per client (for example "Client v1 receives this content, Client v2 receives that") constructs and routes distinct MasterData values; the surface program never sees the choice. The mapping from a Masterbelt-surface `master Foo` to its runtime parts is fixed: | Surface | Runtime | | --- | --- | | Record type used in user code | `Record` | | Query / access surface | `Relation` | | Dataset entry | `MasterData`, with a flat accessor per master | | Nested master `Parent.Child` | `Record` / `Relation`, reached through the same MasterData with a flat accessor (`data.parentChild` / `data.ParentChild` per target convention) | The Masterbelt source program does not refer to MasterData. The implementation threads the active dataset through whatever mechanism the codegen target chooses (Go: through `context.Context`; TypeScript / C#: through a `data` parameter passed to terminals). A planner writes `Items.toList()` and the implementation rewrites the call against the active dataset. The runtime model belongs to codegen targets. The Masterbelt source language never names `Record`, `Relation`, or `MasterData`: those identifiers are codegen-side only. ### Primary Key Lookup Every relation exposes a built-in primary-key lookup operation that returns the record matching a given primary-key value, or a per-target "no match" sentinel when no record matches. The lookup is part of the runtime model; the Masterbelt source program never names it directly. The method carries the `asyncable`, `cancellable`, and `failable` effects; see [Effect Inheritance](../language/types.md#effect-inheritance). 
Because the Masterbelt source program never names the lookup, no caller has to acknowledge the inheritance; the effect set is baked into each target's `FindBy` emission directly so backends that can fail (Phase 4 JSON loader, future SQLite, ...) surface that path on the generated signature. #### Naming The lookup method is always named `FindBy` (Go / C# PascalCase) or `findBy` (TypeScript camelCase), regardless of the master's primary-key field name or arity. Composite keys disambiguate through the positional argument list rather than through the method name; relations live on their own type, so there is no overload ambiguity even when several masters share a key field name. A master rejected by the checker as `master_primary_missing` ([keys.md](keys.md)) has no primary key and therefore no `FindBy` method. #### Signature The method takes one positional parameter per primary-key field, named after the field and typed with the field's checked type, in source declaration order. There is no aggregate key struct. The return shape follows each target's idiomatic "optional value" convention. The `failable` effect surfaces per the per-target rule (Go: trailing `error` return; TypeScript / C#: a thrown exception): | Target | Return | | --- | --- | | Go | `(Record, bool, error)` — the second result is `true` when a record matched and `false` together with the record's zero value when no record matched. The third result is the backend failure: `nil` when the lookup completed (whether or not a match was found) and non-`nil` when the underlying read failed. | | TypeScript | `Promise<Record \| undefined>` — `findBy` is declared `async` and returns `undefined` for no match. A backend failure surfaces as a thrown exception. | | C# | `Task<Record?>` — `FindBy` is declared `async` and returns `null` for no match. A backend failure surfaces as a thrown exception. 
| Targets that thread the active dataset through a `context.Context` (Go) or an analogous per-target slot keep the same threading convention for `FindBy` as for the other relation methods. The `cancellable` effect adds the threading slot (`ctx context.Context`, `signal: AbortSignal`, `CancellationToken cancellationToken`); the `asyncable` effect wraps the return in `Promise` / `Task` on TypeScript and C#. #### Semantics The lookup returns the first record whose primary-key fields are all equal to the supplied arguments using each target's structural equality on the primary-key field's checked type. Equality on primitive primary-key types follows the host language's natural equality (Go and C# `==`, TypeScript `===`). A master with an empty record set returns the per-target "no match" sentinel for every call. The lookup is observational: it does not re-import data and does not mutate the relation. A subsequent call observes the same record set as the first call within one run, matching the behaviour of [Iteration](#iteration). ### Query API Every relation is itself the chainable query surface: stage operations (`Where`, `OrderBy`, `ThenBy`, `Skip`, `Take`, projection, join) return a fresh relation, and terminal operations (`ToSlice` / `ToList` / `toArray`, `Iter` / `AsAsyncEnumerable`, `FindBy`, `FirstOrDefault`, `Count`, `Any`) execute the accumulated plan against the active dataset. The cross-target model, the operator vocabulary, and the runtime plan AST are defined in [query.md](query.md). ### Join Operator A master whose record carries a `ref` field (see [Reference Fields](relations.md)) exposes a per-ref **join relation** on its query surface. Each such field automatically generates a parallel relation that walks the source records, resolves the ref's expanded primary-key fields against the target relation's `FindBy`, and yields a pair of `(left, right)` records per successful match. 
The join's full contract — pair record, pair field builder, joined relation type, INNER JOIN semantics, and how source-side state flows into the join — is defined in [query.md](query.md#joins). ## JSON Export Format A project's imported master data is serialised to a single JSON document with the flat shape: ```json { "items": [ {"id": 1, "value": 10}, {"id": 2, "value": 20} ], "userFriendships": [ {"owner": 7, "friend": 9} ] } ``` - Top-level keys are the masters' **flat camelCased identifiers**: the same names the per-target MasterData accessor uses (`items`, `userFriendships`). A nested master `master User { master Friendships { ... } }` appears under the flattened-then-camelCased key (`userFriendships`). - Each value is an array of record objects. Records preserve the importer's row order; a master that declared no source section appears with an empty array. - Per-record keys are the master's surface field names as written in source (`id`, `name`, `userId`). Records sort their keys lexicographically inside the file so the on-disk shape is deterministic. - Primitive values map directly: `bool` to JSON `true`/`false`, `string` to JSON string, `null` to JSON `null`, integers whose absolute value fits in 2^53 to JSON number. Integers outside that range serialise as quoted strings so JavaScript-side consumers receive a lossless representation. - Composite values nest naturally: `list` becomes a JSON array, `map` becomes a JSON object keyed by the entries' rendered keys, a nested product becomes a JSON object whose keys follow the same lexicographic sort rule as a top-level record. - The `ref` field expands to the target master's primary-key fields (`field_pk1`, `field_pk2`, ...) before serialisation; nothing in the JSON output reveals the original ref shape. The format is the same regardless of codegen target. 
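The integer rule in the list above can be sketched as follows. This is an illustration, not the exporter's code: `encodeInt` and `maxExact` are hypothetical names, and treating the boundary value 2^53 itself as "fitting" is an assumption, since the text does not say whether the bound is inclusive.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// maxExact is 2^53, the bound named in the export rule.
const maxExact = int64(1) << 53

// encodeInt renders v as a JSON number when |v| <= 2^53 and as a quoted
// decimal string otherwise. The inclusive bound is an assumption.
func encodeInt(v int64) json.RawMessage {
	digits := strconv.FormatInt(v, 10)
	if v >= -maxExact && v <= maxExact {
		return json.RawMessage(digits)
	}
	return json.RawMessage(strconv.Quote(digits))
}

func main() {
	record := map[string]json.RawMessage{
		"value": encodeInt(maxExact + 1),
		"id":    encodeInt(1),
	}
	// encoding/json writes map keys in sorted order, which matches the
	// lexicographic key ordering the format requires.
	out, err := json.Marshal(record)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"id":1,"value":"9007199254740993"}
}
```

The range check compares against both bounds directly instead of taking an absolute value, which avoids overflow for the minimum `int64`.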
See [export-json.md](export-json.md) for the exporter contract and [codegen/golang.md](../codegen/golang.md#master-data), [codegen/typescript.md](../codegen/typescript.md#master-data), and [codegen/csharp.md](../codegen/csharp.md#master-data) for the per-target `LoadJSON` / `loadJSON` / `LoadJson` helper signatures. ## Iteration A master declaration exposes the surface method `toList()` on the master's name. The call returns every imported record in import order; records dropped by the master's filter section ([Filter Section](#filter-section)) are not included in the result. The method carries the `asyncable`, `cancellable`, and `failable` effects; any callable that transitively reaches `toList()` inherits the same effect set silently per [Effect Inheritance](../language/types.md#effect-inheritance). Each codegen target lifts the inherited effects on the surrounding callable's surface (Go: the `ctx` parameter plus an `error` second result; TypeScript: `async`, `AbortSignal`, `Promise`; C#: `async`, `CancellationToken`, `Task`). The Masterbelt source program never writes any of these threading slots; the planner just calls `Items.toList()`. ```mst master Items { record { primary id: int, name: string } source { csv "data/items.csv" } } fn each() { for item in Items.toList() { use(item.id, item.name) } } ``` `Items.toList()` is the surface form. At the target level it lowers to the chainable relation's list terminal (Go `Items.ToSlice(ctx)`; C# `Items.ToList(data, cancellationToken)`; TypeScript `items.toArray(data, signal)`); the planner never writes the dataset directly. The lowering is documented per target in [codegen/golang.md](../codegen/golang.md#master-data), [codegen/typescript.md](../codegen/typescript.md#master-data), and [codegen/csharp.md](../codegen/csharp.md#master-data). A subsequent call to `toList()` observes the same record set as the first call within one run. 
Records are not re-imported on each call; the method is a view over the already-materialised collection. `toList()` is a static member: it is reached through the master's declaration name (`Items.toList()`), not through an instance. The master type name itself does not denote a value; only the static method does. See [query.md](query.md) for the cross-target query model. ## Reserved Keywords The identifiers `master`, `record`, `source`, `filter`, `include`, `exclude`, `primary`, `static`, and `select` are reserved by the master data schema and cannot be used as identifiers in any position. The [scope section](#scope-section) keywords `scope` and `indexed` are **context keywords**: they are matched only at the scope-declaration position inside a master body and remain usable as ordinary identifiers elsewhere. The implicit relation receiver `self` is fully reserved (see [lexical.md](../language/lexical.md#keywords)). Source-kind identifiers (for example `csv`) are not reserved: they are matched only at the source-kind position inside a source section. # Master Data Keys Source: https://masterbelt.dev/spec-src/masterdata/keys.md # Master Data Keys This document defines primary keys, unique keys, indexes, and lookup contracts for master data. ## Primary Keys A primary key uniquely identifies a record within a master. The `primary` field modifier marks a record field as part of the primary key. It appears in the same position as `readonly` and `writable` and is mutually exclusive with them. ```mst master Items { record { primary id: int, name: string, } } master Localizations { record { primary locale: string, primary key: string, text: string, } } ``` A master whose record section declares no `primary` field is a checker error reported as `masterbelt.checker.master_primary_missing`. A master whose record section declares two or more `primary` fields has a composite primary key. The composite key tuple is ordered by source declaration order. 
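A composite key's declaration order matters wherever the key is passed positionally. The sketch below uses the `Localizations` master from the example above and mirrors the positional lookup shape described for the Go target, minus the `ctx` parameter and the `error` result. `LocalizationsRecord` and `findBy` are hypothetical names, and the linear scan is an assumption chosen for brevity.

```go
package main

import "fmt"

// LocalizationsRecord is a hypothetical record type for the Localizations
// master; Locale and Key together form the composite primary key.
type LocalizationsRecord struct {
	Locale string
	Key    string
	Text   string
}

// findBy takes one positional argument per primary-key field, in source
// declaration order (locale, then key), and returns the first match.
func findBy(rows []LocalizationsRecord, locale, key string) (LocalizationsRecord, bool) {
	for _, r := range rows {
		if r.Locale == locale && r.Key == key {
			return r, true
		}
	}
	return LocalizationsRecord{}, false
}

func main() {
	rows := []LocalizationsRecord{
		{Locale: "en", Key: "greeting", Text: "Hello"},
		{Locale: "ja", Key: "greeting", Text: "Konnichiwa"},
	}
	rec, ok := findBy(rows, "ja", "greeting")
	fmt.Println(rec.Text, ok) // Konnichiwa true

	// Swapping the arguments violates the declaration order and finds nothing.
	_, ok = findBy(rows, "greeting", "ja")
	fmt.Println(ok) // false
}
```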
A `primary` field's type must be a value type for which equality is defined by [../language/types.md](../language/types.md). A `primary` field whose type is not equality-comparable is a checker error. `primary` outside a master's record section is a checker error reported as `masterbelt.checker.primary_outside_master_record`. ## Unique Keys and Indexes Unique keys, secondary unique indexes, and lookup-by-index contracts are not yet specified. ## Lookup Contracts Lookup operations against a master by primary key, and the behavior of duplicate-key import failures, are not yet specified. # Master Data Relations Source: https://masterbelt.dev/spec-src/masterdata/relations.md # Master Data Relations This document defines master-to-master references. The `ref` field also drives the per-field Join builder on the source query; see [Join Operator](schema.md#join-operator). Richer relation contracts (cardinality, cascade behavior) will be defined in future revisions of this document. ## Reference Type A `ref<M>` value identifies one row of a master `M` by its primary key. ```mst master A { record { primary id: int, primary name: string, } } master B { record { primary id: int, aRecord: ref<A>, } } ``` `ref` is a built-in generic type with one type argument. The argument must be a master type; any other argument is reported as `masterbelt.checker.ref_non_master_target`. ## Field Expansion A field whose type is `ref<M>` is expanded into the target master's primary-key fields. The expanded fields are introduced at the parent record's top level, named by joining the original field name and the target's primary-key field name with `_`. For the example above, master `B`'s effective record shape is: ``` B.id : int B.aRecord_id : int B.aRecord_name: string ``` The expansion is observable at every consumer of the schema: - **CSV import**: source files must provide one column per expanded field. The example expects `id, aRecord_id, aRecord_name`.
- **Code generation**: the target language emits one field per expanded entry on the generated record type, and emits a per-field Join builder that resolves the ref's expanded primary-key columns against the target master's relation through `FindBy` (see [Join Operator](schema.md#join-operator)). The original `ref` field name does not appear as a single nested field at any consumer in the current revision. ## Post-Filter Collections in Validation The same post-filter record collection that source programs reach through `Master.toList()` also appears as the `table` binding (and its alias `self`) of a master's [validation](validation.md) `all` rule, iterated with `for row in table`. The only source-level relation terminal is `toList()`; `findBy` and the other terminals stay codegen/runtime-level. The relation **stage** operators (`where`, `orderBy`, `thenBy`, `skip`, `take`) are reachable from source only inside a [scope body](query.md#source-level-relation-queries), where a validation `all` rule may also call the master's scopes on `table`. ## Reserved Keywords The identifier `ref` is not reserved: it is matched only when written as a generic type argument application `ref`. Using `ref` as an ordinary identifier elsewhere remains allowed. # Master Data Validation Source: https://masterbelt.dev/spec-src/masterdata/validation.md # Master Data Validation A master's `validation` section declares data quality checks that run over the master's records after import and filtering. Unlike a [filter](schema.md#filter-section), a validator never drops a record: it inspects the post-filter dataset and emits diagnostics. Validation runs during [`masterbelt export`](../tooling/cli.md) before any artifact is written; an error-severity validation failure blocks the entire export. ## Surface Form The validation section is an optional [master body section](schema.md#body-sections). 
It contains one or more scope groups; each group contains one or more named `validate` rules whose bodies assert conditions: ```mst master Records { record { primary ID: int, Name: string, Value: int } validation { each { validate nameRequired { assert row.Name != "" } validate valuePositive { assert row.Value > 0 } } all { validate checkValueSum { let total = 0 for row in table { total = total + row.Value } assert total < 1000 } } } } ``` - `validate` takes a **stable identifier**, not a message string. The identifier names the validator in project configuration (see [Severity Configuration](#severity-configuration)). It must be unique within a master across both `each` and `all` groups. - A single `validate` block may contain several `assert` statements. Each failed `assert` produces one diagnostic. - A validation block needs no `return`. It is a statement block that "passes" when it runs to completion. `return` inside a validation block is rejected (`masterbelt.checker.return_in_validation`). - `assert` is the validation primitive. It is only valid inside a validation block; the checker rejects `assert` elsewhere (`masterbelt.checker.assert_outside_validation`). ## Scopes ### `each` An `each` group runs every rule once per final record. The current record is bound to two implicit names: - `row` — the record. - `self` — an alias for the same record. Both have the master's record type. A rule that fails on one record continues to the next; a failed `assert` inside a rule does not stop later statements or asserts in the same rule. ### `all` An `all` group runs every rule once per master over the whole post-filter record collection. The collection is bound to two implicit names: - `table` — the post-filter relation. - `self` — an alias for the same relation. Both have type `Relation` for the surrounding master `M`. 
A rule iterates the collection with `for row in table` (or `for row in self`) — iterating a relation yields its post-filter records in plan order — and may reach other masters' post-filter records through [`Master.toList()`](query.md#iteration). Because the binding is a relation, an `all` rule may also apply the master's [scopes](schema.md#scope-section) and the relation stage operators to `table` or `self`. ## Execution Semantics Validation runs in a deterministic order: 1. Every source record is imported. 2. Each master's [filter](schema.md#filter-section) is applied. 3. The final record set for each master is built. 4. Validators run in module and source declaration order. `each` rules run in source order, preserving the post-filter record order; `all` rules run in source order. 5. Diagnostics are returned. A validation rule body runs in the evaluator, not in generated code: validators are a build-time contract over the data, and no validation code is emitted into any target language. ## Failure Severity A failed `assert` produces a `masterbelt.validation.assert_failed` diagnostic. Its severity defaults to **error**. Project configuration can override the severity per `(master, validator)` pair to `warning`; see [Severity Configuration](#severity-configuration). - An error-severity failure blocks the export: no artifact is written. - A warning-severity failure is reported but does not block the export. Records are never removed by validation: the failing record is preserved in the export (when the export proceeds). An evaluation error inside a validation rule (an unbound reference, a runtime type error, a division by zero) is a hard error attributed to the rule, surfaced through the underlying evaluator diagnostic or wrapped as `masterbelt.validation.evaluation_failed`. ## Severity Configuration Project configuration keys severity overrides by the **entrypoint-visible master path** and validator ID. 
See [tooling/configuration.md](../tooling/configuration.md#validators) for the schema. The path is the master as it is visible from the entry module — an aliased import uses the alias, and a nested master uses its dotted path — never the flattened codegen name. ```yaml validators: Records: nameRequired: warning valuePositive: error U.Friendships: uniquePair: warning ``` A master is only validated when it is reachable from the entry module: - A master declared in the entry module, or re-exported from it by a single `pub` import, is **visible** and validated under its entrypoint path. - A master that the entry module neither declares nor re-exports is **out of scope**: it has no config-visible name, so its validators do not run. - A master re-exported from the entry under more than one name is **ambiguous**: no single config path identifies it, so its validators do not run and the ambiguity is reported as `masterbelt.validation.ambiguous_master`, which blocks the export. (Two distinct masters re-exported under the *same* name are rejected earlier as `masterbelt.resolver.duplicate_name`.) Only `error` and `warning` are accepted. Configuration is validated before validators run, so a typo is visible even when a master imported zero records: - A master path that matches no master is `masterbelt.validation.config_unknown_master`. - A validator ID that matches no rule under a known master is `masterbelt.validation.config_unknown_validator`. - A severity outside `error` / `warning` is `masterbelt.validation.config_invalid_severity`. Each of these config diagnostics blocks the export. ## Diagnostics | Code | Severity | Meaning | | --- | --- | --- | | `masterbelt.validation.assert_failed` | configured (default error) | An `assert` condition evaluated false. The span is the condition expression; args carry the master path, validator ID, scope, record description, and condition source text. 
| | `masterbelt.validation.evaluation_failed` | error | A validation rule body raised an evaluation error. | | `masterbelt.validation.config_unknown_master` | error | A `validators` config key names a master that does not exist. | | `masterbelt.validation.config_unknown_validator` | error | A `validators` config key names a validator that does not exist on a known master. | | `masterbelt.validation.config_invalid_severity` | error | A `validators` severity is not `error` or `warning`. | | `masterbelt.validation.ambiguous_master` | error | A master is re-exported from the entry under more than one name, so no config path identifies it unambiguously. | For an `each` failure, the record is described by its primary key (the same convention used by [filter exclusion diagnostics](schema.md#filter-section)); for an `all` failure, the record description is `
`. ## Future Work The MVP does not implement PowerAssert-style display of sub-expression values, custom validation messages, the `info` / `hint` severities, or target-language runtime validators. The evaluator boundary already retains each failed assertion's expression and span so PowerAssert-style reporting can be added without a surface change. # Master Data Query Source: https://masterbelt.dev/spec-src/masterdata/query.md # Master Data Query This document defines the cross-target query model exposed over master data collections. The user-visible model is uniform across every target; per-target call shape, naming, and threading slots are documented in [codegen/golang.md](../codegen/golang.md), [codegen/typescript.md](../codegen/typescript.md), and [codegen/csharp.md](../codegen/csharp.md). ## Relation Every master decomposes at the codegen level into a **Record** (the value type for one row) and a **Relation** (the chainable query surface). The relation is the only user-facing model for querying records. A relation is a data-less, immutable value: it carries the source-level master identity plus an ordered list of staged operations (predicates, orderings, skip, take). It does not own records. The active record set is supplied at terminal execution time through a [`MasterData`](schema.md#runtime-model) value resolved by the per-target mechanism (Go: `From(ctx)`; C#: `data` parameter; TypeScript: `data` parameter). Every relation is reached through a per-master entrypoint named after the master: | Target | Entrypoint | | --- | --- | | Go | a package-level value `Items` of type `ItemsRelation` | | C# | a static class field `Items` of type `ItemsRelation` | | TypeScript | a module-level constant `items` of type `ItemsRelation` | Authoring a query never names `MasterData` directly. The user starts from the entrypoint, applies zero or more stages, and finishes with a terminal that resolves the data and returns the result. ```mst for item in Items.toList() { ... 
} ``` In the source program, master iteration uses the surface method `toList()` (see [Iteration](#iteration)). Codegen lowers the call onto the per-target relation terminal that returns the materialised record list. ## Query Plan The relation carries an inspectable **QueryPlan** that records: - the source master identifier (so a backend can dispatch the plan onto its own table); - an ordered list of predicates appended by `Where`; - an ordered list of orderings — the first set by `OrderBy`, subsequent ones appended by `ThenBy`; - a non-negative skip count, defaulting to zero; - a take limit, where a negative value means unlimited. The plan is part of the runtime model; the source program never names it. The runtime types that back the plan are: - `QueryPlan[R]` (or the target equivalent) — the plan value type, parametrised by the record type R. - `Predicate[R]` — an interface backed by exported concrete struct types (`EqPredicate`, `NePredicate`, `LtPredicate`, `LePredicate`, `GtPredicate`, `GePredicate`, `InPredicate`, `BetweenPredicate`, `BoolEqPredicate`, `BoolNePredicate`, `BoolInPredicate`, `AndPredicate`, `OrPredicate`, `NotPredicate`). Each struct exposes its operator-relevant metadata (`FieldRef`, comparison value, operands) as named fields so a backend can translate the node to SQL without invoking the per-record accessor. - `Ordering[R]` — an interface backed by `AscOrdering` and `DescOrdering` carrying the field reference and direction. - `FieldRef` — a structural pointer to a record field by its source-level name. Backends translate `FieldRef.Name` into their column naming convention. - Per-master field handles (`OrderedField[R, V]` / `BoolField[R]`) — typed constructors that produce predicate and ordering nodes against a stable field reference. The structural shape is identical across targets. 
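How a backend might walk such a plan without touching per-record accessors can be sketched as below. This is a simplified illustration under several assumptions: the types are non-generic stand-ins for `QueryPlan[R]` and `Predicate[R]`, only three predicate nodes are modeled, orderings are omitted, `FieldRef.Name` is used directly as the column name, and `planSQL` and `predicateSQL` are hypothetical helpers. Values are interpolated for readability; a real backend would normally bind parameters.

```go
package main

import (
	"fmt"
	"strings"
)

// FieldRef points at a record field by its source-level name.
type FieldRef struct{ Name string }

// Predicate is a simplified stand-in for the plan's predicate interface.
type Predicate interface{ isPredicate() }

type EqPredicate struct {
	Field FieldRef
	Value int
}

type GePredicate struct {
	Field FieldRef
	Value int
}

type AndPredicate struct{ Left, Right Predicate }

func (EqPredicate) isPredicate()  {}
func (GePredicate) isPredicate()  {}
func (AndPredicate) isPredicate() {}

// QueryPlan records the master, the predicates, the skip count, and the
// take limit. A negative Take means unlimited.
type QueryPlan struct {
	Master     string
	Predicates []Predicate
	Skip       int
	Take       int
}

// predicateSQL translates one node by inspecting its named fields only.
func predicateSQL(p Predicate) (string, error) {
	switch n := p.(type) {
	case EqPredicate:
		return fmt.Sprintf("%s = %d", n.Field.Name, n.Value), nil
	case GePredicate:
		return fmt.Sprintf("%s >= %d", n.Field.Name, n.Value), nil
	case AndPredicate:
		l, err := predicateSQL(n.Left)
		if err != nil {
			return "", err
		}
		r, err := predicateSQL(n.Right)
		if err != nil {
			return "", err
		}
		return "(" + l + " AND " + r + ")", nil
	default:
		return "", fmt.Errorf("unsupported predicate node %T", p)
	}
}

// planSQL joins the predicate list as a conjunction and maps a negative
// Take to LIMIT -1, which SQLite treats as no upper bound.
func planSQL(plan QueryPlan) (string, error) {
	parts := make([]string, 0, len(plan.Predicates))
	for _, p := range plan.Predicates {
		s, err := predicateSQL(p)
		if err != nil {
			return "", err
		}
		parts = append(parts, s)
	}
	var b strings.Builder
	b.WriteString("SELECT * FROM " + plan.Master)
	if len(parts) > 0 {
		b.WriteString(" WHERE " + strings.Join(parts, " AND "))
	}
	limit := plan.Take
	if limit < 0 {
		limit = -1
	}
	fmt.Fprintf(&b, " LIMIT %d OFFSET %d", limit, plan.Skip)
	return b.String(), nil
}

func main() {
	plan := QueryPlan{
		Master: "items",
		Predicates: []Predicate{
			GePredicate{Field: FieldRef{Name: "count"}, Value: 10},
			AndPredicate{
				Left:  EqPredicate{Field: FieldRef{Name: "id"}, Value: 1},
				Right: GePredicate{Field: FieldRef{Name: "value"}, Value: 3},
			},
		},
		Skip: 2,
		Take: -1, // unlimited; the zero value would mean "take nothing"
	}
	sql, err := planSQL(plan)
	if err != nil {
		panic(err)
	}
	fmt.Println(sql)
	// SELECT * FROM items WHERE count >= 10 AND (id = 1 AND value >= 3) LIMIT -1 OFFSET 2
}
```

Note that the take limit has to be initialised to a negative value explicitly, because Go's zero value for `int` would otherwise read as a limit of zero.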
Concrete language-level names may differ (`IPredicate` on C#, `Predicate` on TypeScript and Go) but the operator vocabulary and field-metadata contract is the same. The in-memory executor is one consumer of this plan; future backends (indexed JSON, SQLite) are additional consumers of the same plan. The plan stays structurally inspectable so a backend can translate predicates to SQL without ever evaluating the per-record accessor closures the in-memory executor uses. The user program does not see the plan unless it deliberately type-switches on the runtime nodes for testing or debugging. Generated public types are relation-shaped; the internal plan types are reserved for runtime / backend use. ## Stage Operations Stage operations build up the query plan. Every stage returns a new relation; the receiver is never mutated. | Operation | Effect on the plan | | --- | --- | | `Where(predicate)` | Appends `predicate` to the predicate list. Multiple `Where` calls accumulate as a conjunction. | | `OrderBy(ordering)` | Replaces the ordering list with `[ordering]`. The first `OrderBy` wins for the primary sort key; tie-breakers compose through `ThenBy`. | | `ThenBy(ordering)` | Appends `ordering` to the existing ordering list; the existing keys win, the appended key breaks ties. | | `Skip(n)` | Sets the skip count to `n`. Applied after predicates and ordering. | | `Take(n)` | Sets the take count to `n`. Applied after `Skip`. `Take(0)` yields an empty result. A negative value is the per-target "no limit" sentinel. | | `Select()` | Switches to a projected relation. The source-side plan flows into the projection; predicates and orderings added after the call are typed on the projected record. See [Projections](#projections). | | `Join(right)` | Switches to a joined relation. Each surviving source record is matched against the supplied right relation via primary-key lookup; the call returns a pair relation. See [Joins](#joins). 
| Stage operations are pure: they record intent but do not read the dataset. They are synchronous and return immediately. ### Copy-on-write Relations are copy-on-write. A base relation can be shared across independent chains without aliasing: ```go base := Items.Where(ItemsFields.Count.Ge(10)) a := base.Take(1) b := base.Take(2) ``` `a`, `b`, and `base` are independent values: each terminal sees only the stages explicitly chained onto its receiver, and stages applied to one do not affect the others. The same property holds across every target. ## Terminal Operations Terminal operations resolve the active dataset, execute the plan, and return a result. Every terminal carries the `asyncable`, `cancellable`, and `failable` effects so backends that can fail surface the failure through the per-target failable transport (Go: trailing `error`; TypeScript: thrown exception; C#: thrown exception). | Operation | Result | | --- | --- | | `ToSlice` / `ToList` / `toArray` | The full filtered, sorted, skipped, taken record sequence as a list. This is the canonical list terminal; an unfiltered relation returns every record. | | `Iter` / `AsAsyncEnumerable` | A streaming iterator over the same sequence (Go: `iter.Seq2[Record, error]`; C#: `IAsyncEnumerable`). TypeScript exposes `toArray` only; an iterator API may follow in a later phase. | | `FindBy` / `findBy` | Primary-key lookup. Applies the relation's predicate chain to the matched record so a filter chained before `FindBy` excludes records that would otherwise match the key. Skip / Take / OrderBy stages do not affect `FindBy`. Returns the per-target "no match" sentinel when no record satisfies both the key and the filter chain. | | `FirstOrDefault` / `firstOrDefault` | The first record of the same sequence as `ToSlice`, paired with the per-target no-match sentinel (Go `(record, false, nil)`; TS `undefined`; C# `null`). | | `Count` / `count` | The cardinality of the sequence as an integer.
| | `Any` / `any` | Boolean: true when the sequence is non-empty. | Terminal execution is observational: it does not re-import data, does not mutate the relation, and observes the same record set across calls within one run. The legacy `All` / `all` terminal is no longer part of the documented API. The list terminal (`ToSlice` / `ToList` / `toArray`) is the only documented way to materialise every record. ### Iteration Iterator Semantics The iterator terminals consume the same plan as the list terminal and yield records one at a time: - The iterator stops cleanly after the last record. - A backend failure surfaces as one (zero record, error) pair before the iterator stops, so the loop body can observe and propagate the error. - A relation whose plan resolves successfully but produces no records yields zero pairs and stops. The in-memory backend materialises the result before yielding; future backends are free to stream record by record as long as the iterator's external contract holds. ### FindBy and Filters `FindBy` honours every `Where` predicate already chained onto the relation but ignores `OrderBy`, `ThenBy`, `Skip`, and `Take` because they cannot change which row is identified by the primary key. The lookup proceeds as: 1. Scan the master's records for a row whose primary-key fields all equal the supplied arguments. 2. If no row matches, return the per-target no-match sentinel. 3. Apply the relation's predicate list to the matched row. If any predicate fails, return the per-target no-match sentinel. 4. Otherwise, return the matched row. Projected and joined relations do not expose `FindBy`: their output records do not have a primary-key concept. ## Field Builder Predicates and orderings are constructed through a per-master typed handle that exposes one entry per supported record field. - Go uses a package-level `Fields` variable: `ItemsFields.Count.Ge(10)`. - C# passes a callback typed on the same `Fields` class: `Items.Where(item => item.Count.Ge(10))`. 
- TypeScript passes a callback typed on the same `Fields` interface: `items.where(item => item.count.ge(10))`. A field handle is not a record value. Its purpose is to construct predicate and ordering nodes that carry the field reference, the operator name, and the operand value. The operator vocabulary, uniform across targets: | Method | Applies to | Builds | | --- | --- | --- | | `Eq(value)` / `eq(value)` | any handle | `<field> == <value>` | | `Ne(value)` / `ne(value)` | any handle | `<field> != <value>` | | `Lt(value)` / `lt(value)` | ordered handles only | `<field> < <value>` | | `Le(value)` / `le(value)` | ordered handles only | `<field> <= <value>` | | `Gt(value)` / `gt(value)` | ordered handles only | `<field> > <value>` | | `Ge(value)` / `ge(value)` | ordered handles only | `<field> >= <value>` | | `In(values...)` / `in(values...)` | any handle | `<field> ∈ {values...}` | | `Between(low, high)` / `between(low, high)` | ordered handles only | inclusive range `low <= <field> <= high` | | `Asc()` / `asc()` | ordered handles only | ascending ordering | | `Desc()` / `desc()` | ordered handles only | descending ordering | A bool field handle exposes only `Eq`, `Ne`, and `In`; calling an ordering or range method on a bool handle is a compile-time error because bool is comparable but not ordered. The combinators `And`, `Or`, `Not` (or their per-target equivalents `Predicates.And` etc.) compose predicates within a single record type. Mixing record types across a combinator is a compile-time error. ## Source-Level Relation Queries The stage operators above are codegen-/runtime-level for direct use: a Masterbelt source program cannot write `Items.Where(...)` at an arbitrary call site, and the only relation terminal reachable from source is `toList()`. The exception is a master's [scope section](schema.md#scope-section): inside a scope body the relation **stage** operators are available as source-level methods on the master's `Relation`, so a scope can name a reusable query fragment that the rest of the program applies by calling the scope.
`Relation<M>` is the source-level type of the query surface for master `M`. It is a data-less, immutable, copy-on-write value carrying the staged plan described under [Query Plan](#query-plan); a scope receives one as `self`, threads stages onto it, and returns one. The master's surface name is the base relation entrypoint, so `Records` denotes the base `Relation` from which `Records.adult()` and `Records.gendered(1)` start. `Relation<M>` is distinct from `Record<M>` and from `list<Record<M>>`: a record is one row, a list is a materialised sequence, and a relation is the unresolved query. ### Source-Level Stage Operators A scope body reaches these stage methods on `Relation<M>`. Each returns a fresh `Relation<M>`; the receiver is never mutated. | Method | Effect on the plan | | --- | --- | | `self.where(fn(row) => <predicate>)` | Appends the predicate. Multiple `where` calls accumulate as a conjunction. | | `self.orderBy(fn(row) => <ordering>)` | Sets the primary ordering key, replacing any existing ordering list. | | `self.thenBy(fn(row) => <ordering>)` | Appends a tie-breaker ordering key. | | `self.skip(n)` | Sets the skip count. | | `self.take(n)` | Sets the take limit. | The user-declared scopes of `M` surface as additional methods on `Relation<M>` (see [schema.md#calling-and-chaining](schema.md#scope-section)). The source-level surface deliberately omits the terminal operators (`toList`, `findBy`, `count`, …) and the projection / join switches: those carry effects or change the record type, neither of which a scope may do. `skip` and `take` keep the same runtime/codegen semantics as the corresponding plan stages. ### Query Callbacks `where`, `orderBy`, and `thenBy` take a callback written with the ordinary `fn(row) => …` surface syntax, but the checker and lowering treat the callback as a **query predicate / ordering DSL**, not as an ordinary function value: - `row` is a field-handle view of the record `M`, not a record value.
`row.` resolves to the per-field handle for that field; a reference to a field the record does not declare is reported as `masterbelt.checker.scope_unknown_field`. - A `where` callback must produce a predicate; an `orderBy` / `thenBy` callback must produce an ordering. - The callback may reference scope parameters and module-level user functions / statics / consts. Such references are runtime parameters of the resulting predicate. A referenced function / static / const that carries `failable` / `cancellable` / `asyncable` is reported as `masterbelt.checker.scope_forbidden_effect`; the callback itself never inherits those effects. - No new callback syntax is introduced — the `fn(...) => ...` spelling is reused. ### Field-Handle Operators `row.` exposes the same operator vocabulary as the codegen [Field Builder](#field-builder), using the lowercase source spelling: | Method | Applies to | Builds | | --- | --- | --- | | `eq(value)` / `ne(value)` | any handle | equality / inequality predicate | | `lt` / `le` / `gt` / `ge` `(value)` | ordered handles only | comparison predicate | | `in(values...)` | any handle | membership predicate | | `between(low, high)` | ordered handles only | inclusive range predicate | | `asc()` / `desc()` | ordered handles only | ascending / descending ordering | A bool handle exposes only `eq`, `ne`, and `in`; an ordering or range method on a bool handle is a compile-time error. The comparison method names are exactly `eq` / `ne` / `lt` / `le` / `gt` / `ge` / `in` / `between` — `gteq` is not used. Predicates compose through the function-form combinators `and(a, b)` / `or(a, b)` / `not(a)`. ```mst scope youngAdults(maxAge: int) { return self .where(fn(row) => and(row.age.ge(20), row.age.le(maxAge))) .orderBy(fn(row) => row.age.asc()) .thenBy(fn(row) => row.name.asc()) } ``` ### Evaluation Model A scope is a lazy query-plan builder. A scope call does not scan records: it returns a relation that carries the accumulated plan. 
The records are touched only when a terminal (such as `toList()`) resolves the relation against the active dataset, and they are always the master's **final, post-filter** records — a scope never runs against source/pre-filter records. A scope chain inlines into a single plan: `self.adult().gendered(g)` is equivalent to applying `adult`'s stages then `gendered`'s stages to one relation. See [evaluation.md](../language/evaluation.md#scope-evaluation). ### Validation Interaction A master's [validation](validation.md) `all` rule binds the post-filter collection to `table` (and its alias `self`). That binding is a relation, so an `all` rule may call the master's scopes on `table` (or `self`). A validation `each` rule binds `row` to a `Record`, which has no relation surface, so scopes are not callable from an `each` rule. Scopes remain unavailable from filter rule bodies; a master static body may call scopes through the relation entrypoint. ## Projections A `select Name { ... }` section ([schema.md#select-section](schema.md#select-section)) emits a projected relation type alongside the source relation. The chain `Items.Select()` returns a relation typed on the projected record; `Where`, `OrderBy`, `ThenBy`, `Skip`, `Take` on the projected relation operate on the projected record type. Projection runs at terminal time after the source-side plan has filtered, sorted, skipped, and taken the source records. The runtime copies the named fields into a fresh projected record for each surviving source record, then applies the projected plan to the resulting list. The relation's underlying record set is never reshaped — projection is a view that materialises on each terminal call. Projected relations expose the same terminals as source relations except for `FindBy`. They do not redeclare the input record's primary-key columns, so a primary-key lookup is not defined. 
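The projection step described above can be pictured with a small Go sketch. It assumes the source-side plan has already produced the surviving records, and it uses hypothetical `Item` and `ItemSummary` record types plus a hypothetical `project` helper; the generated code for a real `select` section will differ by target.

```go
package main

import "fmt"

// Item is a hypothetical source record.
type Item struct {
	ID    int
	Name  string
	Count int
}

// ItemSummary is a hypothetical projected record holding a subset of fields.
type ItemSummary struct {
	ID   int
	Name string
}

// project copies the named fields of each surviving source record into a
// fresh projected record, then applies a projected-side predicate to it.
func project(survivors []Item, keep func(ItemSummary) bool) []ItemSummary {
	out := make([]ItemSummary, 0, len(survivors))
	for _, it := range survivors {
		s := ItemSummary{ID: it.ID, Name: it.Name}
		if keep(s) {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	// survivors stands for the records left after the source-side plan
	// (filter, sort, skip, take) has already run.
	survivors := []Item{{1, "alpha", 10}, {2, "beta", 20}}
	got := project(survivors, func(s ItemSummary) bool { return s.ID > 1 })
	fmt.Println(got)
}
```

The source slice is never modified, which mirrors the statement that projection is a view materialised on each terminal call.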
## Joins A `ref` field on a master's record emits a joined relation alongside the source relation ([schema.md#join-operator](schema.md#join-operator)). The chain `B.Join(A)` returns a relation typed on a `(Left, Right)` pair record where `Left` is the source record and `Right` is the target record. Phase 4 implements **INNER JOIN** semantics: a left record whose ref does not resolve against the right relation is dropped from the pair sequence. The right relation is supplied at the call site as a relation value (typically the package-level relation for the target master). Source-side stages flow into the join: `B.Where(...).OrderBy(...).JoinARecord(A)` runs the source-side filter / sort first, then for each surviving left record looks up the matched right record through the right relation's primary-key lookup. Pair-level stages added after the `Join` call apply to the pair sequence. Joined relations expose the same terminals as source relations except for `FindBy`. ## Iteration The Masterbelt surface program iterates master records through the `toList()` method: ```mst for item in Items.toList() { use(item.id, item.name) } ``` `Items.toList()` is the source-level surface form. Codegen lowers the call onto each target's relation list terminal (Go `Items.ToSlice(ctx)`; C# `Items.ToList(data, cancellationToken)`; TypeScript `items.toArray(data, signal)`). When the iteration appears inside the owning master's own static body, the receiver is the surrounding relation method's `r` parameter so a filter chained onto `r` before reaching the iteration is honoured. The method carries the `asyncable`, `cancellable`, and `failable` effects per [Effect Inheritance](../language/types.md#effect-inheritance); the source program never has to acknowledge the inheritance. Iteration over a master is observational: a subsequent call returns the same record set as the first call within one run. The legacy `all()` method name is no longer supported. 
Source programs migrating from the previous API must rename `Master.all()` to `Master.toList()`. A master's [validation](validation.md) `all` rule binds the master's post-filter record collection to `table` (and to its alias `self`). The rule iterates it the same way, with `for row in table`, which is equivalent to iterating `table.toList()`; a rule that needs another master's post-filter records reaches them through `Master.toList()`. The only source-level relation **terminal** is `toList()`: `findBy` and the other terminals stay codegen/runtime-level. The relation **stage** operators (`where`, `orderBy`, `thenBy`, `skip`, `take`) are reachable from source only inside a [scope body](#source-level-relation-queries), where they build a relation plan rather than execute it. ## Backend Abstraction The runtime separates the plan from its executor. Today the only executor is in-memory: it resolves records through `MasterData`, evaluates each predicate's accessor closure against every candidate row, sorts the survivors, and applies skip / take. The plan is intentionally structural so a non-memory backend can consume it without executing the in-memory accessor closures. A SQLite backend (a deferred phase) would: - inspect `QueryPlan.Source.Name` to route to the corresponding table; - walk the predicate AST and translate each concrete node (`EqPredicate{Field, Value}`, `BetweenPredicate{Field, Low, High}`, `AndPredicate{Operands}`, ...) into SQL; - walk the ordering AST into `ORDER BY` clauses; - emit `OFFSET` / `LIMIT` from `Skip` / `Take`; - execute the SQL and materialise the result into the record type. The public API stays unchanged: a backend swap is invisible to the source program. Generated relation types and terminals do not depend on the executor; they construct plans and invoke the executor through a per-target dispatch seam. This document does not specify the SQLite backend. 
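As an illustration of the in-memory execution order described at the start of this section (predicates, then sort, then skip, then take), here is a minimal Go sketch. The `plan` struct, the `execute` function, and the `Item` record are hypothetical simplifications: predicates are plain closures and ordering is a single comparison function, which is not how the structural plan nodes are represented.

```go
package main

import (
	"fmt"
	"sort"
)

// Item is a hypothetical record type.
type Item struct {
	ID    int
	Count int
}

// plan is a simplified stand-in for the query plan.
type plan struct {
	predicates []func(Item) bool
	less       func(a, b Item) bool // nil keeps source order
	skip       int                  // non-negative
	take       int                  // a negative value means unlimited
}

// execute filters, sorts, skips, and takes, in that order.
func execute(records []Item, p plan) []Item {
	// 1. Keep rows that satisfy every predicate (a conjunction).
	var out []Item
	for _, r := range records {
		ok := true
		for _, pred := range p.predicates {
			if !pred(r) {
				ok = false
				break
			}
		}
		if ok {
			out = append(out, r)
		}
	}
	// 2. Sort the survivors.
	if p.less != nil {
		sort.SliceStable(out, func(i, j int) bool { return p.less(out[i], out[j]) })
	}
	// 3. Apply skip.
	if p.skip >= len(out) {
		return nil
	}
	out = out[p.skip:]
	// 4. Apply take; Take(0) yields an empty result.
	if p.take >= 0 && p.take < len(out) {
		out = out[:p.take]
	}
	return out
}

func main() {
	records := []Item{{1, 5}, {2, 30}, {3, 20}, {4, 10}}
	p := plan{
		predicates: []func(Item) bool{func(i Item) bool { return i.Count >= 10 }},
		less:       func(a, b Item) bool { return a.Count > b.Count }, // descending
		skip:       1,
		take:       1,
	}
	fmt.Println(execute(records, p)) // prints [{3 20}]
}
```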
The plan shape exists today so the future backend can land without breaking existing source programs. # CSV Import Source: https://masterbelt.dev/spec-src/masterdata/import-csv.md # CSV Import This document defines CSV import behavior for master data. ## Source Form A CSV source entry appears inside a master's source section. Its surface form is the source entry form defined in [schema.md](schema.md): ```mst master Items { record { primary id: int, name: string } source { csv "data/items.csv" csv "data/items.en.csv" { separator: ",", } } } ``` The source-kind identifier is `csv`. The string literal that follows names the CSV file. Path resolution is defined in [../tooling/configuration.md](../tooling/configuration.md). ## Options The option clause is optional. The defined keys are: - `separator` — `string`. The single-character field separator. Default: `","`. Validation rules: - An option name not listed above is reported as `masterbelt.checker.master_source_option_unknown`. - An option value whose type does not match the declared type above is reported as `masterbelt.checker.master_source_option_type_mismatch`. The full set of option keys and the per-importer parsing rules will be extended by future updates to this specification before adding new behavior. # XLSX Import Source: https://masterbelt.dev/spec-src/masterdata/import-xlsx.md # XLSX Import This document will define XLSX import behavior for master data. # JSON Export Source: https://masterbelt.dev/spec-src/masterdata/export-json.md # JSON Export The JSON exporter serialises an entire project's imported master data into one JSON document. Its on-disk shape is the **canonical Masterbelt export format** consumed by every codegen target's `LoadJSON`-shaped helper. ## Kind The export-kind identifier is `json`. ## Configuration A JSON export is configured under `exports:` in the project configuration file: ```yaml exports: - kind: json out: data/masterdata.json ``` - `kind` selects this exporter. 
- `out` is the file system path the exporter writes to. Relative paths resolve against the project root. The exporter creates any missing parent directories. No `options` keys are recognised at this stage; unknown keys are silently ignored to leave room for future extensions. ## Document Shape The exporter writes one JSON object per run. Top-level keys are the masters' **flat camelCased identifiers** (the same name the per-target MasterData accessor uses). Values are arrays of record objects in importer-supplied order: ```json { "items": [ {"count": 10, "id": 1, "name": "alpha"}, {"count": 20, "id": 2, "name": "beta"} ], "userFriendships": [ {"friend": 9, "owner": 7} ] } ``` - Top-level keys appear in the project's master-declaration order. A master that declared no source section appears with an empty array. - Per-record keys appear in lexicographic order so two records with the same field set always produce byte-identical encodings regardless of producer iteration order. - The exporter ends the file with a trailing newline. A nested master `master User { master Friendships { ... } }` appears at the top level under its flattened-then-camelCased name (`userFriendships`) — no nested object layout is used. ## Value Encoding | Masterbelt value | JSON encoding | | --- | --- | | `null` | `null` | | `bool` | `true` / `false` | | `int` whose abs value < 2^53 | JSON number (decimal digits) | | `int` whose abs value >= 2^53 | JSON string (quoted decimal digits) | | `string` | JSON string with standard escapes | | `list` | JSON array, elements recursed | | `map` | JSON object keyed by rendered key, entries sorted by key | | nested product | JSON object, keys sorted lexicographically | The safe-integer guard at 2^53 keeps the document round-trippable through JavaScript consumers. 
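The integer rows of the table can be sketched in Go as follows. This is a minimal illustration that assumes the value fits in an `int64`; `encodeInt` and `safeLimit` are hypothetical names, not part of the exporter.

```go
package main

import (
	"fmt"
	"strconv"
)

// safeLimit is 2^53, the bound used by the safe-integer guard.
const safeLimit = int64(1) << 53

// encodeInt renders an integer as a bare JSON number when its absolute
// value is below 2^53, and as a quoted decimal string otherwise.
func encodeInt(v int64) string {
	digits := strconv.FormatInt(v, 10)
	if v > -safeLimit && v < safeLimit {
		return digits
	}
	return strconv.Quote(digits)
}

func main() {
	fmt.Println(encodeInt(42))           // prints 42
	fmt.Println(encodeInt(1 << 53))      // prints "9007199254740992"
	fmt.Println(encodeInt(-(1<<53) + 1)) // prints -9007199254740991
}
```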
Loaders that promote those values back to a numeric type are free to detect the quoted form and parse it through a big-integer constructor; the exporter never emits a `bigint` JavaScript literal because plain JSON does not have one. ## Field Names Per-record keys are the master's surface field names as written in Masterbelt source. Names are emitted verbatim; no case transformation is applied at the field level. A `ref` field expands to the target master's primary-key fields under the surrounding field's name joined with `_` (`field_pk1`, `field_pk2`, ...) before serialisation. ## Determinism Two runs against the same input produce byte-identical output: - Top-level keys appear in master-declaration order. - Per-record keys are sorted lexicographically inside each object. - Map entries are sorted by their rendered key. - List elements preserve the producer's order (the producer's order is itself deterministic for every importer defined in [import-csv.md](import-csv.md) and friends). ## Loader Contract Every codegen target produces a `LoadJSON` / `loadJSON` / `LoadJson` helper that consumes this format and returns a fully wired `MasterData`. The per-target signatures and casing are defined in: - [../codegen/golang.md](../codegen/golang.md#master-data) - [../codegen/typescript.md](../codegen/typescript.md#master-data) - [../codegen/csharp.md](../codegen/csharp.md#master-data) The format is the contract between the exporter and the loaders; changes to the document shape are coordinated through all three target specifications. # SQLite Export Source: https://masterbelt.dev/spec-src/masterdata/export-sqlite.md # SQLite Export The SQLite exporter serialises an entire project's imported master data into one SQLite database file. The database is a self-describing, queryable artifact that scales to data sets the JSON exporter would inflate beyond comfortable distribution sizes. ## Kind The export-kind identifier is `sqlite`. 
## Configuration A SQLite export is configured under `exports:` in the project configuration file: ```yaml exports: - kind: sqlite out: data/masterdata.db ``` - `kind` selects this exporter. - `out` is the file system path the exporter writes to. Relative paths resolve against the project root. The exporter creates any missing parent directories. If the file already exists it is overwritten. No `options` keys are recognised at this stage; unknown keys are silently ignored to leave room for future extensions. ## Database Shape The exporter writes one SQLite database per run. The database contains: - One table per master, named with the master's **flat camelCased identifier** (the same name the per-target MasterData accessor and the JSON exporter's top-level key use). - One metadata table, `_masterbelt_meta`, that records the export's format version and the producing tool's identity. A nested master `master User { master Friendships { ... } }` appears as a single top-level table `userFriendships` — no nested table layout is used. Every table is created with `STRICT` so SQLite enforces declared column affinities at insert time. Records are inserted in importer-supplied order. ## Table Layout For each master the exporter emits a `CREATE TABLE` statement whose columns mirror the master's record fields: - Column names are the master's surface field names as written in Masterbelt source. Names are emitted verbatim with no case transformation. - A `ref` field expands to the target master's primary-key columns under the surrounding field's name joined with `_` (`field_pk1`, `field_pk2`, ...) — the same expansion the JSON exporter applies. - A primary key declared by `primary` on the master is materialised as a SQL `PRIMARY KEY` clause. Composite keys appear as `PRIMARY KEY (col1, col2, ...)` in the order declared on the master. 
- The implicit primary-key index aside, the only secondary indexes are those inferred from a master's [`indexed scope`](#secondary-indexes-from-indexed-scopes) declarations. ### Column Types Masterbelt scalar types map to SQLite column affinities as follows: | Masterbelt value | SQLite type | | --- | --- | | `bool` | `INTEGER` (stored as `0` / `1`) | | `int`, `int8`, `int16`, `int32`, `int64`, `uint8`, `uint16`, `uint32`, `uint64` | `INTEGER` | | `string` | `TEXT` | Phase 1 does not infer `NOT NULL` constraints. Every column is created without `NOT NULL` and accepts `NULL` so a value that cannot be represented (out-of-range integer, future nullable value) can be stored as `NULL` while the rest of the row still inserts. A future revision will tie `NOT NULL` emission to a richer schema-side nullability model once one exists; until then, loaders must not rely on the database to reject `NULL` values. Composite-shaped fields (`list`, `map`, nested products that did not flatten through `ref`) do not have a Phase 1 SQLite representation. The exporter omits these fields from the master's table entirely and reports `masterbelt.exporter.sqlite.value_unsupported` once per affected field. The master's remaining primitive columns are still emitted and populated so the rest of the database remains usable. An integer value whose magnitude does not fit in SQLite's 8-byte signed `INTEGER` range produces the same `masterbelt.exporter.sqlite.value_unsupported` diagnostic for the offending row and is stored as `NULL` so the surrounding rows still load. ### Secondary Indexes from Indexed Scopes A master's [`indexed scope`](schema.md#indexed-scopes) declarations drive secondary-index generation. The exporter inlines each indexed scope's relation plan (its `where` predicates and order-by stages, with chained scopes expanded in place) and infers `CREATE INDEX` statements from it. 
The inference is purely structural: it reads the lowered query plan, never executes it, and only the SQLite backend consults it; other targets ignore `indexed` without a diagnostic. #### Inferable Plan Shapes These plan fragments contribute index columns: - Equality predicates (`eq`), including `bool` equality. - Range predicates (`lt`, `le`, `gt`, `ge`). - `between` and `in` predicates. - `orderBy` / `thenBy` orderings, including descending order (emitted as `DESC` in the index). - `not(p)` contributes the field(s) of its inner predicate. - `or(a, b)` over distinct fields contributes those fields; a more complex `or` falls into partial success (below). `skip` and `take` are ignored for inference. Collation and null ordering are out of scope. The exporter generates neither expression indexes nor unique indexes, and never emits a partial (`WHERE`) index: a literal or parameter predicate produces an ordinary index. #### Column Ordering When a single scope mixes equality, range, and order-by usage, the composite index orders columns **equality columns → range column → order-by columns**. Multiple equality predicates order by record field declaration order. Multiple `where` calls merge as a conjunction before inference. A chained scope (`genderedAdult(g) => self.adult().gendered(g)`) inlines its stages, so `gender == $g and age >= 20 order by name` yields the composite index `(gender, age, name)`, and a bare `age >= 20` yields `(age)`. #### Generation, Naming, and Deduplication - One scope may yield more than one index. - An inferred index identical to the master's primary key is not generated. - Identical indexes inferred from multiple scopes are deduplicated. - An index is named `idx_<table>_<scope>`. When one scope yields several indexes, the second and later carry a numeric suffix starting at `2`: `idx_<table>_<scope>`, `idx_<table>_<scope>_2`, `idx_<table>_<scope>_3`. If the resulting name still collides with an already-emitted index name, a `_<n>` disambiguating suffix is appended. - The DDL is `CREATE INDEX <name> ON <table>(<columns>);` with no `IF NOT EXISTS`, because the export artifact is created fresh on every run. #### Diagnostics - Generating an index emits `masterbelt.scope.index_generated` (info) with the index name. - An `indexed scope` that cannot be turned into an index, or only partially, emits `masterbelt.scope.index_inference_failed` (warning); any inferable part is still generated. A scope parameter that is never used in an indexable predicate or ordering is not itself an error. ### Metadata Table The metadata table is created as: ```sql CREATE TABLE _masterbelt_meta ( key TEXT PRIMARY KEY, value TEXT ) STRICT; ``` The exporter populates the following keys on every run: | Key | Value | | --- | --- | | `format` | The literal string `masterbelt.sqlite`. | | `format_version` | The SQLite export format version. Phase 1 emits `1`. | | `masterbelt_version` | The producing tool's release identifier, or `dev` for unstamped builds. | | `created_at` | The export's wall-clock time as an RFC 3339 timestamp in UTC. | `format_version` is reserved for breaking changes to the SQLite export layout itself. Additive schema changes (new metadata keys, additional indexes, additional columns) do not bump the version; consumers must ignore keys they do not recognise. ## Determinism Two runs against the same input and the same producer version produce byte-identical row contents: - Tables are created in master-declaration order. - Records are inserted in importer-supplied order (already deterministic for every importer defined in [import-csv.md](import-csv.md) and friends). - Composite primary keys are written column-by-column in the order declared on the master. The `created_at` metadata entry intentionally records the run's wall-clock time and is therefore not byte-deterministic across runs. Tools that need a byte-deterministic database can post-process the metadata table; the rest of the database remains stable across identical inputs.
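Returning to the column-ordering rule for inferred secondary indexes (equality columns, then the range column, then order-by columns), the following Go sketch works through the `(gender, age, name)` example from that rule. The `usage` struct and `indexColumns` function are hypothetical, and skipping a column that was already emitted is an assumption of this sketch rather than stated exporter behavior.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// usage summarises how one scope's inlined plan touches fields.
type usage struct {
	equality []string // fields used in equality predicates
	rangeCol string   // field used in a range predicate, "" if none
	orderBy  []string // order-by fields, in plan order
}

// indexColumns orders columns as equality, then range, then order-by.
// declOrder maps a field name to its declaration position in the record.
func indexColumns(u usage, declOrder map[string]int) []string {
	// Equality columns follow record field declaration order.
	eq := append([]string(nil), u.equality...)
	sort.SliceStable(eq, func(i, j int) bool { return declOrder[eq[i]] < declOrder[eq[j]] })

	seen := map[string]bool{}
	var cols []string
	add := func(c string) {
		// Assumption: a column already present is not repeated.
		if c != "" && !seen[c] {
			seen[c] = true
			cols = append(cols, c)
		}
	}
	for _, c := range eq {
		add(c)
	}
	add(u.rangeCol)
	for _, c := range u.orderBy {
		add(c)
	}
	return cols
}

func main() {
	decl := map[string]int{"id": 0, "name": 1, "age": 2, "gender": 3}
	// gender == $g and age >= 20 order by name
	u := usage{equality: []string{"gender"}, rangeCol: "age", orderBy: []string{"name"}}
	fmt.Println("(" + strings.Join(indexColumns(u, decl), ", ") + ")") // prints (gender, age, name)
}
```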
## Loader Contract The SQLite export is the input format for the SQL-storage code generation modes documented in: - [../codegen/golang.md](../codegen/golang.md#master-data) - [../codegen/typescript.md](../codegen/typescript.md#master-data) - [../codegen/csharp.md](../codegen/csharp.md#master-data) A generated runtime does not validate the database's full schema at startup. Missing tables or columns surface as ordinary SQL errors when the affected query runs. Unknown extra tables, columns, indexes, or metadata keys are silently tolerated so additive schema evolution does not break older runtimes. ## Diagnostics The exporter emits the following diagnostic codes: - `masterbelt.exporter.sqlite.open_failed` — opening or creating the output database failed. - `masterbelt.exporter.sqlite.exec_failed` — executing a `CREATE TABLE` or `INSERT` statement failed. - `masterbelt.exporter.sqlite.value_unsupported` — a Masterbelt value shape cannot be represented as a SQLite column value (composite fields, oversized integers, ...). The offending row is still inserted with the affected column left `NULL`. [Secondary-index inference](#secondary-indexes-from-indexed-scopes) additionally emits `masterbelt.scope.index_generated` (info) and `masterbelt.scope.index_inference_failed` (warning). # MessagePack Export Source: https://masterbelt.dev/spec-src/masterdata/export-msgpack.md # MessagePack Export This document will define MessagePack export behavior for master data. # Code Generation Model Source: https://masterbelt.dev/spec-src/codegen/model.md # Code Generation Model Code generation turns a Masterbelt program into source files in one or more target languages. This document defines the user-visible parts of that pipeline shared by every target language. ## Inputs and Outputs Code generation consumes the Masterbelt program after type checking has completed and lowering has produced normalized program modules. 
Each Masterbelt source file contributes one program module to the generation input. Type declarations contribute to generation when the target language requires a native form for them (for example, a Go struct type for a product-typed declaration); otherwise their substitution has already been applied to types reaching this stage. For each configured target, code generation produces a set of output files. Each output file is identified by a relative path. The path is relative to the target's configured output root. A generation invocation reports diagnostics emitted by the targets it ran. A target that reports any error severity diagnostic is treated as a failed target and no further files for that target are written. ## Targets A target represents one code generation output. A target has: - A kind identifier such as `golang`, `typescript`, or `csharp`. The kind selects which generation behavior is used. - An output root directory. Generated file paths are interpreted relative to this directory. - An options object whose contents are interpreted only by the target whose kind it carries. The shape of the options object is defined by each target's own specification. Multiple targets of the same kind are allowed; they generate independently with their own output roots and options. A target's generation is deterministic with respect to its inputs and options. The same input program and the same options produce the same files. ## Output Files Every generated file is identified by: - A path relative to the target's output root, using forward slashes as separators. - A byte content that is the file's exact contents. A target generates a self-consistent file set: every file required to use the generated code is included. Targets do not produce loose fragments that the user must combine. How the file set maps to language constructs (one file per module, multiple files, header/runtime split) is defined by each target's own specification. 
## Overwrite Behavior Generated files always overwrite any file at the same path. Code generation is not incremental: each invocation writes the complete file set for each target. ## Target Selection The set of targets to run is configured in the project configuration. The project configuration also carries each target's output root and options. See `tooling/configuration` for the schema. Tools that run code generation either run every configured target or the explicitly selected subset of them. Selecting a target that is not configured is a usage error. ## Symbol References and Imports Generated code refers to two kinds of symbols: symbols declared by the same generation invocation, and symbols from outside the generated package such as standard library entries or hand-written runtime helpers. The generation model treats every reference as a pair of (origin, name), where origin is either the local package being generated or an external module identifier whose meaning is target-specific. Targets do not concatenate raw import paths into rendered source. Targets express references as symbols, and the rendering layer is responsible for: - Producing the language's import declaration block for every external origin reached during rendering, exactly once. - Choosing a unique import alias when two external origins share the same default alias. - Rewriting symbol references in the rendered source to use the chosen alias. This responsibility lives in the rendering layer because the set of imports cannot be determined until every reference has been visited. The rendering layer is also where conflict-free aliasing is decided. Targets that bypass the rendering layer and inline import strings forfeit this guarantee. The user-visible consequence is that the generated file's import block is always minimal (no unused imports) and conflict-free (no two import declarations bind the same alias to different origins). 
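The rendering-layer responsibilities listed above can be sketched as a small alias table. This is an illustrative sketch only: `Symbol`, `ImportSet`, `Ref`, and `Block` are hypothetical names, the default alias is assumed to be the last path segment of the origin, and the numeric-suffix scheme for resolving collisions is an assumption rather than the behavior of any Masterbelt target.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Symbol is an (origin, name) pair; an empty Origin means the local package.
type Symbol struct {
	Origin string
	Name   string
}

// ImportSet records every external origin reached during rendering.
type ImportSet struct {
	aliasByOrigin map[string]string
	taken         map[string]bool
	origins       []string // first-reference order
}

func NewImportSet() *ImportSet {
	return &ImportSet{aliasByOrigin: map[string]string{}, taken: map[string]bool{}}
}

// Ref returns the rendered reference for sym and registers its origin
// exactly once, picking a unique alias when the default one is taken.
func (s *ImportSet) Ref(sym Symbol) string {
	if sym.Origin == "" {
		return sym.Name
	}
	alias, ok := s.aliasByOrigin[sym.Origin]
	if !ok {
		base := path.Base(sym.Origin)
		alias = base
		for n := 2; s.taken[alias]; n++ {
			alias = fmt.Sprintf("%s%d", base, n)
		}
		s.taken[alias] = true
		s.aliasByOrigin[sym.Origin] = alias
		s.origins = append(s.origins, sym.Origin)
	}
	return alias + "." + sym.Name
}

// Block renders the import declaration block for the referenced origins.
func (s *ImportSet) Block() string {
	var b strings.Builder
	b.WriteString("import (\n")
	for _, origin := range s.origins {
		if alias := s.aliasByOrigin[origin]; alias == path.Base(origin) {
			fmt.Fprintf(&b, "\t%q\n", origin)
		} else {
			fmt.Fprintf(&b, "\t%s %q\n", alias, origin)
		}
	}
	b.WriteString(")\n")
	return b.String()
}

func main() {
	s := NewImportSet()
	fmt.Println(s.Ref(Symbol{Origin: "text/template", Name: "New"}))
	fmt.Println(s.Ref(Symbol{Origin: "html/template", Name: "New"}))
	fmt.Print(s.Block())
}
```

Only origins that pass through `Ref` reach the import block, which is why the block cannot be rendered until every reference has been visited.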
## Effects and Propagation A callable symbol may declare one or more effects. The currently defined effects are: - `cancellable`: the symbol participates in cooperative cancellation. The caller passes a cancellation token; the symbol must observe it. - `failable`: the symbol may report a recoverable failure to its caller. The caller must observe and propagate or handle the failure. - `asyncable`: the symbol completes in the future. The caller observes a deferred completion value rather than a direct result. Effects are an open set; future effects may be added by extending this specification and every target's mapping. Effects propagate from callee to caller. A caller that invokes a symbol carrying any effect inherits that effect unless the caller explicitly absorbs it (for example, by handling a failable result without re-raising). The generation model relies on the IR to record the effect set for every callable so targets can render effect-aware signatures without re-deriving them. A target translates each effect to its language's native idiom. Each target document defines those mappings concretely; a target that cannot represent an effect must reject programs that require it, not generate code that silently drops the obligation. ## Reachability and Public Symbols The default generation policy emits only those symbols reachable from at least one symbol declared with the `pub` modifier in the source program. A non-public symbol that is not referenced, directly or transitively, by any public symbol is dead code at the generation boundary and MUST NOT appear in the generated output. Reachability is computed as follows: - Every public symbol is a root. - A symbol referenced by an already-reachable symbol becomes reachable. - The set of reachable symbols is the transitive closure of these rules. 
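The reachability rules above amount to a graph traversal from the public roots. The following is a minimal sketch under stated assumptions: symbols are plain strings, `refs` is a hypothetical map from each symbol to the symbols it references directly, and `reachable` is an illustrative name rather than a function of the generator.

```go
package main

import "fmt"

// reachable returns the transitive closure of symbols referenced from
// the public roots. refs maps a symbol to its direct references.
func reachable(public []string, refs map[string][]string) map[string]bool {
	seen := map[string]bool{}
	// Copy the roots so the caller's slice is never modified.
	queue := append([]string(nil), public...)
	for len(queue) > 0 {
		sym := queue[0]
		queue = queue[1:]
		if seen[sym] {
			continue
		}
		seen[sym] = true
		queue = append(queue, refs[sym]...)
	}
	return seen
}

func main() {
	refs := map[string][]string{
		"A":      {"helper"},
		"helper": {"inner"},
		"orphan": {"inner"}, // non-public and unreferenced: dropped
	}
	set := reachable([]string{"A"}, refs)
	fmt.Println(set["A"], set["helper"], set["inner"], set["orphan"])
}
```

In this example `orphan` references a reachable symbol but is itself referenced by nothing public, so it stays out of the set.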
The default produces output that mirrors the program's public surface area: public symbols are always emitted, helpers used to express them are emitted under their non-public name, and helpers that nothing public depends on are dropped. This avoids leaking dead identifiers into generated source and avoids unnecessary churn when private helpers are renamed or removed. A target MAY override the default with a documented reason; the default applies otherwise. ## Extending With New Targets Adding a new target kind requires its own specification document under `codegen/` that defines: - The kind identifier. - The supported options. - The mapping from program modules to output files. - The mapping from Masterbelt types and values to target language constructs. - The mapping from each effect to the target language's idiom. - The runtime requirements, if any. Targets share the same generation model and use the same output-file, symbol, and effect abstractions, but their per-language mappings are independent. # Code Generation Runtime Source: https://masterbelt.dev/spec-src/codegen/runtime.md # Code Generation Runtime This document will define runtime requirements for generated code. # Go Code Generation Source: https://masterbelt.dev/spec-src/codegen/golang.md # Go Code Generation This document defines the Go code generation target. ## Kind The target kind identifier is `golang`. ## Options The Go target reads the following options from its `options` mapping in the project configuration: - `package: STRING` is the Go package name used in every generated file. Required. The value is used verbatim as the package clause and must be a valid Go package identifier. - `storage: STRING` selects the master-data backend baked into the generated package. Optional; defaults to `memory`. 
Accepted values are `memory` (the in-memory executor that consumes records supplied through `NewMasterData` or `LoadJSON`) and `sql` (a SQL-backed executor that translates queries into SQL against a host-supplied database connection — see [Master Data](#master-data)). A missing or empty `package` is a configuration error. A `storage` value other than `memory` or `sql` is a configuration error. The Go target does not consume any other options at this stage; unknown options are silently ignored to leave room for future extension without breaking existing configurations. ## File Set For a project whose lowered IR modules are `m1.mst`, `m2.mst`, ..., the Go target produces: - One generated file per Masterbelt module. The file name is the module name with the `.mst` suffix replaced by `.go`. Each file contains the Go declarations corresponding to that module's constants. - A `masterbelt_unions.go` file when any constant has a union type. The file declares the sealed interface and member wrapper types for every union encountered across the project. The file is omitted when no union types are used. - A `masterbelt_masterdata.go` file when the project declares at least one master. The file declares the `MasterData` struct, the `NewMasterData` constructor, and the `With` / `From` context helpers described in [Master Data](#master-data). The file is omitted when the project declares no master. - A `masterbelt_query.go` file when the project declares at least one master. The file declares the `QueryPlan` value type, the `Predicate` / `Ordering` interfaces, the exported concrete predicate / ordering node structs that back the inspectable AST, the generic field-handle types (`OrderedField[R, V]`, `BoolField[R]`), the combinators (`And`, `Or`, `Not`), and the in-memory executor consumed by every generated relation terminal; see [Master Data](#master-data). The file is omitted when the project declares no master. 
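The file-set rules above can be summarised in a short sketch. `fileSet` and its boolean inputs are hypothetical; the sketch only restates the naming rules and assumes the caller already knows whether any constant has a union type and whether the project declares a master. The order of the returned names is arbitrary.

```go
package main

import (
	"fmt"
	"strings"
)

// fileSet lists the file names the Go target would produce for the
// given module names, following the naming rules of this section.
func fileSet(modules []string, hasUnion, hasMaster bool) []string {
	files := make([]string, 0, len(modules)+3)
	for _, m := range modules {
		// One file per module: replace the .mst suffix with .go.
		files = append(files, strings.TrimSuffix(m, ".mst")+".go")
	}
	if hasUnion {
		files = append(files, "masterbelt_unions.go")
	}
	if hasMaster {
		files = append(files, "masterbelt_masterdata.go", "masterbelt_query.go")
	}
	return files
}

func main() {
	fmt.Println(fileSet([]string{"items.mst", "unions.mst"}, true, true))
}
```

A user module named `unions.mst` maps to `unions.go` and therefore does not collide with the generator-managed `masterbelt_unions.go`.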
All files share the same Go package and live under the configured output root. The Go target does not emit a separate runtime file at this stage. ## Type Declarations Each Masterbelt type declaration emits one Go type declaration. Non-product targets use the Go type-alias form so the declared name is a transparent name for the resolved body: ```go type LocalName[Params] = MappedTargetType ``` Product targets use a defined type so a named, addressable receiver type is available for methods and serializer hooks: ```go type LocalName[Params] struct { ... } ``` When the declaration carries type parameters, the emitted Go declaration carries the same parameters as a Go type parameter list (`[T any, ...]`). All parameters are emitted with the `any` constraint at this stage. Visibility follows the source program: a `pub`-declared name is emitted under its public Go identifier, while a non-public name keeps the lowercase form per the visibility rule. At a use site, a generic declaration is rendered with its type arguments: `LocalName[T1, ...]`. A product literal whose declared type is generic is emitted with type arguments on the literal type as well, for example `Container[int]{Value: 1}`. ### Cross-Module References All modules in a project share one Go package. A reference to a symbol declared in another Masterbelt module emits the bare Go identifier of that symbol; no Go import statement is required and no module qualifier is added to the call site. Because every module lives in the same Go package, every top-level identifier across the project must be unique under its Go mapping. The target emits no per-module prefix; collisions are reported as a generation diagnostic. ### Re-exports A `pub { ForeignName as LocalName } from "..."` declaration emits a forwarding `var` in the current module's Go file: ```go var LocalName = ForeignName ``` The forwarding `var` makes the local name visible alongside the foreign one inside the shared Go package. 
Go's type inference keeps the declaration's source type identical to the foreign symbol's type.

### Reserved file name prefix

Files invented by the Go target itself, rather than derived from a Masterbelt source file name, use the reserved `masterbelt_` prefix. This keeps generator-managed files in a separate name space from user-named modules so that a Masterbelt file such as `unions.mst` does not collide with the framework's union catalog. New generator-managed files added to this target MUST use the same prefix.

## Package Clause

Each generated file begins with a `package` clause whose name is the configured `package` option.

## Reachability

The Go target follows the default reachability policy defined in `codegen/model`: only constants reachable from at least one `pub`-declared constant are emitted. Non-public constants that are not referenced by any reachable constant are dropped from the output, and their identifiers appear in no generated file. Identifier references resolve against the surviving set, so a `pub const A = helper` declaration keeps `helper` even though `helper` itself is not public.

## Visibility

A Masterbelt const item carries a public flag from its `pub` modifier. Each Masterbelt identifier is mapped to a Go identifier whose first letter is upper case for public constants and lower case for non-public constants. The remainder of the identifier is preserved as written.

A constant whose Masterbelt identifier starts with a non-letter character (such as `_`) is not currently supported and is a generation error.

## Constants and Variables

Each Masterbelt const item maps to one Go declaration:

- A `const` declaration is used when the constant's type is `bool`, `string`, or any built-in numeric type (`int`, `int8`, ..., `uint64`) and the lowered expression is a corresponding literal.
- A `var` declaration is used for every other case: `null`-typed constants, list and map literals, union-typed constants, and references to other constants.
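As an illustration of the `const` / `var` split and the visibility rule, the sketch below shows what the output for four constants might look like. The Masterbelt source lines in the comments and the exact spelling of each declaration (for example, whether the type is written explicitly) are assumptions made for this example, not verified generator output.

```go
package main

import "fmt"

// Source (assumed): pub const maxLevel = 99
// Public, int-typed, literal right-hand side: a Go const.
const MaxLevel int = 99

// Source (assumed): const greeting = "hi"
// Non-public, kept only because banner references it.
const greeting string = "hi"

// Source (assumed): pub const banner = greeting
// A reference to another constant: a Go var.
var Banner string = greeting

// Source (assumed): pub const tags = ["a", "b"]
// A list literal: a Go var.
var Tags []string = []string{"a", "b"}

func main() {
	fmt.Println(MaxLevel, Banner, Tags)
}
```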
The declared Go type follows the type mapping table below. The right-hand side is the lowered expression rendered as a Go expression.

Doc comment lines from the source are emitted as `// ` lines immediately before the declaration in source order, with one space inserted between `//` and the original text when the original text has no leading space.

## Type Mapping

| Masterbelt | Go |
| --- | --- |
| `null` | `any` (the value `null` lowers to the Go untyped `nil`) |
| `bool` | `bool` |
| `int` / `uint` | `int` / `uint` (Go's natural integer width) |
| `int8` / `uint8` | `int8` / `uint8` |
| `int16` / `uint16` | `int16` / `uint16` |
| `int32` / `uint32` | `int32` / `uint32` |
| `int64` / `uint64` | `int64` / `uint64` |
| `string` | `string` |
| `list` | `[]T'` where `T'` is the Go mapping of `T` |
| `map` | `map[K']V'` where `K'` and `V'` are the Go mappings |
| `T1 \| T2 \| ...` | Sealed interface generated into `masterbelt_unions.go` (see Unions below) |
| `{f: T, ...}` | Defined Go struct type when reached through an alias declaration; see Product Types below |
| `fn(p: T, ...): R` | Go function type `func(p T, ...) R` (see Function Types below) |
| `enum Name { ... }` | Defined integer type `type Name storage` plus one `const Variant Name = value` per variant; see Enums below |

Type declarations are resolved before mapping; declared types do not appear in generated code.

## Literal Mapping

- `null` lowers to the Go untyped `nil`.
- `true` and `false` lower to the Go literals `true` and `false`.
- An integer literal lowers to its decoded value rendered in base 10.
- A string literal lowers to a Go double-quoted string with Go escape sequences.
- A list literal `[e1, ..., eN]` of type `list` lowers to `[]T'{e1', ..., eN'}`.
- A map literal of type `map` lowers to `map[K']V'{k1: v1, ...}` with entries emitted in lowering order after last-wins deduplication.
- An identifier reference lowers to the referent constant's mapped Go identifier. - A product literal of type `Item` lowers to a Go struct literal `Item{Name: ..., ...}`. Field initializers preserve the source order of the literal; field name keys are upper-cased so they are reachable from any package. ## Master Data A `master Foo { record { ... } static { ... } }` declaration follows the runtime model defined in [../masterdata/schema.md](../masterdata/schema.md#runtime-model). The Go target emits the following declarations per master: - `Record` — the Go struct that backs one row. Field naming, modifiers, and constructor / getter generation follow the regular [Product Types](#product-types) rules; the only difference is the type name suffix. - `Relation` — a Go defined struct type that exposes the master's query surface. The relation is a data-less value type carrying only a `QueryPlan[Record]` field; it does not own records. Records are reached through `MasterData` at terminal time via the per-master accessor `Executor(*MasterData) Executor[Record]` emitted in the same file. The accessor wraps the records the host supplied to `NewMasterData` (or `LoadJSON`) as a `memoryExecutor` and returns a never-nil `Executor` so terminals can call `Execute` / `FindByPK` unconditionally. - `` — a package-level `var` of type `Relation` initialised with an empty query plan. Authoring a query starts from this value (`Items.Where(...).ToSlice(ctx)`), never from `data.Items`. A `MasterData` accessor for the master is intentionally absent from the public API: the data is reached implicitly through `From(ctx)`. - Chainable stages and terminals on `Relation`: copy-on-write `Where`, `OrderBy`, `ThenBy`, `Skip`, `Take`, plus the terminals `ToSlice`, `Iter`, `FindBy`, `FirstOrDefault`, `Count`, `Any`. The full method shapes appear under [Query API](#query-api). Both types live as siblings at file scope. 
The Masterbelt source identifier `` no longer requires a `data.` accessor: the package-level relation value carries the same name. Nested masters follow the same naming scheme on the flattened identifier — `master User { master Friendships { ... } }` emits `UserFriendshipsRecord`, `UserFriendshipsRelation`, and a package-level `UserFriendships` value as siblings of the parent. See [Nested Masters](../masterdata/schema.md#nested-masters). ### MasterData Entry Every project that declares at least one master emits one generator-managed file `masterbelt_masterdata.go` carrying the project-wide dataset entry. The file declares: - `MasterData` — a struct with one unexported records slice per master in the project (`Records []Record`). MasterData owns the records, not the relations. The per-master accessor (`Executor(*MasterData) Executor[Record]`) emitted next to each relation reads this field and wraps it as a `memoryExecutor` so the terminal call site never sees the storage choice; a future `storage: sql` configuration will change the accessor body to return a SQL-backed executor while keeping the call site stable. - `NewMasterData( []Record, []Record, ...) *MasterData` — a positional constructor that takes one record slice per master in master-declaration order and returns a fully wired `*MasterData`. - `With(ctx context.Context, data *MasterData) context.Context` — attaches a `*MasterData` to a context so generated terminals reached through that context can resolve the active records. - `From(ctx context.Context) *MasterData` — retrieves the `*MasterData` previously attached with `With`, returning `nil` when no value was attached. The Go target never writes `MasterData`, `With`, `From`, or `NewMasterData` in code emitted from a Masterbelt source program: those identifiers exist for the host application to construct and inject the dataset. Generated terminals consult `From(ctx)` to resolve the active records; the public chainable API never names `MasterData` directly. 
When `storage: sql` is configured, the layout above changes:

- `MasterData` carries a single `db SQLDB` field shared by every per-master executor; the per-master records slices are not emitted.
- The positional `NewMasterData(...)` constructor is replaced by `NewSQLMasterData(ctx context.Context, db SQLDB) (*MasterData, error)` so the host application supplies a connection rather than record slices. The constructor stores the supplied `SQLDB` and returns; per-master executors fetch rows on demand through the connection.
- `With` and `From` are unchanged.
- `LoadJSON` is not emitted under `storage: sql`. Hosts that want to seed a SQL backend from the JSON exporter should round-trip the data through the SQLite exporter (see [../masterdata/export-sqlite.md](../masterdata/export-sqlite.md)) and open the resulting database.

The `SQLDB` interface, the matching `SQLRows` / `SQLRow` row interfaces, and the `translatePlan` / `translatePKLookup` helpers live in `masterbelt_query.go`. A host built on `database/sql` wraps `*sql.DB` in a small adapter that satisfies `SQLDB`; alternative SQLite bindings can do the same without dragging a particular driver into the generated package. The adapter is a handful of lines because `*sql.DB.QueryContext` returns a `*sql.Rows` rather than the generated `SQLRows`:

```go
type sqlDBAdapter struct{ db *sql.DB }

func (a sqlDBAdapter) QueryContext(ctx context.Context, query string, args ...any) (masters.SQLRows, error) {
	rows, err := a.db.QueryContext(ctx, query, args...)
	if err != nil {
		return nil, err
	}
	return rows, nil // *sql.Rows already satisfies SQLRows
}

func (a sqlDBAdapter) QueryRowContext(ctx context.Context, query string, args ...any) masters.SQLRow {
	return a.db.QueryRowContext(ctx, query, args...) // *sql.Row already satisfies SQLRow
}

// db, _ := sql.Open("sqlite", "masterdata.db") // host registers its own driver
// data, _ := masters.NewSQLMasterData(ctx, sqlDBAdapter{db: db})
```

`*sql.Rows` and `*sql.Row` already satisfy `SQLRows` / `SQLRow` structurally, so the only wrapping needed is the `QueryContext` return type. The host registers a SQLite driver (`modernc.org/sqlite`, `github.com/mattn/go-sqlite3`, ...) and owns the `*sql.DB` open/close lifetime; the generated code never imports a driver.

Under `storage: memory` the same file also declares `LoadJSON(data []byte) (*MasterData, error)`, a helper that unmarshals the JSON document produced by the JSON exporter (see [../masterdata/export-json.md](../masterdata/export-json.md)) into a fresh `*MasterData`. The function uses `encoding/json` only and is independent of any backend library:

```go
func LoadJSON(data []byte) (*MasterData, error) {
	var raw struct {
		Items           []ItemsRecord           `json:"items"`
		UserFriendships []UserFriendshipsRecord `json:"userFriendships"`
		// ... one per master in declaration order
	}
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, err
	}
	return NewMasterData(raw.Items, raw.UserFriendships /* ... */), nil
}
```

Each generated `Record` struct carries a `json` struct tag on every field, set to the field's surface name, so the inner record objects round-trip with the surface field names as JSON keys. The surface name is the master's source-level field identifier verbatim (`id`, `name`, `userId`) without any case transformation. A `ref` field expands to the underlying primary-key fields under the surrounding field's name joined with `_` (`field_pk1`, `field_pk2`, ...); the JSON tag on each expanded leaf carries the joined source name. Other product types not declared inside a `master` block do not receive JSON tags.
### Static Body Rewrites A user-declared static method's body is rewritten so the planner-side master references resolve against the package-level relation values and the threaded dataset: - `Master.toList()` inside the owning master's own static body lowers to `r.ToSlice(ctx)` against the receiver of the surrounding relation method. The receiver is the relation value the static method was invoked on, so a caller that chains stages onto `r` before invoking the static observes those stages. - `Master.X` (any user-declared static constant or method) inside the same owning master's body lowers to `r.X(ctx)` for a constant or `r.X(ctx, ...)` for a method. - `OtherMaster.toList()` (a cross-master reference) lowers to `.ToSlice(ctx)` against the package-level relation value. - `OtherMaster.X` (any other cross-master reference) lowers to `.X(ctx, ...)` against the package-level relation value. The receiver of the surrounding relation method is named `r` so a chained owner-self reference never collides with the `self` receiver used by record-attached methods. ### Top-Level Dataset Threading A top-level function or product-type method that transitively reaches any master static member (constant or method, including the built-in `toList()`) implicitly acquires the dataset and threads it through every call that needs it. The Go target piggy-backs on the existing cancellable inheritance machinery: a function that reaches a master is treated as effectively cancellable, so it receives the same `ctx context.Context` first parameter the cancellable transform adds and forwards it through call sites that also became effectively cancellable. Effects already in scope (`failable`, `asyncable`) compose with the synthesized cancellable transform without further interaction. The Masterbelt source program never declares `cancellable` for the purpose of master access; the codegen-side inference is invisible at the surface. 
### Query API Every master emits the chainable surface directly on `Relation`. Predicates and orderings are constructed by methods on a package-level field handle; there is no separate `Query` type in the public API. See [../masterdata/query.md](../masterdata/query.md) for the cross-target contract and [../masterdata/schema.md](../masterdata/schema.md#query-api) for how it ties into the master-data schema. The per-master declarations are: - `Fields` — a package-level variable that holds one typed field handle per supported record field. The variable is exported regardless of the master's visibility because the field handles are the call-site authoring surface. The backing struct type is unexported (`FieldsStruct`); users reach the field handles through the variable (`ItemsFields.Category.Eq("weapon")`) and never name the struct directly. Fields whose Go type is not one of the supported primitives (numeric, string, bool) are omitted from the field builder silently; the user-facing query API still emits for the rest of the surface. - Stage methods on `Relation` (value receivers): `Where(predicate Predicate[Record]) Relation`, `OrderBy(ordering Ordering[Record]) Relation`, `ThenBy(ordering Ordering[Record]) Relation`, `Skip(n int) Relation`, `Take(n int) Relation`. Each method returns a freshly allocated relation whose plan field extends the receiver's plan; the receiver is never mutated, so a base relation can be shared across independent chains. - Terminal methods on `Relation`: `ToSlice(ctx context.Context) ([]Record, error)`, `Iter(ctx context.Context) iter.Seq2[Record, error]`, `FindBy(ctx context.Context, k1 T1, ...) (Record, bool, error)`, `FirstOrDefault(ctx context.Context) (Record, bool, error)`, `Count(ctx context.Context) (int, error)`, `Any(ctx context.Context) (bool, error)`. Every terminal carries the same three effects as `toList()` / `FindBy`; the Go target renders the `cancellable` slot (`ctx`) and the `failable` slot (trailing `error`) explicitly. 
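The copy-on-write contract of the stage methods can be sketched as follows. This is a simplified model, not the generated code: the `Match` method on `Predicate`, the slice-based `QueryPlan`, and the `categoryEq` / `levelGe` predicate types are all assumptions introduced for the example.

```go
package main

import "fmt"

type ItemsRecord struct {
	Category string
	Level    int
}

// Predicate is modelled with a single hypothetical Match method.
type Predicate[R any] interface {
	Match(record R) bool
}

// QueryPlan holds only predicates in this sketch.
type QueryPlan[R any] struct {
	predicates []Predicate[R]
}

// ItemsRelation is a data-less value carrying just a plan.
type ItemsRelation struct {
	plan QueryPlan[ItemsRecord]
}

// Where copies the receiver's predicates into a new slice before
// appending, so the receiver's plan is never mutated or aliased.
func (r ItemsRelation) Where(p Predicate[ItemsRecord]) ItemsRelation {
	old := r.plan.predicates
	next := make([]Predicate[ItemsRecord], len(old), len(old)+1)
	copy(next, old)
	return ItemsRelation{plan: QueryPlan[ItemsRecord]{predicates: append(next, p)}}
}

type categoryEq string

func (c categoryEq) Match(r ItemsRecord) bool { return r.Category == string(c) }

type levelGe int

func (l levelGe) Match(r ItemsRecord) bool { return r.Level >= int(l) }

// Items is the package-level relation with an empty plan.
var Items ItemsRelation

func main() {
	weapons := Items.Where(categoryEq("weapon"))
	strong := weapons.Where(levelGe(10))
	armor := Items.Where(categoryEq("armor"))
	fmt.Println(len(Items.plan.predicates), len(weapons.plan.predicates),
		len(strong.plan.predicates), len(armor.plan.predicates))
}
```

Copying before appending is what lets `weapons` serve as the base of further chains without those chains observing each other's stages.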
`Iter` returns `iter.Seq2[Record, error]` from the standard `iter` package so user code consumes the relation with `for record, err := range Items.Where(...).Iter(ctx) { ... }`. The iterator consumes the same plan as `ToSlice`; backends that can stream a particular operation are free to do so, but the public iterator contract is stable. `FindBy` honours the relation's `Where` predicates: a primary-key match that fails the filter chain returns the zero record with ok=false. `OrderBy`, `ThenBy`, `Skip`, and `Take` do not affect `FindBy`. The `QueryPlan[R]` value type, the `Predicate[R]` / `Ordering[R]` interfaces, the exported concrete node structs (`EqPredicate`, `LtPredicate`, `BetweenPredicate`, `AndPredicate`, `AscOrdering`, ...), the generic field-handle types `OrderedField[R any, V cmp.Ordered]` and `BoolField[R any]`, the combinators `And(...)`, `Or(...)`, `Not(...)`, the `Executor[R]` seam consumed by every terminal, the `memoryExecutor[R]` in-memory implementation, and the iterator helpers `iterMemoryPlan` / `iterMemoryError` / `iterMaterialised` all live in the generator-managed `masterbelt_query.go` file. Backends can inspect a relation's plan structurally — every concrete node carries its field reference and operand values — without invoking the closures the in-memory evaluator uses. ```go weapons, err := masterdata.Items. Where(masterdata.And( masterdata.ItemsFields.Category.Eq("weapon"), masterdata.ItemsFields.Level.Ge(10), )). OrderBy(masterdata.ItemsFields.SortOrder.Asc()). Take(10). ToSlice(ctx) ``` ### Scope Methods Each `pub` [scope](../masterdata/schema.md#scope-section) on a master emits an exported method on `Relation`; a non-`pub` scope is internal to Masterbelt source and is not emitted. The method name is the source scope name in PascalCase (`genderedAdult` → `GenderedAdult`). 
Its parameters are the scope's declared parameters mapped through the regular [Type Mapping](#type-mapping); it takes no `ctx` and no `error` because a scope is effect-free. ```go func (r Relation) () Relation ``` The method returns a `Relation`: it extends the receiver's plan with the scope body's stages (with chained scopes inlined) and returns a fresh relation, exactly like the built-in stage methods, so a scope composes with `Where` / `OrderBy` and with other scopes (`masterdata.Records.Adult().Gendered(1)`). The method is backend-independent — it builds a plan and is identical under `storage: memory` and `storage: sql`. A call to a non-`pub` sibling scope is inlined into the method body (with its parameters substituted), because only `pub` scopes are emitted as methods; a call to a `pub` sibling scope is a method call. An `indexed` scope adds no Go surface beyond what its `pub` flag implies; it only influences SQLite index generation. ### Select Projections Each `select Name { ... }` section on a master ([../masterdata/schema.md](../masterdata/schema.md#select-section)) emits a parallel set of declarations alongside the source relation: - `Record` — a Go defined struct type carrying the projected fields. Field order matches the order written in the select body; field types are copied from the master's record by name. - `Fields` — the package-level field builder for the projected record, with the same shape as `Fields` but parametrised on the projected record. - `Relation` — the projected relation type. It carries the source relation by value plus its own `QueryPlan[Record]`. Its stage and terminal methods mirror the source relation's surface, parametrised on the projected record type. Stage methods accept `Predicate[Record]` / `Ordering[Record]`; terminals carry the same `(ctx context.Context, ...) (..., error)` shape as the source relation's terminals. 
- `Select() Relation` — a method on `Relation` that returns a fresh projected relation capturing the receiver's source-side plan. Terminals on the projected relation first apply the source-side plan against the master's record slice, then project each surviving record into a `Record` by copying the named fields, and finally apply the projected plan to the projected slice. Predicates and orderings added through the projected relation are typed on the projected record, so authoring `Where(ItemsSummaryFields.Name.Eq("alpha"))` against an `ItemsSummaryRelation` is a compile-time match; passing a source-record predicate to the projected relation is a compile-time error. Projected relations do not expose `FindBy`: the projected record does not have a primary-key concept. ```go summaries, err := Items. Where(ItemsFields.Count.Ge(10)). SelectSummary(). OrderBy(ItemsSummaryFields.Name.Asc()). ToSlice(ctx) ``` ### Join Operator Each `ref` field on a master's record ([../masterdata/relations.md](../masterdata/relations.md)) emits a parallel set of declarations alongside the source relation: - `JoinPair` — a Go defined struct type with two exported fields, `Left Record` and `Right Record`. The pair is the join's record-side aggregate. - `JoinFields` — a package-level field builder for the pair record. The variable's struct exposes nested `Left` and `Right` sub-builders, each carrying one typed field handle per supported primitive on the corresponding side (including the expanded ref-field columns on the left). Each handle's accessor reads `pair.Left.` or `pair.Right.`. - `JoinRelation` — the joined relation type. It carries the source relation by value, the right relation supplied at the call site, and its own `QueryPlan[JoinPair]`. Its stage and terminal methods mirror the source relation's surface, parametrised on the pair record. Stage methods accept `Predicate[JoinPair]` / `Ordering[JoinPair]`; terminals carry the same `(ctx context.Context, ...) 
(..., error)` shape as the source relation's terminals.
- `Join(right Relation) JoinRelation`: a method on `Relation` that returns a fresh joined relation capturing the receiver as the source side, the supplied relation as the right side, and a fresh pair-level plan.

Terminals on the joined relation first call `ToSlice(ctx)` on the source relation (so source-side state applies first), then iterate the surviving left records and call `right.FindBy(ctx, leftRecord._, ...)` for each, emitting a `JoinPair{Left: ..., Right: ...}` on a successful match and dropping the row on a non-match (INNER JOIN; `LEFT` / `RIGHT` / `FULL OUTER` deferred). Pair-level state (predicates, orderings, skip, take) then applies to the pair slice before the terminal returns. Joined relations do not expose `FindBy`: the pair record does not have a primary-key concept.

```go
pairs, err := B.
	JoinARecord(A).
	Where(BJoinARecordFields.Right.Name.Eq("alpha")).
	OrderBy(BJoinARecordFields.Left.Id.Asc()).
	ToSlice(ctx)
```

## Product Types

A Masterbelt type declaration whose target is a product type emits a defined Go struct type rather than the type-alias form used for other targets:

```go
type Item struct {
	Name  string
	Count int
}
```

The field order matches the field order written in source. A defined type is used so methods and serializer hooks can be attached to the struct receiver idiomatically.

Each field's Go visibility follows its Masterbelt modifier. A field with no modifier (or with the explicit `writable` modifier) is emitted as an exported field with a Title-cased name; a `readonly` field is emitted as an unexported field with a lower-cased name. Mutable fields stay directly assignable; readonly fields can only be set at construction.

When any field of a product declaration carries `readonly`, the target also emits:

- A `NewName[P](field1 T1, ...) Name[P]` constructor that takes every field as a positional parameter in source order and assigns it to the corresponding struct field.
The constructor is the only way external callers can populate the unexported readonly fields.
- A value-receiver getter method per readonly field. The getter's name is the Title-cased form of the field's Masterbelt identifier and its return type is the field's mapped Go type.

At a use site, a product literal whose declared type carries any readonly field lowers to a `NewName(...)` constructor call with arguments rearranged into source order. Without any readonly field the literal continues to lower to the existing struct literal form `Name{Field: ...}`.

A `readonly` field whose generated getter name would collide with another field's exported Go name (for example, a readonly field `name` whose getter `Name()` would shadow a mutable field `Name`) is a generation error reported with `masterbelt.codegen.golang.field_modifier_collision`.

A product literal `Item { name: ..., count: ... }` lowers to a Go struct literal whose type is the declared type's name and whose field initializers use the Title-cased field names.

## Function Types

A Masterbelt function type lowers to a Go function type `func(name T, ...) R`. A type declaration whose body is a function type emits a Go type-alias declaration following the rule in [Type Declarations](#type-declarations):

```go
type BinaryOp = func(left int, right int) int
type Mapper[T any, U any] = func(value T) U
type Summer = func(initial int, values ...int) int
```

Parameter names are preserved from source. A variadic parameter prefixed with `*` in source lowers to a Go variadic parameter prefixed with `...`; the element type is the parameter's declared type. A variadic parameter is permitted only as the last parameter (the language rule defined in [language/types.md](../language/types.md) is enforced before generation).

Effects defined in [Effects](#effects) shape the rendered signature:

- `cancellable` inserts a `context.Context` parameter at the beginning of the parameter list, before any declared parameters.
- `failable` appends an `error` result, yielding a `(R, error)` return tuple where `R` is the declared return type.
- `asyncable` does not change the rendered signature; see [Effects](#effects) for the rationale.

A function type that combines multiple effects applies every transformation listed above.

## Enums

A Masterbelt enum lowers to a Go defined integer type plus one `const` declaration per variant. The defined-type form (no `=`) preserves the enum's nominal identity in Go: the enum type is not interchangeable with its storage type without an explicit `Storage(variant)` conversion at the call site.

```go
type Status int8

const Active Status = 0
const Inactive Status = 1
```

Variant identifiers are emitted at package scope under their source names. A variant whose value was explicit in source uses that value; a variant without an explicit value uses 0 for the first variant and the previous variant's value plus one thereafter (the language rule defined in [language/types.md](../language/types.md)).

A member access expression `Enum.Variant` lowers to the bare Go identifier for the variant. The surface-level dot is not preserved because each variant is already accessible as a package-scope identifier; the type system carries the enum type through the call site.

## Functions and Methods

A top-level function declaration emits a Go `func Name(params) Result { ... }` at package scope. The function name is rendered with the Go convention's capitalization rule: public functions become exported PascalCase identifiers; non-public functions stay lowerCamelCase. Effect modifiers reshape the signature per the [Effects](#effects) section.

A method declared inside a product type emits a Go receiver function `func (self Owner) Method(params) Result { ... }` against the owning struct type. The receiver is named `self` in the emitted Go so a Masterbelt method body that references the implicit `self` keyword maps one-to-one onto the Go receiver.
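
The function and method rules above can be illustrated with a small hand-written sketch. The names `Item`, `Label`, `Describe`, and `formatCount` are hypothetical and chosen only for illustration; this is not output copied from the generator.

```go
package main

import "fmt"

// Item is a hypothetical product type with two mutable fields.
type Item struct {
	Name  string
	Count int
}

// Label stands in for a method declared inside the product type. The
// receiver is named self so a body that uses the Masterbelt `self`
// keyword maps directly onto the Go receiver.
func (self Item) Label() string {
	return self.Name + " x" + formatCount(self.Count)
}

// formatCount stands in for a non-public top-level function, which keeps
// a lowerCamelCase (unexported) name.
func formatCount(n int) string {
	return fmt.Sprint(n)
}

// Describe stands in for a public top-level function, which becomes an
// exported PascalCase identifier.
func Describe(item Item) string {
	return "item: " + item.Label()
}

func main() {
	fmt.Println(Describe(Item{Name: "alpha", Count: 3}))
}
```

The exported and unexported spellings are the only visibility mechanism Go offers at package scope, which is why the capitalization of the emitted name carries the `pub` information.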
Methods that share a name (overloaded methods) are disambiguated by a 1-based numeric suffix: the first overload keeps the original name, the next is `Method1`, then `Method2`, and so on. Call sites carry the same suffix so the dispatch is decided at the call site by the checker's overload resolution. A call expression `target(args)` emits the Go call form `target(args)`. A method call `value.method(args)` emits `value.Method(args)` (with the overload suffix when applicable). A function literal `fn(params): R { ... }` emits a Go function literal `func(params) R { ... }` inline at its surrounding expression position. A return statement emits the Go `return` statement, optionally followed by the value expression. ## For Statements A Masterbelt for statement lowers to a Go `for` statement. The IR subject shape selects the form: - `list` subject — `for _, name := range list { ... }`. A skipped value binding (`_` in source) is rendered as Go's `_`. - `map` subject — `for k, v := range m { ... }`. Either binding renders as `_` when skipped. - `range(start, end)` subject — `for i := start; i < end; i++ { ... }` with `i` named after the source binding. A `_` binding still requires a loop variable; the Go target synthesizes a local named `__mbI` whose only purpose is to advance the counter. A `break` statement lowers to the Go `break` keyword; a `continue` statement lowers to the Go `continue` keyword. The Go compiler enforces the rule that both must appear inside a `for` body; the Masterbelt checker has already established that, so the generated code passes the check trivially. The lowered subject expression is evaluated exactly once (Go's `range` and counted-for forms both have a single evaluation point), matching the language's once-only semantics. 
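
As a rough illustration of the three loop forms listed above, the following hand-written sketch shows what each shape looks like in plain Go. The function names are hypothetical; only the `__mbI` counter name comes from the text.

```go
package main

import "fmt"

// sumList mirrors the `list` subject form: the index is skipped with `_`.
func sumList(values []int) int {
	total := 0
	for _, v := range values {
		total += v
	}
	return total
}

// sumMap mirrors the `map` subject form with both bindings in use.
func sumMap(m map[string]int) int {
	total := 0
	for k, v := range m {
		total += len(k) * v
	}
	return total
}

// repeat mirrors `range(start, end)` with a skipped `_` binding: the loop
// still needs a counter, so a synthesized local named __mbI advances it.
func repeat(start, end int) int {
	n := 0
	for __mbI := start; __mbI < end; __mbI++ {
		n++
	}
	return n
}

func main() {
	fmt.Println(sumList([]int{1, 2, 3}))
	fmt.Println(sumMap(map[string]int{"ab": 2, "c": 3}))
	fmt.Println(repeat(2, 5))
}
```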
Iteration over a `Master.toList()` subject lowers through the [Master Data](#master-data) rewrite: `Items.ToSlice(ctx)` against the package-level relation for a cross-master subject and `r.ToSlice(ctx)` when the iteration appears inside the owning master's own static body. Because `toList()` is `failable`, the surrounding callable inherits the same effect (the Go signature gains a trailing `error` result) and the for-loop subject is lifted into a `__mbRecords, __mbErr := ` short variable declaration plus the standard failable propagation guard before the `for _, x := range __mbRecords { ... }` loop — same shape the [Failable Handling](#failable-handling) section defines for `let x = `. ## Master Static Members A master's static section ([../masterdata/schema.md](../masterdata/schema.md#static-section)) emits its members on the master's `Relation` type (see [Master Data](#master-data)): - A `static const Name: T = value` emits as a value-receiver method `func (r *Relation) Name() T { return value }`. The constant body has the same dataset-rewriting rules as a method body so a constant initializer that calls another master's static member resolves through `From(ctx).Other.X(ctx)`. - A `static fn name(params): R { body }` emits as a value-receiver method `func (r *Relation) Name(ctx context.Context, params) R { body }`. The first parameter is always `ctx context.Context`; the rest of the parameter list and the return type follow the regular function-type mapping including the effect-driven signature transforms (`failable`, `asyncable`) described in [Effects](#effects). A `pub` static member emits a Title-cased Go method name; a non-public member emits a lower-cased name. Visibility is independent from the master's own `pub` modifier. 
The source-level access `Items.X` inside any callable's body lowers per the [Master Data](#master-data) rewrite rules: the owner-self case resolves to `r.X`, and the cross-master case resolves to `From(ctx).Other.X` (with `(ctx, args)` when called).

## Match Statements

A Masterbelt match statement lowers to one of three Go forms depending on the subject's resolved type:

- **Union subject** lowers to a Go type switch on the lowered subject. Each arm becomes one `case` whose case type is the Go form of the arm's pattern type (the wrapper type for a non-null union member, or `nil` for the union's null member). Bindings introduced by the pattern are emitted as local variables in the case body; a type pattern binding extracts the wrapper's `Value` field for non-null members, and uses the subject directly for the `nil` case. A product pattern emits field accesses on the matched value to populate its short-form field bindings. The narrowing of the original subject identifier (when the subject is a plain identifier) emits an additional `name := ` assignment at the top of the case body using the same narrowed expression. When the match is statically exhaustive no `default` case is emitted; a wildcard arm becomes a `default` case.
- **Enum or literal subject** lowers to a Go value switch on the lowered subject. Each arm becomes one `case` whose case expression list is the arm's enum-pattern or literal-pattern alternatives. A wildcard arm becomes the `default` case.
- **Subject with mixed arm kinds** (for example a union of primitives where some arms are literals and some are type patterns) lowers to an `if`/`else if` chain. Each arm's condition combines a type-assertion check (for type and product patterns) with literal equality checks; bindings are introduced inside the `if` body using the same extraction rules described above.

Guards lower as an `if` check wrapping the arm body.
When a guarded arm's guard evaluates to `false`, control falls through the type switch (Go's `case` does not fall through by default, so the synthesized fallthrough is achieved by structuring guarded arms as `if`/`else if` chains nested inside the case). The Go target emits no `panic` or `default` arm to absorb unhandled cases when the checker has proven exhaustiveness. A wildcard arm explicitly written in source lowers to `default:` (for switch forms) or a trailing `else { ... }` (for the if-chain form). A match statement is otherwise emitted at its source position inside the surrounding Go function body and shares the function's local scope. The lowered subject is evaluated exactly once at the head of the switch (or assigned to a fresh local at the top of the if-chain). ## Operator Expressions The unary and binary operator expressions defined in [language/syntax.md](../language/syntax.md) reach the Go target as method calls on the operand's type (see [language/builtins.md](../language/builtins.md)). When the receiver is a built-in primitive type (or an alias whose target resolves to a primitive), the Go target emits the call as the corresponding native Go operator instead of a method invocation: - Numeric operands emit `+ - * / % == != < <= > >= & | ^ << >>` directly. - `bool` operands emit `&& || == !=`; the bitwise-shaped methods `and`/`or`/`xor` on `bool` lower to `&& || !=` respectively. Unary `not` emits `!`. - `string` operands emit `+` for `add` and `== != < <= > >=` for the comparisons. - Unary `plus`, `minus`, and `not` emit `+`, `-`, and `!`. A user product type that declares one of the operator method names continues to lower through the regular method call path: the call site emits `receiver.Method(args)` (with the overload suffix when applicable). The native-operator rewrite applies only when the receiver's resolved type is a primitive. 
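
The split between native operators and regular method calls can be sketched by hand as follows. `Vec2` and `primitives` are hypothetical names used only for this illustration; the operator choices follow the list above.

```go
package main

import "fmt"

// Vec2 stands in for a user product type that declares an `add` operator
// method; its call sites keep the regular method-call form.
type Vec2 struct {
	X int
	Y int
}

// Add is the method a call site targets as receiver.Add(other).
func (self Vec2) Add(other Vec2) Vec2 {
	return Vec2{X: self.X + other.X, Y: self.Y + other.Y}
}

// primitives shows the native-operator form used for primitive receivers.
func primitives(a, b int, p, q bool, s, t string) (int, bool, string) {
	sum := a + b    // numeric add lowers to +
	xor := p != q   // bool xor lowers to !=
	joined := s + t // string add lowers to +
	return sum, xor, joined
}

func main() {
	sum, xor, joined := primitives(2, 3, true, false, "ab", "cd")
	fmt.Println(sum, xor, joined)
	fmt.Println(Vec2{X: 1, Y: 2}.Add(Vec2{X: 3, Y: 4}))
}
```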
### Built-in Field Accesses The fields defined in [language/builtins.md](../language/builtins.md) on built-in primitive and generic types lower to native Go expressions: - `string.length` lowers to `utf8.RuneCountInString(receiver)` (with the `unicode/utf8` import added automatically). The result is the number of Unicode codepoints in the string, matching the spec semantics for `string.length`. - `list.size` and `map.size` lower to `len(receiver)`. The Go `len` built-in returns the slice length and the map entry count respectively, both of which agree with the spec's element / entry count semantics. No runtime helper is emitted for these fields; the lowering inlines the call at the access site. ### Built-in Generic Operator Methods `list.add(other)` and `map.add(other)` defined in [language/builtins.md](../language/builtins.md) lower to a call into a runtime helper that the Go target writes alongside the generated module files. The helper file is named `masterbelt_runtime.go` (the same reserved `masterbelt_` prefix used by the union file) and is emitted only when at least one generated module references a helper from it. The helper file declares the package functions used at call sites: - `masterbeltListAdd[T any](a, b []T) []T` returns a fresh slice containing the elements of `a` followed by the elements of `b`. - `masterbeltMapAdd[K comparable, V any](a, b map[K]V) map[K]V` returns a fresh map containing every entry of `a` and every entry of `b`, with keys present in both taking the value from `b`. Call sites for `list.add` emit `masterbeltListAdd(a, b)`; call sites for `map.add` emit `masterbeltMapAdd(a, b)`. The helpers live in the same Go package as the rest of the generated code, so the call site uses the bare identifier without any import qualifier. ## Unions A union type lowers to a Go sealed interface. For a canonical union `T1 | T2 | ... | TN`: - The interface type is named by concatenating the title-cased member type names with `Or`. 
For primitive members, the title-cased names are `Null`, `Bool`, `Int`, `String`. The order matches the canonical order recorded in the IR, which is the lexicographic order of each member's textual type spelling. - The interface declares one unexported method whose name is `is` so only generated types satisfy it. - For each non-null member type, the file declares a Go wrapper type named `` with a single exported `Value` field of the Go type for that member, and implements `is` on the wrapper. - The `null` member does not get a wrapper type. The Go untyped `nil` is the zero value of any interface and serves as the null member's representation directly; an interface variable holding `nil` compares equal to `nil`. A value of a union type lowers as a wrapper struct literal `{Value: }` for non-null members, and as the Go literal `nil` for null. At this stage, only unions of primitive types are supported. A union containing a generic member is a generation error. ## Effects Each effect defined in `codegen/model` maps to Go as follows: - `cancellable` adds a `context.Context` parameter as the first parameter of the callable. - `failable` adds a trailing `error` return value to the callable. - `asyncable` has no idiomatic Go signature transform at this stage; a function type carrying `asyncable` is rendered as if the effect were absent. The lack of mapping is a known limitation: Go has no language-level future/promise type, and the target reserves an explicit choice (such as introducing a runtime channel helper) for a later design pass. Every effect is inferred along the call graph: the Go target computes an effective effect set per callable by walking from the declared effects and propagating through every transitive call site to a fixed point. A function whose effective set differs from its declared set still renders with the inferred shape. 
A callable that carries multiple effective effects combines all applicable transformations on its signature; the parameter ordering is `context.Context` first, then declared parameters; the return ordering is declared results followed by `error`. A caller invokes the resulting Go callable with the inherited context and propagates the returned error using ordinary Go control flow. The Go target does not invent helper macros or wrappers around the call site; effect propagation is recorded by the effective-effect inference and applied during signature rendering and call-site lowering. ## Failable Handling A function whose effective effect set contains `failable` carries an `error` second result in its Go signature (see [Effects](#effects)). The effective set is the union of the declared effects and the effects of every function the body transitively calls; a non-failable declaration whose body calls a `failable` function still renders as `(T, error)` in Go. The surface program never has to acknowledge this transport (see [language/semantics.md](../language/semantics.md#failable-handling)); the Go target plumbs it transparently: - `fail "message"` emits `return zero, errors.New("message")` where `zero` is the success-typed zero value (`*new(R)`). - `fail err` where `err` is an `Error` value emits `return zero, errors.New(err.Message)`. - A call whose callee is effectively failable, inside a body that is also effectively failable, is rewritten to receive both results and to short-circuit on a non-nil error. For a binding `let x = f()` the emission is `x, __mbErr := f(); if __mbErr != nil { return zero, __mbErr }`; for `return f()` it is the analogous tuple form. The user never writes the guard. A match expression cannot observe the Error path of a `failable` call subject because the surface type of the call is `R`; the synthesized local for the failure value is internal to this lowering. 
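
A minimal hand-written sketch of the propagation guard described above, assuming two hypothetical functions `parsePositive` (declared failable) and `double` (failable only by inference). The `__mbErr` local and the `*new(R)` zero value follow the text; everything else is illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// parsePositive stands in for a declared-failable function. A source-level
// `fail "not positive"` becomes a return of the zero value plus errors.New.
func parsePositive(n int) (int, error) {
	if n <= 0 {
		return *new(int), errors.New("not positive")
	}
	return n, nil
}

// double is not declared failable in this sketch, but it calls a failable
// function, so its effective signature gains the trailing error result.
func double(n int) (int, error) {
	// Shape of the lowering for `let x = parsePositive(n)`.
	x, __mbErr := parsePositive(n)
	if __mbErr != nil {
		return *new(int), __mbErr
	}
	return x * 2, nil
}

func main() {
	fmt.Println(double(4))
	fmt.Println(double(-1))
}
```

The guard is ordinary Go control flow, so no helper or wrapper is needed at the call site.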
## Imports

The Go target's emitter is responsible for assembling the file's import block from the symbols referenced during rendering. A symbol carries the Go import path of the package it lives in and the unqualified identifier name. The emitter:

- Aggregates the set of referenced import paths across every declaration written into a file.
- Picks a default alias from the path's last segment (skipping a trailing `/vN` major-version suffix).
- Renames later occurrences by appending a numeric suffix until uniqueness is reached when two distinct paths share the same default alias.
- Writes a Go `import (...)` block listing every referenced path; unused references produce no import.
- Rewrites identifier renderings as `alias.Name` for every external reference.

The default alias of the local package is empty: symbols declared in the same generated file reference each other without qualification.

Targets MUST express external references through this emitter mechanism. Inlining raw import paths into source bypasses collision handling and is forbidden.

## Determinism

Generated files are deterministic with respect to the input modules and options. Constants appear in source order within each module file. Union and wrapper type declarations appear sorted by interface name in `masterbelt_unions.go`. Map literal entries appear in lowering order (first occurrence position, last-wins value). Import blocks are emitted in lexicographic path order.

# TypeScript Code Generation

Source: https://masterbelt.dev/spec-src/codegen/typescript.md

# TypeScript Code Generation

This document defines the TypeScript code generation target.

## Kind

The target kind identifier is `typescript`.

## Options

The TypeScript target reads the following options from its `options` mapping in the project configuration:

- `storage: STRING` selects the master-data backend baked into the generated package. Optional; defaults to `memory`.
Accepted values are `memory` (the in-memory executor that consumes records supplied through the `MasterData` constructor / `loadJSON`) and `sql` (a SQL-backed executor that translates queries into SQL against a host-supplied `SQLiteDatabase` adapter; see [Master Data](#master-data)). A `storage` value other than `memory` or `sql` is a configuration error.

Unknown options are silently ignored to leave room for future extension.

## File Set

For a project whose lowered IR modules are `m1.mst`, `m2.mst`, ..., the TypeScript target produces:

- One generated file per Masterbelt module. The file name is the module name with the `.mst` suffix replaced by `.ts`. Each file contains the TypeScript declarations corresponding to that module's constants.
- A `masterbelt_masterdata.ts` file when the project declares at least one master. The file declares the `MasterData` class described in [Master Data](#master-data). The file is omitted when the project declares no master.
- A `masterbelt_query.ts` file when the project declares at least one master. The file declares `MasterSource`, `FieldRef`, the `Predicate` / `Ordering` interfaces, the exported concrete predicate / ordering classes that back the inspectable AST, the `and` / `or` / `not` combinators, the generic field-handle classes (`OrderedField`, `BoolField`), the `QueryPlan` value class, and the `executePlan` executor used by every generated relation terminal; see [Master Data](#master-data). The file is omitted when the project declares no master.

Native TypeScript union types remove the need for a separate sealed-interface file.

Future generator-managed files added to this target MUST use the reserved `masterbelt_` file name prefix to avoid collisions with user-named modules.

## Module Declaration

Each generated file is a TypeScript ES module. Public declarations carry the `export` keyword.
## Reachability

The TypeScript target follows the default reachability policy defined in `codegen/model`: only constants reachable from at least one `pub`-declared constant are emitted. Identifier references resolve against the surviving set, so a `pub const A = helper` declaration keeps `helper` even though `helper` itself is not public.

## Visibility

A Masterbelt constant declared with `pub` is emitted with the `export` keyword. A non-public constant is emitted without it. Identifier names are preserved verbatim because TypeScript's visibility is expressed by the keyword, not by the identifier's case.

## Constants

Each Masterbelt const item maps to one TypeScript `const` declaration. Doc comment lines are emitted as `// ` comments immediately before the declaration in source order.

## Type Mapping

| Masterbelt | TypeScript |
|------------|------------|
| `null` | `null` |
| `bool` | `boolean` |
| `int`, `uint`, `int8`, `uint8`, `int16`, `uint16`, `int32`, `uint32`, `int64`, `uint64` | `number` (a 64-bit float; integers whose magnitude exceeds 2^53 lose precision) |
| `string` | `string` |
| `list<T>` | `readonly T'[]` where `T'` is the mapping of `T` |
| `map<string, V>` | `Readonly<Record<string, V'>>` |
| `map<K, V>` (non-`string` `K`) | unsupported at this stage; reports a generation error |
| `T1 \| T2 \| ...` | A native TypeScript union of each member's mapping |
| `{f: T, ...}` | `{ readonly field: T'; ... }` structural object type |
| `fn(p: T, ...): R` | Arrow type `(p: T', ...) => R'` (see Function Types below) |
| `enum Name { ... }` | `[export] const enum Name { Variant = value, ... }` (see Enums below) |

Type declarations are resolved before mapping; declared types do not appear in generated code.

The TypeScript target restricts map key types to `string` for this iteration. Other key types do not have a clean object-literal representation in TypeScript and require a `ReadonlyMap`-shaped runtime; they will be added in a follow-up.
## Literal Mapping - `null` lowers to the literal `null`. - `true` and `false` lower to the literals `true` and `false`. - An integer literal lowers to its decoded value rendered in base 10. - A string literal lowers to a TypeScript double-quoted string with TypeScript escape sequences. - A list literal `[e1, ..., eN]` lowers to `[e1', ..., eN']`. - A map literal of type `map` lowers to an object literal `{ "k1": v1, ..., "kN": vN }`, with keys quoted and entries emitted in lowering order after last-wins deduplication. - An identifier reference lowers to the referent constant's identifier. - A product literal lowers to a TypeScript object literal `{ name: value, ... }`. Field names appear as bare identifiers and entries preserve the source order of the literal. ## Master Data A `master Foo { record { ... } static { ... } }` declaration follows the runtime model defined in [../masterdata/schema.md](../masterdata/schema.md#runtime-model). The TypeScript target emits per master: - `Record` — a TypeScript `type` alias that backs one row, following the regular [Product Types](#product-types) rules. - `Relation` — an exported TypeScript `class` that exposes the master's chainable query surface. The class is data-less: it carries a single `private readonly plan: QueryPlan<Record>` field with a default constructor that builds a fresh plan and an optional `plan?` constructor argument for copy-on-write chaining. Stage methods (`where`, `orderBy`, `thenBy`, `skip`, `take`) return a freshly-allocated `Relation` whose plan extends the receiver's plan; the receiver is never mutated. Terminals (`toArray`, `findBy`, `firstOrDefault`, `count`, `any`) take `(data: MasterData, signal: AbortSignal)` (plus primary-key arguments for `findBy`) and resolve records through the master's `Executor` obtained from `data.getExecutor()`. 
Every terminal carries the `asyncable`, `cancellable`, and `failable` effects per [../masterdata/schema.md](../masterdata/schema.md#runtime-model): the TypeScript target renders the `async`/`Promise` wrap (asyncable) and the trailing `signal: AbortSignal` parameter (cancellable); failure surfaces through a thrown exception (failable). Each user-declared static method becomes an instance method on the class; it always takes `(data: MasterData, ...declared args, signal: AbortSignal)`.
- Module-level entrypoint: `export const <master>: <Master>Relation = new <Master>Relation();`. Users write `items.where(...).toArray(data, signal)` against the const directly.

Both declarations live as siblings at module scope. The Masterbelt source identifier `<Master>` lowers to the module-level `<master>` const named after the master in camelCase.

Nested masters follow the same naming scheme on the flattened identifier: `master User { master Friendships { ... } }` emits `UserFriendshipsRecord`, `UserFriendshipsRelation`, and a module-level `userFriendships` const as siblings of the parent. See [Nested Masters](../masterdata/schema.md#nested-masters).

### MasterData Entry

A project that declares at least one master emits one `MasterData` class declaration. The class lives in a generator-managed file named `masterbelt_masterdata.ts`. It declares:

- One `private readonly <master>Records: readonly <Master>Record[]` field per master (lowerCamelCase) storing the per-master record set.
- A constructor `constructor(items: readonly ItemsRecord[], users: readonly UsersRecord[], ...)` that takes one record array per master in master-declaration order and assigns each to the matching field.
- One `getExecutor(): Executor<Record>` accessor per master. Every generated relation terminal calls `data.getExecutor()` to reach the active backend, then invokes `execute(plan, signal)` / `findByPK(plan, keys, signal)` on the returned executor.
Under `storage: memory` the accessor returns a `MemoryExecutor` that wraps the master's record array (and the `matchesPK` closure when the master declares a primary key). The TypeScript target never writes `MasterData` or its construction in code emitted from a Masterbelt source program: those identifiers exist for the host application to construct and inject the dataset. MasterData stores records — not relations — because every generated relation is a data-less plan; the dataset is supplied at every terminal call site. When `storage: sql` is configured, the layout above changes: - `MasterData` carries a single `private readonly database: SQLiteDatabase` adapter shared by every per-master executor; the per-master record arrays are not emitted. - The constructor takes the `SQLiteDatabase` instead of record arrays. The host application wraps its preferred SQLite binding (`better-sqlite3`, `node:sqlite`, `bun:sqlite`, `sql.js`, ...) in the `SQLiteDatabase` interface (exported from `masterbelt_query.ts`) and passes it. - `getExecutor()` returns a per-master `SqlExecutor` (emitted alongside the relation in the master's module file) that translates the relation's `QueryPlan` into SQL via `translatePlan` / `translatePKLookup`, runs it through the adapter, and materialises rows with the generated `scanRow` function. - `loadJSON` is not emitted under `storage: sql`. Hosts that want to seed a SQL backend from the JSON exporter should round-trip the data through the SQLite exporter (see [../masterdata/export-sqlite.md](../masterdata/export-sqlite.md)) and open the resulting database. The `SQLiteDatabase` / `SQLiteRow` adapter interfaces, the `Executor` / `MemoryExecutor` seam, and the `translatePlan` / `translatePKLookup` helpers all live in `masterbelt_query.ts`. The generated core never imports a concrete SQLite binding. The host wraps its runtime's binding in the `SQLiteDatabase` interface. 
The supported example adapter wraps Node's built-in `node:sqlite` (Node 22+); other bindings (`better-sqlite3`, `bun:sqlite`, `sql.js`) follow the same shape:

```ts
import { DatabaseSync } from "node:sqlite";
import type { SQLiteDatabase, SQLiteRow } from "./masterbelt_query";

export function nodeSqliteAdapter(db: DatabaseSync): SQLiteDatabase {
  return {
    all(sql, params): readonly SQLiteRow[] {
      return db.prepare(sql).all(...params) as SQLiteRow[];
    },
    get(sql, params): SQLiteRow | undefined {
      return db.prepare(sql).get(...params) as SQLiteRow | undefined;
    },
  };
}

// const data = new MasterData(nodeSqliteAdapter(new DatabaseSync("masterdata.db")));
```

The host constructs the SQL-backed dataset with `new MasterData(adapter)` (the same constructor entry the memory backend uses, with the adapter in place of record arrays) and threads it into every terminal call as the first argument.

The same file also declares `loadJSON(data: string | object): MasterData`, a helper that consumes the JSON document produced by the JSON exporter (see [../masterdata/export-json.md](../masterdata/export-json.md)) and returns a freshly wired `MasterData`. The helper uses the platform `JSON.parse` only and is independent of any backend library:

```ts
export function loadJSON(data: string | object): MasterData {
  const raw = typeof data === "string" ? JSON.parse(data) : data;
  return new MasterData(
    (raw.items ?? []) as ItemsRecord[],
    (raw.userFriendships ?? []) as UserFriendshipsRecord[],
    // ... one per master in declaration order
  );
}
```

Each generated `Record` type alias keeps its field names verbatim from source so the structural cast above is sufficient: TypeScript's structural typing makes a plain object with the right field shape interchangeable with one produced by a typed constructor. No decorators or runtime metadata are emitted on the record type itself.
### Static Body Rewrites A user-declared static method's body is rewritten so the planner-side master references resolve against the module-level relation const values and the threaded dataset: - `Master.toList()` inside the owning master's own static body lowers to `this.toArray(data, signal)`. The receiver is the data-less relation value the static method was invoked on, so a caller that chained stages before invoking the static observes those stages. - `Master.X` (any user-declared constant or method) inside the same owning master's body lowers to `this.X(data, ...args, signal)`. - `OtherMaster.toList()` lowers to `.toArray(data, signal)` against the module-level const, automatically imported when the call appears in a different module from where the master is declared. - `OtherMaster.X` (any other cross-master reference) lowers to `.X(data, ...args, signal)`. Every relation method, including user-declared statics and constants, accepts `(data: MasterData, ...declared args, signal: AbortSignal)` uniformly so the call-site rewrite stays straightforward; methods that do not declare a body-level need for either argument still take both. ### Top-Level Dataset Threading A top-level function or constant that transitively reaches any master static member (constant or method, including the built-in `toList()`) acquires the dataset as an explicit parameter: it receives `data: MasterData` as its first positional parameter and `signal: AbortSignal` as a trailing parameter, and forwards both to every call that also requires them. The Masterbelt source program never writes the parameters. The dataset-threading parameter is added before any other parameters declared by the function. It composes with the effect-driven signature transforms (`cancellable` ensures `signal: AbortSignal` is appended, `asyncable` wraps the return in `Promise`) without further interaction. ### Query API Every master emits the chainable surface directly on `Relation`. 
The TypeScript target uses a callback style: `where`, `orderBy`, and `thenBy` accept a callback that receives a typed `Fields` builder and returns a predicate or ordering AST. See [../masterdata/query.md](../masterdata/query.md) for the cross-target contract. The runtime types live in the generator-managed `masterbelt_query.ts` file (one file per project, emitted whenever the project declares at least one master): - `MasterSource` — a class with a single `readonly name` field identifying a master by its source-level name. - `FieldRef` — a class with a single `readonly name` field identifying a record field by its source-level name. - `Predicate` — `interface { evaluate(record: R): boolean }`. Parametrised by the record type so a predicate built for one master cannot be passed to another's `where` callback; the structural check fails on `R`. - `Ordering` — `interface { compare(a: R, b: R): number }`. - Concrete predicate / ordering classes (`EqPredicate`, `NePredicate`, `LtPredicate`, `LePredicate`, `GtPredicate`, `GePredicate`, `InPredicate`, `BetweenPredicate`, `BoolEqPredicate`, `BoolNePredicate`, `BoolInPredicate`, `AndPredicate`, `OrPredicate`, `NotPredicate`, `AscOrdering`, `DescOrdering`) that carry the operator-relevant metadata (`field`, `value`, `low` / `high`, `operands`) as public readonly fields so a backend can translate the node to SQL without invoking the per-record accessor. - `and(...preds: readonly Predicate[]): Predicate`, `or(...): Predicate`, `not(p: Predicate): Predicate` — combinators over a single record type. A mixed-record composition is a compile-time error. - `OrderedField` and `BoolField` — generic field-handle classes whose constructors take `(name: string, accessor: (record: R) => V)` so each node embeds the source-level field name into its `FieldRef`. Their comparison methods return the concrete predicate / ordering classes specialised to the owning record type. 
- `QueryPlan` — the inspectable AST value class wrapping `source`, `predicates`, `orderings`, `skip`, and `take`. Stage helpers (`withWhere`, `withOrderBy`, `withThenBy`, `withSkip`, `withTake`) return a new plan; the original is never mutated. - `executePlan(records, plan)` — the shared in-memory evaluator used by every generated `Relation` terminal. - `stableSort` — internal helper used by `executePlan`. The per-master user-facing declarations are: - `Fields` — an exported TypeScript `class` exposing one typed field-handle property per supported record field. Field handles for ordered values (numeric, string) expose `eq`, `ne`, `lt`, `le`, `gt`, `ge`, `in`, `between`, `asc`, and `desc`; field handles for bool values expose `eq`, `ne`, and `in` only. Each handle's comparison method returns the concrete predicate / ordering class specialised to the owning master, statically typed as `Predicate<Record>` / `Ordering<Record>` for the callback's return type. - `Relation` — see [Master Data](#master-data) for the relation class itself. Its `where` / `orderBy` / `thenBy` callbacks are typed `(fields: Fields) => Predicate<Record>` / `Ordering<Record>`, so the compiler rejects a predicate built against a different master at the call site. ```ts import { and } from "./masterbelt_query"; import { items } from "./data"; const weapons = await items .where(item => and(item.category.eq("weapon"), item.level.ge(10))) .orderBy(item => item.sortOrder.asc()) .take(10) .toArray(data, signal); ``` ### Scope Methods Each `pub` [scope](../masterdata/schema.md#scope-section) on a master emits an instance method on the `Relation` class; a non-`pub` scope is internal to Masterbelt source and is not emitted. The method name is the source scope name verbatim in camelCase (`genderedAdult` stays `genderedAdult`). 
Its parameters are the scope's declared parameters mapped through the regular [Type Mapping](#type-mapping); it takes no `data` and no `signal` because a scope is effect-free and synchronous. ```ts (): Relation ``` The method returns a freshly-allocated `Relation` whose plan extends the receiver's plan with the scope body's stages (chained scopes inlined), exactly like the built-in stage methods, so a scope composes with `where` / `orderBy` and with other scopes (`records.adult().gendered(1)`). The method builds a plan only, so it is identical regardless of the configured storage backend. A call to a non-`pub` sibling scope is inlined into the method body (with its parameters substituted), because only `pub` scopes are emitted as methods; a call to a `pub` sibling scope is a method call. An `indexed` scope adds no TypeScript surface beyond its `pub` flag; it only influences SQLite index generation. ### Select Projections Each `select Name { ... }` section on a master ([../masterdata/schema.md](../masterdata/schema.md#select-section)) emits a parallel set of TypeScript declarations alongside the source relation: - `Record` — an exported `type` alias carrying the projected fields. Field order matches the source order written in the select body. - `Fields` — an exported `class` exposing typed field-handle properties for the projected record. - `Relation` — an exported `class` carrying the source relation by value plus its own per-projection plan. Stage methods are copy-on-write; terminals mirror the source relation's surface, parametrised on the projected record type. - `select(): Relation` — a method on `Relation` that returns a fresh projected relation capturing the receiver's source-side plan. Terminals on the projected relation first apply the source-side plan, then project each surviving record into a `Record` by copying the named fields, and finally apply the projected plan to the projected slice. Projected relations do not expose `findBy`. 
```ts const summaries = await items .where(item => item.count.ge(10)) .selectSummary() .orderBy(summary => summary.name.asc()) .toArray(data, signal); ``` ### Join Operator Each `ref` field on a master's record ([../masterdata/relations.md](../masterdata/relations.md)) emits a parallel set of TypeScript declarations alongside the source relation: - `JoinPair` — an exported `type` alias `{ readonly left: Record; readonly right: Record; }` that aggregates the joined pair. - `JoinLeftFields` / `JoinRightFields` — exported `class`es exposing typed field handles for each side's record. Each handle's accessor reads `pair.left.` or `pair.right.` so predicates and orderings type-check against the pair. - `JoinFields` — an exported `class` whose `left` and `right` properties hold the per-side field-handle classes above. The pair relation's `where` / `orderBy` / `thenBy` callbacks receive an instance of this class. - `JoinRelation` — an exported `class` carrying the source relation by value, the right relation supplied at the call site, and its own pair-level plan. Stage and terminal methods mirror the source relation's surface, parametrised on the pair record. - `join(right: Relation): JoinRelation` — a method on `Relation` that returns a fresh joined relation capturing the receiver as the left source, the supplied relation as the right source, and a fresh pair-level plan. Terminals on the joined relation first call `toArray(data, signal)` on the source relation, then iterate the surviving left records and `await this.right.findBy(data, signal, leftRecord._, ...)` for each, pushing the pair on a successful match and dropping the row on `undefined` (INNER JOIN; `LEFT` / `RIGHT` / `FULL OUTER` deferred). Pair-level state (predicates, orderings, skip, take) then applies before the terminal returns. Joined relations do not expose `findBy`. 
```ts
const pairs = await b
  .joinARecord(a)
  .where(fields => fields.right.name.eq("alpha"))
  .orderBy(fields => fields.left.id.asc())
  .toArray(data, signal);
```

## Product Types

A Masterbelt product type lowers to a TypeScript object type literal with `readonly` modifiers on every field:

```ts
export type Item = {
  readonly name: string;
  readonly count: number;
};
```

Field order in the emitted type matches the field order written in source. A field's `readonly` modifier renders as the TypeScript `readonly` keyword on that field; fields without the modifier (or with the explicit `writable` modifier) emit no keyword and are assignable through the structural type.

## Unions

A union type lowers to the native TypeScript union of each member's mapping. Member ordering matches the canonical order recorded in the IR, which is lexicographic by member type spelling. No wrapper types, marker methods, or generated files are introduced; a value of a union type lowers to the unwrapped TypeScript value because the TypeScript union directly accepts every member.

For example, `bool | int` lowers to the type `boolean | number`, and a value `1` of that union lowers to the literal `1`.

## Function Types

A Masterbelt function type lowers to a TypeScript arrow type `(name: T, ...) => R`. A type declaration whose body is a function type emits a TypeScript `type` alias following the rule in [Type Declarations](#type-declarations):

```ts
export type BinaryOp = (left: number, right: number) => number;
export type Mapper<T, U> = (value: T) => U;
export type Summer = (initial: number, ...values: readonly number[]) => number;
```

Parameter names are preserved from source. A variadic parameter prefixed with `*` in source lowers to a TypeScript rest parameter prefixed with `...`; the parameter type is rendered as a `readonly` array of the declared element type because TypeScript requires variadic parameters to be array-typed.
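As a usage sketch, a value of the `Summer` alias above can be written as an ordinary arrow function with a rest parameter. The `sum` implementation here is illustrative and not generator output.

```ts
type Summer = (initial: number, ...values: readonly number[]) => number;

// The rest parameter collects the trailing arguments into a readonly array.
const sum: Summer = (initial, ...values) =>
  values.reduce((total, value) => total + value, initial);

const summed = sum(10, 1, 2, 3);
```

Because the rest parameter is typed as a `readonly` array, the implementation can read `values` but cannot mutate it in place.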
Effects defined in [Effects](#effects) shape the rendered signature:

- `cancellable` appends an `AbortSignal` parameter named `signal` to the parameter list.
- `failable` does not change the declared return type. The TypeScript signature renders the success type `R` only; the failure path is plumbed internally by the call-site lowering (see [Failable Handling](#failable-handling)). When combined with `asyncable` the wrapping is `Promise<R>`.
- `asyncable` wraps the return type in `Promise<R>` where `R` is the declared return type.

A function type that combines multiple effects applies every transformation listed above.

## Enums

A Masterbelt enum lowers to a TypeScript `const enum` declaration:

```ts
export const enum Status {
  Active = 0,
  Inactive = 1,
}
```

Variant names appear in source declaration order, each with its resolved integer value as the right-hand side. The `const enum` form keeps every member's compile-time value inlined at call sites so emitted code carries no runtime object for the enum. A member access expression `Enum.Variant` lowers to the literal `Enum.Variant` reference, which TypeScript resolves to the variant's integer literal at compile time.

## Functions and Methods

A top-level function declaration emits an `[export] function name(params): Return { ... }`. When the function's signature carries `asyncable`, the declaration is `async` and the return type is wrapped in `Promise<Return>`. Other effects follow the rules in [Effects](#effects).

Methods declared inside a product type are emitted as free-standing module functions rather than members of the type. The TypeScript target represents product types as `type` aliases (not classes), which do not carry methods; methods become functions named `OwnerType_method[Index](self, args)` where `Index` is a 1-based numeric suffix on overloaded methods (the first overload omits the suffix).
The receiver value is passed as the first argument named `self`, matching the Masterbelt implicit-receiver keyword so `self.field` inside a method body maps directly onto the synthesized parameter. A call expression `target(args)` emits `target(args)`. A method call `value.method(args)` emits the free-standing form `OwnerType_method(value, args)` with the overload suffix when applicable. A function literal `fn(params): R { ... }` is reserved for a follow-up; the current target does not synthesize TypeScript arrow expressions at expression positions. ## For Statements A Masterbelt for statement lowers to a TypeScript `for` statement. The IR subject shape selects the form: - `list` subject — `for (const name of subject) { ... }`. A skipped binding (`_` in source) renders as `_` (a TypeScript identifier the compiler accepts as a regular local; the destructuring case below uses TypeScript's `_` convention too). - `map` subject — `for (const [k, v] of Object.entries(subject)) { ... }`. Either binding renders as `_` when skipped. Map keys are always strings in the TypeScript target (see [Type Mapping](#type-mapping)); `Object.entries` produces the matching `[string, V]` tuples. - `range(start, end)` subject — `for (let i = start; i < end; i++) { ... }`. A `_` binding synthesizes `__mbI`, used only to advance the counter. A `break` statement lowers to TypeScript's `break`; a `continue` statement lowers to `continue`. The lowered subject expression is evaluated exactly once (TypeScript's `for-of` and counted-for forms both have a single evaluation point). Iteration over `Master.toList()` lowers through the [Master Data](#master-data) rewrite: `await .toArray(data, signal)` against the module-level relation const for a cross-master subject and `await this.toArray(data, signal)` when the iteration appears inside the owning master's own static body. 
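The three subject shapes can be pictured with hand-written TypeScript in the forms listed above. The data values and variable names are illustrative assumptions; only the loop forms follow the text.

```ts
const levels: readonly number[] = [3, 5, 8];
const stock: Readonly<Record<string, number>> = { potion: 2, elixir: 1 };

// list subject: for-of over the array.
let levelSum = 0;
for (const level of levels) {
  levelSum += level;
}

// map subject: Object.entries yields [string, V] tuples; the skipped value renders as `_`.
const stockKeys: string[] = [];
for (const [key, _] of Object.entries(stock)) {
  stockKeys.push(key);
}

// range(0, 3) subject: counted loop over the half-open interval [0, 3).
const indices: number[] = [];
for (let i = 0; i < 3; i++) {
  indices.push(i);
}
```

In each form the subject expression sits in a position the loop evaluates once, which is the single-evaluation property the section relies on.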
## Master Static Members A master's static section ([../masterdata/schema.md](../masterdata/schema.md#static-section)) emits its members on the master's `Relation` class (see [Master Data](#master-data)): - A `static const Name: T = value` emits as an instance method `Name(data: MasterData, signal: AbortSignal): T { return value }` on the relation. The body has the same dataset-rewriting rules as a method body so a constant initializer that calls another master's static member resolves through `.X(data, signal)`. - A `static fn name(params): R { body }` emits as `name(data: MasterData, ...params, signal: AbortSignal): R { body }` on the relation; the body is rewritten per the [Static Body Rewrites](#static-body-rewrites) above. A `pub` modifier on a static member emits the method as public on the class; a non-public member emits as `private`. Visibility is independent from the master's own `pub` modifier. The source-level access `Items.X` inside any callable's body lowers per the [Master Data](#master-data) rewrite rules: the owner-self case resolves to `this.X(data, signal)` (with `(data, ...args, signal)` when called with user arguments), and the cross-master case resolves to `.X(data, signal)` (with `(data, ...args, signal)` when called with user arguments) against the module-level relation const. ## Match Statements A Masterbelt match statement lowers to a TypeScript `if`/`else if` chain whose subject is bound once to a fresh local. Each arm contributes one branch: - A **type pattern** whose type is a primitive emits a `typeof` check (`number`, `string`, `boolean`) or a `value === null` check for the `null` type. A type pattern whose type is a user-declared product type emits a generated type predicate call `is(value)`; the predicate is emitted into `masterbelt_runtime.ts` once per product type referenced by any match statement and the calling module imports it via the standard cross-module import machinery. 
- An **enum pattern** lowers to a strict equality check against the enum variant (`value === Status.Active`). - A **literal pattern** lowers to a strict equality check against the literal value. - A **product pattern** combines a `is` predicate call with strict-equality or sub-pattern checks for every named field. Field bindings inside the branch destructure the matched value (`const field = value.field`). - A **wildcard pattern** lowers to an unconditional `else` branch. Bindings introduced by a pattern are emitted as `const` declarations at the top of the branch body. When the subject expression is a plain identifier without an explicit pattern binding, the narrowed local shadows the original name by a `const name = value as NarrowedType;` declaration so that downstream uses see the narrowed type. The original identifier remains untouched outside the match. When the match is statically exhaustive, the chain ends with the spec's `else { value satisfies never; }` tail so TypeScript's type system records the exhaustiveness without inserting a runtime throw. A wildcard arm replaces this tail with the user's body. When a guarded arm appears, its `if` condition is the AND of the pattern check and the guard expression; the failing-guard path falls through to the next arm. A match statement is otherwise emitted at its source position inside the surrounding TypeScript function body and shares the function's local scope. The subject expression is evaluated exactly once. ## Operator Expressions The unary and binary operator expressions defined in [language/syntax.md](../language/syntax.md) reach the TypeScript target as method calls on the operand's type (see [language/builtins.md](../language/builtins.md)). When the receiver is a built-in primitive type (or an alias whose target resolves to a primitive), the TypeScript target emits the call as the corresponding native operator: - Numeric operands emit `+ - * / % == != < <= > >= & | ^ << >>` directly. 
Equality is rendered as strict `===` / `!==`. - `bool` operands emit `&& || === !==`; `and`/`or`/`xor` lower to `&& || !==` respectively. Unary `not` emits `!`. - `string` operands emit `+` for `add` and `===` / `!==` / `< <= > >=` for the comparisons. - Unary `plus`, `minus`, and `not` emit `+`, `-`, and `!`. A user product type that declares one of the operator method names continues to lower through the regular method call path: the call site emits the free-standing `OwnerType_method(receiver, args)` form (with the overload suffix when applicable). The native-operator rewrite applies only when the receiver's resolved type is a primitive. ### Built-in Field Accesses The fields defined in [language/builtins.md](../language/builtins.md) on built-in primitive and generic types lower to native TypeScript expressions: - `string.length` lowers to `[...receiver].length`. The spread iterates the string by codepoint (JavaScript's string iterator yields codepoints, not UTF-16 code units), so the resulting array length matches the spec's codepoint count. The naive `receiver.length` property is **not** used because JavaScript's string `length` returns the UTF-16 code unit count, which diverges from the spec for any non-BMP codepoint. - `list.size` lowers to the array `length` property. - `map.size` lowers to `Object.keys(receiver).length`. Maps are emitted as plain object literals at this stage, so there is no `Map`-style `size` getter to reuse. No runtime helper is emitted for these fields; the lowering inlines the access expression at the use site. ### Built-in Generic Operator Methods `list.add(other)` and `map.add(other)` defined in [language/builtins.md](../language/builtins.md) lower to a call into a runtime helper that the TypeScript target writes alongside the generated module files. 
The helper file is named `masterbelt_runtime.ts` (the reserved `masterbelt_` prefix as for any generator-managed file) and is emitted only when at least one generated module references a helper from it. The helper file declares the exported functions used at call sites: - `masterbeltListAdd(a: readonly T[], b: readonly T[]): readonly T[]` returns a fresh array containing the elements of `a` followed by the elements of `b`. - `masterbeltMapAdd(a: ReadonlyRecord, b: ReadonlyRecord): ReadonlyRecord` returns a fresh object containing every entry of `a` and every entry of `b`, with keys present in both taking the value from `b`. Call sites import the helper by relative path from the calling module (`./masterbelt_runtime`) using the emitter's normal cross-module import machinery, then emit `masterbeltListAdd(a, b)` or `masterbeltMapAdd(a, b)` as the call expression. ## Type Declarations Each Masterbelt type declaration emits one TypeScript `type` declaration: ```ts export type LocalName = MappedTargetType ``` Public declarations carry the `export` keyword; non-public declarations omit it. When the declaration carries type parameters, the emitted TypeScript declaration carries the same parameters as a TypeScript generic type parameter list (``). Parameters have no constraint at this stage. At a use site, a generic declaration is rendered with its type arguments: `LocalName`. ## Cross-Module References A reference to a symbol declared in another Masterbelt module emits the imported identifier name. The emitter adds the corresponding `import { ... } from "..."` declaration automatically; the module specifier is derived from the foreign module's canonical project-relative path by stripping the `.mst` suffix and computing the path relative to the importing file's directory, prefixed with `./` when the relative form would otherwise be unqualified. 
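The module-specifier derivation described above can be sketched as a small pure function. This is an illustration under stated assumptions, not the emitter's implementation: `moduleSpecifier` is a hypothetical name, and both arguments are assumed to be project-relative paths that use `/` separators.

```ts
// Hypothetical helper: derive the import specifier for `foreignModule`
// as seen from `importingFile`.
function moduleSpecifier(importingFile: string, foreignModule: string): string {
  // Directory segments of the importing file (the file name is dropped).
  const fromDir = importingFile.split("/").slice(0, -1);
  // Target path segments with the `.mst` suffix stripped.
  const target = foreignModule.replace(/\.mst$/, "").split("/");

  // Count the leading directory segments the two paths share.
  let common = 0;
  while (
    common < fromDir.length &&
    common < target.length - 1 &&
    fromDir[common] === target[common]
  ) {
    common++;
  }

  // Climb out of the unshared directories, then descend to the target.
  const ups = new Array<string>(fromDir.length - common).fill("..");
  const relative = [...ups, ...target.slice(common)].join("/");

  // Prefix `./` when the relative form would otherwise be unqualified.
  return relative.startsWith("../") ? relative : `./${relative}`;
}

const sameDir = moduleSpecifier("game/items.mst", "game/weapons.mst");
const nested = moduleSpecifier("game/items.mst", "game/data/weapons.mst");
const sibling = moduleSpecifier("game/items.mst", "shared/types.mst");
```

Without the `./` prefix, a same-directory result such as `weapons` would be read as a bare package specifier instead of a relative path.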
## Re-exports

A `pub { ForeignName as LocalName } from "./other.mst"` declaration emits one ES module re-export per declared module specifier:

```ts
export { ForeignName as LocalName } from "./other"
```

Multiple re-exports targeting the same foreign module are coalesced into one `export { ... } from "..."` statement. When `ForeignName` and `LocalName` are equal the `as` clause is omitted. A re-export keeps its declaration's doc comment as `// ...` lines immediately before the statement.

## Effects

Each effect defined in `codegen/model` maps to TypeScript as follows:

- `cancellable` adds an `AbortSignal` parameter named `signal` as the last named parameter of the callable.
- `failable` does not change the declared return type. The TypeScript signature renders the success type `R`; the failure path is plumbed internally by the call-site lowering (see [Failable Handling](#failable-handling)). When combined with `asyncable`, the result wraps as `Promise<R>`.
- `asyncable` wraps the result in `Promise<R>` and the callable is declared `async`. A call site whose callee is asyncable is wrapped in `await`.

Every effect is inferred along the call graph: the TypeScript target computes an effective effect set per callable by walking from the declared effects and propagating through every transitive call site to a fixed point. A function whose effective set differs from its declared set still renders with the inferred shape: a non-asyncable surface declaration whose body calls an asyncable function still becomes an `async function` returning `Promise<R>`, and a non-cancellable declaration that calls a cancellable function still receives and forwards the `AbortSignal`.

The target reports an explicit diagnostic when a program requires an effect that has no defined mapping rather than emitting code that silently drops the obligation.

## Failable Handling

A `failable` function in TypeScript reports failure through a thrown exception.
The Masterbelt surface treats `failable` as transparent (see [language/semantics.md](../language/semantics.md#failable-handling)); the TypeScript target uses native exception propagation so neither the function signature nor the call site need to widen: - `fail "message"` lowers to `throw new Error("message")`. - `fail value` where `value: Error` lowers to `throw new Error(value.message)`. - A call to a `failable` callable lowers to a plain call expression; a thrown `Error` propagates through the surrounding `failable` function because the surrounding function does not catch it. No `try`/`catch` is emitted at call sites. The platform `Error` constructor is sufficient; the target does not emit a separate Error type declaration. Match expressions cannot observe the Error path of a `failable` call subject because the surface type of the call is `R`. ## Imports The TypeScript target's emitter assembles the file's `import { ... } from '...'` block from the symbols referenced during rendering. A symbol carries the module specifier it lives in and the unqualified identifier name. The emitter: - Aggregates the set of referenced (module specifier, name) pairs. - When two distinct module specifiers export the same name, later occurrences are imported with `import { Name as Name2 } from '...'` so each local identifier is unique. - Writes one `import { ... } from '...'` statement per distinct module specifier, with bindings sorted lexicographically by their original name. - Rewrites identifier renderings as the chosen local name. Targets MUST express external references through this emitter mechanism. Inlining raw module specifiers into source bypasses collision handling and is forbidden. ## Determinism Generated files are deterministic with respect to the input modules. Constants appear in source order within each module file. Map literal entries appear in lowering order (first-occurrence position, last-wins value). 
Union members and import bindings are emitted in lexicographic order; import statements are emitted in lexicographic module-specifier order. # C# Code Generation Source: https://masterbelt.dev/spec-src/codegen/csharp.md # C# Code Generation This document defines the C# code generation target. ## Kind The target kind identifier is `csharp`. ## Options The C# target reads the following options from its `options` mapping in the project configuration: - `namespace: STRING` is the C# namespace used in every generated file. Required. The value is used verbatim as the file-scoped `namespace` directive and must be a valid C# qualified identifier (one or more identifier segments separated by `.`). - `class: STRING` is the name of the shared `public static partial class` every module contributes to. Optional; defaults to `Masterbelt`. The value must be a valid C# identifier (one segment, no `.`). - `storage: STRING` selects the master-data backend baked into the generated package. Optional; defaults to `memory`. Accepted values are `memory` (the in-memory executor that consumes records supplied through the `MasterData` constructor / `LoadJson`) and `sql` (a SQL-backed executor that translates queries into ADO.NET commands against a host-supplied `DbConnection` — see [Master Data](#master-data)). A missing or empty `namespace` is a configuration error. An invalid `class` value is a configuration error. A `storage` value other than `memory` or `sql` is a configuration error. The C# target does not consume any other options at this stage; unknown options are silently ignored. ## File Set For a project whose lowered IR modules are `m1.mst`, `m2.mst`, ..., the C# target produces: - One generated file per Masterbelt module. The file name is the module file stem in PascalCase followed by `.cs`. 
Each file declares one segment of the shared `public static partial class` (named by the `class` option, default `Masterbelt`) containing every reachable constant from that module, plus any product-type classes declared by the module as siblings at file scope. - A `MasterbeltUnions.cs` file when any constant has a union type. The file declares one sealed-abstract-class union per union type encountered across the project. - A `MasterbeltMasterData.cs` file when the project declares at least one master. The file declares the `MasterData` class described in [Master Data](#master-data). The file is omitted when the project declares no master. - A `MasterbeltQuery.cs` file when the project declares at least one master. The file declares `MasterSource`, `FieldRef`, the `IPredicate` / `IOrdering` interfaces, the exported concrete predicate / ordering node classes that back the inspectable AST, the static `Predicates` helper with `And` / `Or` / `Not`, the generic field-handle classes (`OrderedField`, `BoolField`), the `QueryPlan` value type, and the `QueryRuntime.Execute` helper used by every generated relation terminal; see [Master Data](#master-data). The file is omitted when the project declares no master. All files share the same C# namespace declared by the `namespace` option and live under the configured output root. ### Reserved file name prefix Files invented by the C# target itself, rather than derived from a Masterbelt source file name, use the reserved `Masterbelt` PascalCase prefix. New generator-managed files added to this target MUST use the same prefix. ## Shared Partial Class C# does not have free-standing top-level constants. Every Masterbelt module contributes its reachable constants and re-exports to one shared `public static partial class` declared in the configured namespace. 
The class name is taken from the `class` option (default `Masterbelt`); each generated module file repeats the same `public static partial class` declaration with that name and adds its members. The class is always `public static partial`. Its visibility does not depend on whether any of its members are public: it is the container, not a participating symbol. Because every module shares one class, every top-level identifier across the project must be unique under its C# mapping; collisions are reported as a generation diagnostic. Product-type classes emitted from a module are declared as siblings of the partial class, at file scope rather than as nested types of the shared class. ## Reachability The C# target follows the default reachability policy defined in `codegen/model`: only constants reachable from at least one `pub`-declared constant are emitted. Identifier references resolve against the surviving set, so a `pub const A = helper` declaration keeps `helper` even though `helper` itself is not public. ## Visibility A Masterbelt constant declared with `pub` is emitted with the `public` access modifier on its class member; a non-public constant is emitted with `internal`. C# identifier names are preserved verbatim because C# expresses visibility through modifiers, not through identifier case. ## Constants and Static Fields Each Masterbelt const item maps to one C# class member: - A `const` field is used when the constant's Masterbelt type is `bool`, `string`, or any built-in fixed-width numeric type and the lowered expression is a corresponding literal. C# `const` fields are compile-time constants and require literal initializers. Native-width numerics (`int`, `uint`) cannot be C# `const` because `nint`/`nuint` are disallowed there, so they always fall back to the next bullet. - A `static readonly` field is used for every other case: `null`-typed constants, list and map literals, union-typed constants, and references to other constants. 
Doc comment lines are emitted as `///`-prefixed lines forming a single XML documentation comment block immediately before the member.

## Type Mapping

| Masterbelt | C# |
|---|---|
| `null` | `object?` (the value `null` lowers to the C# literal `null`) |
| `bool` | `bool` |
| `int` / `uint` | `nint` / `nuint` (C#'s native-width integers) |
| `int8` / `uint8` | `sbyte` / `byte` |
| `int16` / `uint16` | `short` / `ushort` |
| `int32` / `uint32` | `int` / `uint` |
| `int64` / `uint64` | `long` / `ulong` |
| `string` | `string` |
| `list<T>` | `IReadOnlyList<T'>` where `T'` is the C# mapping of `T` |
| `map<K, V>` | `IReadOnlyDictionary<K', V'>` where `K'` and `V'` are the C# mappings |
| `T1 \| T2 \| ...` | A sealed abstract class generated into `MasterbeltUnions.cs` (see Unions below) |
| `{f: T, ...}` | `public class Name { public required T' Field { get; init; } ... }` emitted at file scope next to the shared partial class; see Product Types below |
| `fn(p: T, ...): R` | `public delegate R Name(T p, ...)` at file scope when reached as a named type; `System.Func<...>` when used inline. See Function Types below |
| `enum Name { ... }` | `public enum Name : Storage { Variant = value, ... }` at file scope (see Enums below) |

Type declarations are resolved before mapping; declared types do not appear in generated code.

## Literal Mapping

- `null` lowers to the C# literal `null`.
- `true` and `false` lower to the C# literals `true` and `false`.
- An integer literal lowers to its decoded value rendered in base 10.
- A string literal lowers to a C# double-quoted string with C# escape sequences.
- A list literal `[e1, ..., eN]` lowers to a C# 12 collection expression `[e1', ..., eN']`. The surrounding declared list type fixes the runtime collection type, so the explicit element type is omitted to keep the output free of the `IDE0300: Collection initialization can be simplified` warning.
An empty list literal lowers to `[]`.
- A map literal of type `map<K, V>` lowers to `new Dictionary<K', V'> { [k1] = v1, ..., [kN] = vN }`, with entries emitted in lowering order after last-wins deduplication.
- An identifier reference lowers to the referent constant's identifier, qualified by its declaring class when necessary.
- An integer literal lowers to its decoded value rendered in base 10, with the suffix C# requires to bind the literal to the resolved numeric type: `UL` for `uint64`, `L` for `int64`, and `U` for `uint32`. Narrower fixed-width types (`int8`/`uint8`/`int16`/`uint16`/`int32`) and the native widths (`int`/`uint`) emit the bare digits because implicit conversion or target inference covers them.
- A product literal lowers to a C# target-typed object initializer `new() { Field = ..., ... }`. The surrounding declared type (a product-type declaration) supplies the runtime type, so the type name is omitted at the call site to keep the output free of the `IDE0090: 'new' expression can be simplified` warning. Initializers carry PascalCase property names and preserve the source order of the literal.

## Product Types

A Masterbelt type declaration whose target is a product type emits a `public class` at file scope, alongside the shared partial class:

```csharp
public class Item
{
    public required string Name { get; init; }
    public required int Count { get; init; }
}
```

Field names are converted to PascalCase to match the C# naming convention. Field order matches the field order written in source. The `required` modifier ensures every field must be supplied at construction.

Each property's setter accessor reflects the Masterbelt field modifier: a `readonly` field emits `{ get; init; }` so it cannot be reassigned after object initialization, while a field without the modifier (or with the explicit `writable` modifier) emits `{ get; set; }` and remains assignable.
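
As an illustration of the accessor rule, the sketch below assumes a hypothetical product type with a `readonly` `name` field and a writable `count` field, and shows how the two accessors differ at a use site. The type and field names are made up for the example.

```csharp
using System;

// Target-typed initializer, as a product literal would be lowered.
Item item = new() { Name = "potion", Count = 1 };

item.Count = 2;            // allowed: Count came from a writable field
// item.Name = "elixir";   // would not compile: Name is init-only

Console.WriteLine($"{item.Name} x{item.Count}");

// Hypothetical generated class for the product type described above.
public class Item
{
    // readonly field: init-only after construction
    public required string Name { get; init; }

    // writable field: regular setter
    public required int Count { get; set; }
}
```

Omitting either property from the initializer is a compile-time error because both are `required`.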
Anonymous nested product types written inline in a Masterbelt declaration body are normalized to named declarations by the lowering pass (see [ir.md](../language/ir.md#anonymous-product-hoisting)), so the C# target emits each one as a sibling class with the synthesized name.

## Unions

A union type lowers to a C# sealed abstract class. For a canonical union `T1 | T2 | ... | TN`:

- The class type is named by concatenating the PascalCase member type names with `Or`. For primitive members, the names are `Null`, `Bool`, `Int`, `String`. The order matches the canonical order recorded in the IR.
- The class has a private parameterless constructor so external code cannot construct arbitrary subclasses.
- For each non-null member type, a nested `public sealed class` wrapper is declared that inherits from the abstract base. The wrapper exposes a `public required` property named `Value` of the member's mapped C# type. Inheritance lets `is`-pattern checks on a union variable identify the matched member (used by match statement lowering), and a nested class has access to the base class's private parameterless constructor so external assemblies still cannot construct arbitrary subclasses.
- The `null` member does not get a wrapper. The C# `null` literal directly satisfies a nullable reference to the abstract base, so a union variable of static type `T1Or...?` holding `null` represents the null case.

A value of a union type lowers as `new <Union>.<Member> { Value = <value> }` for non-null members and `null` for null.

At this stage, only unions of primitive types are supported. A union containing a generic member is a generation error.
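
A minimal sketch of the class shape described above, for a hypothetical `int | string` union. The wrapper names `Int_` and `String_` follow the `IntOrString.Int_` spelling used in the match-statement section; the `{ get; init; }` accessors are an assumption.

```csharp
using System;

IntOrString value = new IntOrString.Int_ { Value = 42 };
Console.WriteLine(value is IntOrString.Int_);
Console.WriteLine(value is IntOrString.String_);

// Hypothetical contents of MasterbeltUnions.cs for `int | string`.
public abstract class IntOrString
{
    // Private constructor: only the nested wrappers below can derive from the base.
    private IntOrString() { }

    public sealed class Int_ : IntOrString
    {
        // Masterbelt `int` maps to nint.
        public required nint Value { get; init; }
    }

    public sealed class String_ : IntOrString
    {
        public required string Value { get; init; }
    }
}
```

A class declared outside `IntOrString` cannot inherit from it, because the only constructor is private; that is what keeps the set of members closed.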
## Type Declarations

Each Masterbelt type declaration whose target is not a product type emits one C# file-scoped using-alias directive placed immediately after the regular `using` directives and before the `namespace` declaration:

```csharp
using LocalName = MappedTargetType;
```

When the declaration carries type parameters, the using directive carries the same parameters as a generic using-alias parameter list (for example `<T>`). Generic using-aliases require C# 12 or later. C# does not distinguish public and non-public for file-scoped aliases; the directive is visible everywhere in the file.

A type declaration whose target is a product type does not emit a using-alias directive; it emits a `public class Name { ... }` at file scope. See the Product Types section above for the class shape; the parameter list is the same parameter list as the declaration's. At a use site, a generic class is rendered with its type arguments, for example `LocalName<int>`.

## Enums

A Masterbelt enum lowers to a `public enum Name : Storage { ... }` declaration at file scope alongside the shared partial class:

```csharp
public enum Status : sbyte
{
    Active = 0,
    Inactive = 1,
}
```

The storage clause uses the C# spelling defined in [Type Mapping](#type-mapping) (`int8` → `sbyte`, `int32` → `int`, and so on). Variant names appear in source declaration order with their resolved integer values.

A member access expression `Enum.Variant` lowers to the C# member access `Enum.Variant`. When the variant is used as an initializer for a shared-class member, the variant is emitted with its enum-type qualifier.

## Functions and Methods

A top-level function declaration emits a `[public|internal] static Result Name(params) { ... }` method on the shared partial class. Asyncable functions are declared `async` and return `Task<Result>`; cancellable functions append a `CancellationToken cancellationToken` parameter.

A method declared inside a product type emits a `[public] Result Name(params) { ... }` instance method on the owning class.
Overloaded methods rely on C#'s native overloading by signature: the emitted methods all share the same name, and no numeric suffix is needed. The Masterbelt implicit-receiver keyword `self` is rewritten to C#'s `this` keyword inside method bodies so the surface form maps onto C#'s idiom.

A call expression `target(args)` emits `Target(args)` (PascalCase identifier resolution for functions). A method call `value.method(args)` emits `value.Method(args)`.

A function literal `fn(params): R { ... }` is reserved for a follow-up; the current target does not synthesize C# lambdas at expression positions.

## For Statements

A Masterbelt for statement lowers to a C# `foreach` or `for` statement. The IR subject shape selects the form:

- `list` subject: `foreach (var name in subject) { ... }`. A skipped binding (`_` in source) is rendered as the C# discard pattern `_`.
- `map` subject: `foreach (var (k, v) in subject) { ... }`. Either binding renders as `_` when skipped. The C# target's map type (`IReadOnlyDictionary<K', V'>`) yields `KeyValuePair<K', V'>` entries; the foreach pattern deconstructs them via the language's built-in tuple deconstruction support on `KeyValuePair`.
- `range(start, end)` subject: `for (var i = start; i < end; i++) { ... }`. A `_` binding synthesizes `__mbI` used only to advance the counter.

A `break` statement lowers to C#'s `break`; a `continue` statement lowers to `continue`. The lowered subject expression is evaluated exactly once.

Iteration over a master's `toList()` lowers through the [Master Data](#master-data) rewrite (`await this.ToList(data, cancellationToken)` inside the owning master's own static body, `await Masterbelt.<Master>.ToList(data, cancellationToken)` for a cross-master subject) and walks the returned `IReadOnlyList`. The `await` placement comes from the regular asyncable inheritance machinery: the surrounding static method becomes `async Task` silently and the for-loop subject lifts into `await ...` at the C# call site.
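
The three loop forms can be sketched as follows. The variable names and sample data are hypothetical; only the statement shapes come from the rules above.

```csharp
using System;
using System.Collections.Generic;

IReadOnlyList<string> names = ["a", "b"];
IReadOnlyDictionary<string, int> counts =
    new Dictionary<string, int> { ["a"] = 1, ["b"] = 2 };

// list subject
foreach (var name in names)
{
    Console.WriteLine(name);
}

// map subject: each KeyValuePair is deconstructed into (key, value)
foreach (var (key, count) in counts)
{
    if (count > 1) break;
    Console.WriteLine($"{key}={count}");
}

// range(0, 3) subject
for (var i = 0; i < 3; i++)
{
    if (i == 1) continue;
    Console.WriteLine(i);
}
```

The map form relies on the `Deconstruct` method that `KeyValuePair<TKey, TValue>` provides in modern .NET, so no helper type is needed.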
## Master Data A `master Foo { record { ... } static { ... } }` declaration follows the runtime model defined in [../masterdata/schema.md](../masterdata/schema.md#runtime-model). The C# target emits per master: - `Record` — a C# class that backs one row. Field naming, modifiers, and constructor / property generation follow the regular [Product Types](#product-types) rules. - `Relation` — a C# `public sealed class` declared at file scope alongside the record. The class is data-less: it carries a single `private readonly QueryPlan<Record> plan` field plus a default constructor that initialises a fresh plan and an `internal` constructor that accepts a pre-built plan for copy-on-write chaining. Stage methods (`Where`, `OrderBy`, `ThenBy`, `Skip`, `Take`) return a freshly-allocated `Relation` whose plan extends the receiver's plan; the receiver is never mutated. Terminals (`ToList`, `AsAsyncEnumerable`, `FindBy`, `FirstOrDefault`, `Count`, `Any`) take `(MasterData data, CancellationToken cancellationToken)` (plus primary-key arguments for `FindBy`) and resolve records through the master's `IExecutor` obtained from `data.GetExecutor()`. Every terminal carries the `asyncable`, `cancellable`, and `failable` effects per [../masterdata/schema.md](../masterdata/schema.md#runtime-model) — the C# target renders the `async`/`Task` wrap (asyncable) and the trailing `CancellationToken cancellationToken` parameter (cancellable); failure surfaces through a thrown exception (failable). `AsAsyncEnumerable` is the streaming counterpart of `ToList`: it returns `IAsyncEnumerable<Record>` and uses `[EnumeratorCancellation]` on its `cancellationToken` parameter so cancellation flows through `await foreach`. Each user-declared static method becomes an instance method on the relation; it always takes `(MasterData data, ...declared args, CancellationToken cancellationToken)`. - Package-level entrypoint: `public static partial class Masterbelt { public static readonly Relation = new(); }`. 
Users write `Items.Where(...).ToList(data, cancellationToken)` after adding `using static .Masterbelt;` (or `Masterbelt.Items.Where(...)` with no using directive). Both declarations live as siblings at file scope. Nested masters follow the same naming scheme on the flattened identifier — `master User { master Friendships { ... } }` emits `UserFriendshipsRecord`, `UserFriendshipsRelation`, and a `Masterbelt.UserFriendships` static field as siblings of the parent. See [Nested Masters](../masterdata/schema.md#nested-masters). ### MasterData Entry A project that declares at least one master emits one `MasterData` class declaration. The class lives in a generator-managed file named `MasterbeltMasterData.cs`. It declares: - One `private readonly IReadOnlyList<Record> Records` field per master (lowerCamelCase), storing the per-master record set. - A constructor `public MasterData(IReadOnlyList<Record> items, IReadOnlyList<Record> users, ...)` that takes one read-only list per master in master-declaration order and assigns each to the matching field. - One `internal IExecutor<Record> GetExecutor()` accessor per master. Every generated relation terminal calls `data.GetExecutor()` to reach the active backend, then `await`s `Execute(plan, cancellationToken)` / `FindByPK(plan, keys, cancellationToken)` on the returned executor. Under `storage: memory` the accessor returns a `MemoryExecutor<Record>` wrapping the master's record list (and the `Relation.MatchesPK` closure when the master declares a primary key). The C# target never writes `MasterData` in code emitted from a Masterbelt source program: the identifier exists for the host application to construct and inject the dataset. MasterData stores records — not relations — because every generated relation is a data-less value-typed plan; the dataset is supplied at every terminal call site. 
When `storage: sql` is configured, the layout above changes: - `MasterData` carries a single `private readonly DbConnection connection` (from `System.Data.Common`) shared by every per-master executor; the per-master record lists are not emitted. - The public positional constructor is replaced by a factory `public static Task NewSqliteMasterData(DbConnection connection, CancellationToken cancellationToken)`. The host application opens the connection with its preferred SQLite provider (`Microsoft.Data.Sqlite`, `System.Data.SQLite`, a Unity binding, ...) and owns its open/close lifetime; `MasterData` never closes it. - `GetExecutor()` returns a per-master `SqlExecutor` (emitted alongside the relation) that translates the relation's `QueryPlan` into a parameterised `DbCommand` via `SqlTranslator.TranslatePlan` / `TranslatePKLookup`, runs it with `ExecuteReaderAsync`, and materialises rows with the generated `ScanRow`. - `LoadJson` is not emitted under `storage: sql`. Hosts that want to seed a SQL backend from the JSON exporter should round-trip the data through the SQLite exporter (see [../masterdata/export-sqlite.md](../masterdata/export-sqlite.md)) and open the resulting database. The `IExecutor` / `MemoryExecutor` seam, the `ISqlEmittable` / `ISqlOrderable` interfaces, the `SqlFragment` carrier, the `SqlTranslator` helpers, and the `SqlSupport` query runner all live in `MasterbeltQuery.cs`. Generated code depends only on `System.Data.Common`; it never references a concrete SQLite provider package. Under `storage: sql`, a `ref` join and a `select` projection are pushed into SQL when the source relation's plan is trivial: the join becomes one `INNER JOIN` statement (rather than a per-row `FindBy`), and the projection becomes a column-narrowed `SELECT`. 
Each derived relation captures the source plan's triviality at the accessor (`SqlTranslator.IsTrivial`); a non-trivial source plan (a source-side `Where`/`OrderBy`/`Skip`/`Take`) or the memory backend falls back to the materialise-and-evaluate path. Both paths return identical results.

Because the factory accepts a `DbConnection` directly, no adapter is needed: a provider's connection type already derives from `DbConnection`. With `Microsoft.Data.Sqlite`:

```csharp
using Microsoft.Data.Sqlite;

await using var connection = new SqliteConnection("Data Source=masterdata.db");
await connection.OpenAsync(cancellationToken);
var data = await MasterData.NewSqliteMasterData(connection, cancellationToken);
// var items = await Masterbelt.Items.Where(f => f.Count.Ge(20)).ToList(data, cancellationToken);
```

The host owns the connection's open/close lifetime; `MasterData` never disposes it. Any `System.Data.Common`-compatible provider (`Microsoft.Data.Sqlite`, `System.Data.SQLite`, a Unity SQLite binding) works without changes to the generated code.

The same file also declares `public static MasterData LoadJson(string json)`, a helper that consumes the JSON document produced by the JSON exporter (see [../masterdata/export-json.md](../masterdata/export-json.md)) and returns a freshly wired `MasterData`. The helper uses `System.Text.Json` only:

```csharp
public static MasterData LoadJson(string json)
{
    var raw = JsonSerializer.Deserialize<JsonShape>(json, JsonOptions);
    return new MasterData(raw.items, raw.userFriendships /* ... */);
}

private static readonly JsonSerializerOptions JsonOptions = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

private sealed class JsonShape
{
    [JsonPropertyName("items")]
    public IReadOnlyList<ItemsRecord> items { get; init; }

    [JsonPropertyName("userFriendships")]
    public IReadOnlyList<UserFriendshipsRecord> userFriendships { get; init; }

    // ... one per master in declaration order
}
```

Each generated `Record` property carries a `[JsonPropertyName("<name>")]` attribute so the C# `Id` property maps to JSON key `id`, `Name` maps to `name`, and so on. The surface name is the master's source-level field identifier verbatim. A `ref` field expands to the underlying primary-key fields under the surrounding field's name joined with `_` (`field_pk1`, `field_pk2`, ...); the `[JsonPropertyName(...)]` on each expanded leaf carries the joined source name. Other product types not declared inside a `master` block do not receive JSON attributes.

### Static Body Rewrites

A user-declared static method's body is rewritten so the planner-side master references resolve against the package-level relation values and the threaded dataset:

- `Master.toList()` inside the owning master's own static body lowers to `this.ToList(data, cancellationToken)`. The receiver is the data-less relation value the static method was invoked on, so a caller that chained stages before invoking the static observes those stages.
- `Master.X` (any user-declared constant or method) inside the same owning master's body lowers to `this.X(data, cancellationToken, ...args)`.
- `OtherMaster.toList()` lowers to `Masterbelt.<OtherMaster>.ToList(data, cancellationToken)` against the static field on the partial class.
- `OtherMaster.X` (any other cross-master reference) lowers to `Masterbelt.<OtherMaster>.X(data, cancellationToken, ...args)` against the static field.

Every relation method, including user-declared statics and constants, accepts `(MasterData data, ...declared args, CancellationToken cancellationToken)` uniformly so the call-site rewrite stays straightforward; methods that do not declare a body-level need for either argument still take both.
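
To make the rewrite concrete, here is a heavily simplified, self-contained sketch. The master `Items`, its static `cheap(limit)`, and the reduced `MasterData` (a public list instead of executors) are all hypothetical stand-ins; real generated code routes terminals through `IExecutor` and a `QueryPlan`. The parameter order follows the declared-signature rule in the last paragraph above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

var data = new MasterData(new List<ItemsRecord>
{
    new() { Name = "potion", Count = 5 },
    new() { Name = "elixir", Count = 50 },
});
var cheap = await Masterbelt.Items.Cheap(data, 10, CancellationToken.None);
Console.WriteLine(cheap.Count);

public class ItemsRecord
{
    public required string Name { get; init; }
    public required int Count { get; init; }
}

// Stand-in only: the real MasterData keeps its records private.
public sealed class MasterData
{
    public IReadOnlyList<ItemsRecord> Items { get; }
    public MasterData(IReadOnlyList<ItemsRecord> items) => Items = items;
}

public sealed class ItemsRelation
{
    // Built-in terminal, reduced to returning the in-memory list.
    public Task<IReadOnlyList<ItemsRecord>> ToList(
        MasterData data, CancellationToken cancellationToken)
    {
        cancellationToken.ThrowIfCancellationRequested();
        return Task.FromResult(data.Items);
    }

    // Source-level `Items.toList()` inside the static body became this.ToList(...).
    public async Task<IReadOnlyList<ItemsRecord>> Cheap(
        MasterData data, int limit, CancellationToken cancellationToken)
    {
        var all = await this.ToList(data, cancellationToken);
        return all.Where(record => record.Count < limit).ToList();
    }
}

public static partial class Masterbelt
{
    // Package-level relation value that call sites start from.
    public static readonly ItemsRelation Items = new();
}
```

The host passes `data` and the token once at the outermost call; every nested call forwards the same pair, which is why the Masterbelt source never mentions either parameter.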
### Top-Level Dataset Threading A top-level function or constant that transitively reaches any master static member (constant or method, including the built-in `toList()`) acquires the dataset as an explicit parameter: it receives `MasterData data` as its first positional parameter and `CancellationToken cancellationToken` as a trailing parameter, and forwards both to every call that also requires them. The Masterbelt source program never writes the parameters. The dataset-threading parameter is added before any other parameters declared by the function. It composes with the effect-driven signature transforms (`cancellable` ensures `CancellationToken cancellationToken` is appended, `asyncable` wraps the return in `Task`) without further interaction. ### Query API Every master emits the chainable surface directly on `Relation`. The C# target uses a callback style: `Where`, `OrderBy`, and `ThenBy` accept a callback that receives a typed `Fields` instance and returns a predicate or ordering AST. Per the design doc, the predicate is a generated AST type and **not** an `Expression>` for this phase; future SQL-translating backends may revisit the choice. See [../masterdata/query.md](../masterdata/query.md) for the cross-target contract. The runtime types live in the generator-managed `MasterbeltQuery.cs` file (one file per project, emitted whenever the project declares at least one master): - `MasterSource` — a `public sealed record(string Name)` identifying a master by its source-level name. - `FieldRef` — a `public sealed record(string Name)` identifying a record field by its source-level name. - `IPredicate` — `interface { bool Evaluate(R record); }`. Parametrised by the record type so a predicate built for one master cannot be passed to another's `Where` callback; the compiler rejects the mismatch on `R`. - `IOrdering` — `interface { int Compare(R a, R b); }`. 
- Concrete predicate / ordering classes (`EqPredicate`, `NePredicate`, `LtPredicate`, `LePredicate`, `GtPredicate`, `GePredicate`, `InPredicate`, `BetweenPredicate`, `BoolEqPredicate`, `BoolNePredicate`, `BoolInPredicate`, `AndPredicate`, `OrPredicate`, `NotPredicate`, `AscOrdering`, `DescOrdering`) that carry the operator-relevant metadata (`Field`, `Value`, `Low` / `High`, `Operands`) as public properties so a backend can translate the node to SQL without invoking the per-record accessor. - `Predicates.And(params IPredicate[])`, `Predicates.Or(...)`, `Predicates.Not(IPredicate)` — static combinators that compose predicates over a single record type. A mixed-record composition is a compile-time error. - `OrderedField where V : IComparable` and `BoolField` — generic field-handle classes whose comparison methods return concrete predicate / ordering nodes specialised to the same record type. The constructors take `(string name, Func accessor)` so each node embeds the source-level field name into its `FieldRef`. - `QueryPlan` — the inspectable AST value type wrapping `Source`, `Predicates`, `Orderings`, `Skip`, and `Take`. Stage helpers (`WithWhere`, `WithOrderBy`, `WithThenBy`, `WithSkip`, `WithTake`) return a new plan; the original is never mutated. - `QueryRuntime.Execute(records, plan)` — the shared in-memory evaluator used by every generated `Relation` terminal. The per-master user-facing declarations are: - `Fields` — a `public sealed class` exposing one `public readonly` field-handle per supported record field plus a `public static readonly Fields Instance` singleton. Field handles for ordered values (numeric, string) expose `Eq`, `Ne`, `Lt`, `Le`, `Gt`, `Ge`, `In`, `Between`, `Asc`, and `Desc`; field handles for bool values expose `Eq`, `Ne`, and `In` only. 
Each handle's comparison method returns the concrete predicate / ordering class specialised to the owning master, statically typed as `IPredicate<Record>` / `IOrdering<Record>` for the callback's return type. - `Relation` — see [Master Data](#master-data) for the relation class itself. Its `Where` / `OrderBy` / `ThenBy` callbacks are typed `Func<Fields, IPredicate<Record>>` / `Func<Fields, IOrdering<Record>>`, so the compiler rejects a predicate built against a different master at the call site. ```csharp using static Example.Masters.Masterbelt; // ... var weapons = await Items .Where(item => Predicates.And(item.Category.Eq("weapon"), item.Level.Ge(10))) .OrderBy(item => item.SortOrder.Asc()) .Take(10) .ToList(data, cancellationToken); await foreach (var item in Items .Where(item => item.Category.Eq("weapon")) .AsAsyncEnumerable(data, cancellationToken)) { // ... } ``` ### Scope Methods Each `pub` [scope](../masterdata/schema.md#scope-section) on a master emits an instance method on the `Relation` class; a non-`pub` scope is internal to Masterbelt source and is not emitted. The method name is the source scope name in PascalCase (`genderedAdult` → `GenderedAdult`). Its parameters are the scope's declared parameters mapped through the regular [Type Mapping](#type-mapping); it takes no `MasterData` and no `CancellationToken` because a scope is effect-free and synchronous. ```csharp public Relation () ``` The method returns a freshly-allocated `Relation` whose plan extends the receiver's plan with the scope body's stages (chained scopes inlined), exactly like the built-in stage methods, so a scope composes with `Where` / `OrderBy` and with other scopes (`Records.Adult().Gendered(1)`). The method builds a plan only, so it behaves identically under `storage: memory` and `storage: sql`. 
A call to a non-`pub` sibling scope is inlined into the method body (with its parameters substituted), because only `pub` scopes are emitted as methods; a call to a `pub` sibling scope is a method call. An `indexed` scope adds no C# surface beyond its `pub` flag; it only influences SQLite index generation. ### Select Projections Each `select Name { ... }` section on a master ([../masterdata/schema.md](../masterdata/schema.md#select-section)) emits a parallel set of C# declarations alongside the source relation: - `Record` — a `public sealed class` carrying the projected fields. Field order matches the source order written in the select body. - `Fields` — a `public sealed class` exposing typed field handles for the projected record, with the same singleton `Instance` shape as `Fields`. - `Relation` — a `public sealed class` carrying the source relation by value plus its own pair-level `QueryPlan<Record>`. Stage methods are copy-on-write; terminals mirror the base relation's surface, parametrised on the projected record type. - `public Relation Select()` — a method on `Relation` that returns a fresh projected relation capturing the receiver's source-side plan. Terminals on the projected relation first apply the source-side plan, then project each surviving record into a `Record` by copying the named fields, and finally apply the projected plan to the projected slice. Projected relations do not expose `FindBy`. ```csharp var summaries = await Items .Where(item => item.Count.Ge(10)) .SelectSummary() .OrderBy(summary => summary.Name.Asc()) .ToList(data, cancellationToken); ``` ### Join Operator Each `ref` field on a master's record ([../masterdata/relations.md](../masterdata/relations.md)) emits a parallel set of C# declarations alongside the source relation: - `JoinPair` — a `public sealed class` with `public required Record Left { get; init; }` and `public required Record Right { get; init; }` properties that aggregate the joined pair. 
- `JoinLeftFields` / `JoinRightFields` — `public sealed class`es each exposing typed field handles for the corresponding side's record, plus a singleton `Instance` static field. Each handle's accessor reads `pair.Left.` or `pair.Right.`. - `JoinFields` — a `public sealed class` whose `Left` and `Right` instance fields point at the per-side singletons above, plus its own singleton `Instance` callback target. - `JoinRelation` — a `public sealed class` carrying the source relation by value, the right relation supplied at the call site, and its own pair-level plan. Stage and terminal methods mirror the source relation's surface, parametrised on the pair record. - `public JoinRelation Join(Relation right)` — a method on `Relation` that returns a fresh joined relation capturing the receiver as the left source, the supplied relation as the right source, and a fresh pair-level plan. Terminals on the joined relation first `await source.ToList(data, cancellationToken)`, then iterate the surviving left records and `await right.FindBy(data, cancellationToken, leftRecord._, ...)` for each, calling `pairs.Add(new JoinPair { Left = ..., Right = ... })` on a successful match and dropping the row on `null` (INNER JOIN; `LEFT` / `RIGHT` / `FULL OUTER` deferred). Pair-level state (predicates, orderings, skip, take) then applies through `QueryRuntime.Execute` before the terminal returns. Joined relations do not expose `FindBy`. ```csharp var pairs = await B .JoinARecord(A) .Where(fields => fields.Right.Name.Eq("alpha")) .OrderBy(fields => fields.Left.Id.Asc()) .ToList(data, cancellationToken); ``` ## Match Statements A Masterbelt match statement lowers to a C# `switch` statement on the lowered subject expression. The C# target relies on the language's pattern matching syntax to express every arm without synthesizing helper methods. - A **type pattern** `T as name` lowers to a C# type pattern `case T name:`. 
A union type pattern matches against the union's nested wrapper class (`case IntOrString.Int_ wrapper:` followed by `var name = wrapper.Value;` for the user binding). A type pattern against the union's null member lowers to `case null:`. - An **enum pattern** lowers to a value pattern `case Status.Active:`. - A **literal pattern** lowers to a constant pattern `case 1:`, `case "a":`, `case true:`, or `case null:`. - A **product pattern** lowers to a recursive pattern `case T { Field: pattern, ... } prefix:`. Field sub-patterns reuse the same pattern lowering rules. Short field bindings emit `var` patterns (`{ Field: var name }`) so the field value is captured directly. - A **wildcard pattern** lowers to `default:`. A `|`-separated alternative list emits one `case` label per alternative on the same arm body. When alternatives introduce a binding, the binding is rendered with C#'s `or` pattern combinator inside a single case label. Bindings introduced by a pattern are emitted as local variables declared by the pattern itself (C# pattern matching binds directly), so the arm body uses the bound names without extra declarations. When the subject expression is a plain identifier without an explicit pattern binding, the C# target renders the narrowed binding by introducing `var name = ;` at the top of the arm body. A guard is rendered as a `when` clause on the case label (`case T name when condition:`). A guarded arm whose guard evaluates to `false` is skipped per C#'s normal switch semantics; matching continues with the next case. A wildcard arm becomes the `default:` label. When the match is statically exhaustive without a wildcard, the C# target omits any `default:` label; C#'s compiler does not require one for switch statements (it only requires exhaustiveness on switch expressions). The Masterbelt checker has already proven exhaustiveness, so no fallback `throw` is synthesized. 
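
The arm shapes above can be sketched as one `switch` statement over a hypothetical `int | string | null` union. The union class is repeated so the example stands alone; the guard expression, the braces around each arm body, and all names are assumptions made for illustration.

```csharp
#nullable enable
using System;

Console.WriteLine(MatchDemo.Describe(new IntOrString.Int_ { Value = 7 }));
Console.WriteLine(MatchDemo.Describe(new IntOrString.String_ { Value = "a" }));
Console.WriteLine(MatchDemo.Describe(null));

public static class MatchDemo
{
    public static string Describe(IntOrString? subject)
    {
        var result = "";
        switch (subject)
        {
            // Union type pattern with a guard rendered as a `when` clause.
            case IntOrString.Int_ wrapper when wrapper.Value > 5:
            {
                var n = wrapper.Value;
                result = $"big int {n}";
                break;
            }
            // Same member without a guard: reached when the guard above is false.
            case IntOrString.Int_ wrapper:
            {
                var n = wrapper.Value;
                result = $"int {n}";
                break;
            }
            case IntOrString.String_ wrapper:
            {
                var s = wrapper.Value;
                result = $"string {s}";
                break;
            }
            // The union's null member.
            case null:
                result = "null";
                break;
            // No default label: the checker has already proven exhaustiveness.
        }
        return result;
    }
}

public abstract class IntOrString
{
    private IntOrString() { }

    public sealed class Int_ : IntOrString
    {
        public required nint Value { get; init; }
    }

    public sealed class String_ : IntOrString
    {
        public required string Value { get; init; }
    }
}
```

The arms assign to a local instead of returning directly because the C# compiler does not treat a `switch` statement without `default` as exhaustive for flow analysis.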
A match statement is otherwise emitted at its source position inside the surrounding C# method body and shares the method's local scope. The subject expression is evaluated exactly once at the head of the switch. ## Operator Expressions The unary and binary operator expressions defined in [language/syntax.md](../language/syntax.md) reach the C# target as method calls on the operand's type (see [language/builtins.md](../language/builtins.md)). When the receiver is a built-in primitive type (or an alias whose target resolves to a primitive), the C# target emits the call as the corresponding native operator: - Numeric operands emit `+ - * / % == != < <= > >= & | ^ << >>` directly. - `bool` operands emit `&& || == !=`; `and`/`or`/`xor` lower to `&& || !=` respectively. Unary `not` emits `!`. - `string` operands emit `+` for `add`, the `==` / `!=` operators for `eql`/`neq`, and `String.Compare` ordinal comparison results for `lt`/`lteq`/`gt`/`gteq`. - Unary `plus`, `minus`, and `not` emit `+`, `-`, and `!`. A user product type that declares one of the operator method names continues to lower through the regular method call path: the call site emits `receiver.Method(args)` (C# native overloading handles overload disambiguation). The native-operator rewrite applies only when the receiver's resolved type is a primitive. ### Built-in Field Accesses The fields defined in [language/builtins.md](../language/builtins.md) on built-in primitive and generic types lower to native C# expressions: - `string.length` lowers to a call into the runtime helper `MasterbeltStringLength(receiver)` written into `MasterbeltRuntime.cs`. The helper returns the number of Unicode codepoints in the string by iterating `string.EnumerateRunes()`, matching the spec's codepoint count. The naive `receiver.Length` property is **not** used because it returns the UTF-16 code unit count, which diverges from the spec for any non-BMP codepoint. 
- `list.size` lowers to `receiver.Count` (the `IReadOnlyList.Count` property). - `map.size` lowers to `receiver.Count` (the `IReadOnlyDictionary.Count` property). Only the `string.length` access pulls in the runtime file; the list and map size accesses inline directly. ### Built-in Generic Operator Methods `list.add(other)` and `map.add(other)` defined in [language/builtins.md](../language/builtins.md) lower to a call into a runtime helper that the C# target writes alongside the generated module files. The helper file is named `MasterbeltRuntime.cs` (the reserved `Masterbelt` PascalCase prefix used by any generator-managed file) and is emitted only when at least one generated module references a helper from it. The helper file contributes to the shared partial class declared in the configured namespace, so call sites reach the helpers as bare identifiers without any class qualifier: - `MasterbeltListAdd(IReadOnlyList a, IReadOnlyList b): IReadOnlyList` returns a fresh list containing the elements of `a` followed by the elements of `b`. - `MasterbeltMapAdd(IReadOnlyDictionary a, IReadOnlyDictionary b): IReadOnlyDictionary` returns a fresh dictionary containing every entry of `a` and every entry of `b`, with keys present in both taking the value from `b`. The `K` parameter is constrained `where K : notnull` to satisfy `Dictionary<,>`'s key constraint. Call sites for `list.add` emit `MasterbeltListAdd(a, b)`; call sites for `map.add` emit `MasterbeltMapAdd(a, b)`. C#'s type inference resolves the generic type parameters from the operand types at the call site. ## Function Types A Masterbelt function type lowers to one of two C# forms depending on where it appears. 
A type declaration whose body is a function type emits a `public delegate` at file scope alongside the shared partial class:

```csharp
public delegate int BinaryOp(int left, int right);
public delegate U Mapper<T, U>(T value);
public delegate int Summer(int initial, params int[] values);
```

Parameter names are preserved from source and rendered after the parameter type, matching C#'s `Type name` parameter form. A variadic parameter prefixed with `*` in source lowers to a C# `params T[]` parameter; the element type is the parameter's declared type. The variadic-must-be-last rule defined in [language/types.md](../language/types.md) is enforced before generation.

When a function type appears inline as part of another type expression (a product field's type, a generic argument, a union member, and so on), it lowers to a closed `System.Func<...>` constructed by listing each parameter type followed by the return type. Variadic inline function types have no `Func<>` representation and are rejected with the diagnostic defined for unsupported types; named-delegate form must be used instead.

Effects defined in [Effects](#effects) shape the rendered signature:

- `cancellable` appends a `CancellationToken cancellationToken` parameter to the parameter list.
- `failable` does not change the declared return type. The C# signature renders the success type `R`; the failure path is plumbed by exception propagation at the call site (see [Failable Handling](#failable-handling)).
- `asyncable` wraps the declared return type `R` in `System.Threading.Tasks.Task<R>`. When the named delegate or `Func<...>` form is used, the wrapping applies to the return type only; the C# `async` keyword is not synthesized at the type level (it belongs on the implementation, not the delegate signature).

A function type that combines multiple effects applies every transformation listed above.
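The effect rules above can be modelled as a small signature transformation. The Python sketch below is illustrative only: `render_delegate` and its parameters are hypothetical names, not part of the Masterbelt generator, and it assumes the asyncable wrapper is spelled `Task<R>` (plain `Task` for `void`) with namespaces handled by the using-directive emitter.

```python
def render_delegate(name, return_type, params, effects=()):
    """Render a C# delegate declaration for a Masterbelt function type.

    `params` is a list of (csharp_type, name) pairs; `effects` is any
    subset of {"cancellable", "failable", "asyncable"}.
    Hypothetical helper for illustration, not the generator's real API.
    """
    rendered = [f"{ptype} {pname}" for ptype, pname in params]

    # cancellable: append a CancellationToken parameter at the end.
    if "cancellable" in effects:
        rendered.append("CancellationToken cancellationToken")

    # failable: no signature change; failure travels as an exception.

    # asyncable: wrap the return type only; a delegate carries no `async`.
    if "asyncable" in effects:
        return_type = "Task" if return_type == "void" else f"Task<{return_type}>"

    return f"public delegate {return_type} {name}({', '.join(rendered)});"


print(render_delegate("Fetch", "int", [("string", "key")],
                      {"cancellable", "asyncable"}))
# public delegate Task<int> Fetch(string key, CancellationToken cancellationToken);
```

Combining effects is just applying each rule in turn, which is why the order of the checks does not matter here: one rule touches the parameter list and the other touches the return type.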
## Cross-Module References

A reference to a symbol declared in another Masterbelt module emits the bare C# identifier of the foreign symbol. No class qualifier is added: every module's reachable symbols are members of one shared partial class, so the foreign symbol is already visible in scope at the reference site.

## Re-exports

A `pub { ForeignName as LocalName } from "./other.mst"` declaration that renames the foreign symbol emits a forwarding member on the shared partial class. The forwarding member is a `public static readonly` field whose type matches the foreign symbol's checked type and whose initializer references the foreign symbol by its bare name:

```csharp
public static readonly int LocalName = ForeignName;
```

A re-export that keeps the foreign name (no `as` rename) is a no-op in C#: the foreign symbol is already a member of the shared partial class under the same identifier, so emitting `public static readonly T Foo = Foo;` would be a self-reference. Such re-exports emit no declaration.

Re-exports keep their declaration's doc comment as an XML documentation block immediately before the field.

## Effects

Each effect defined in `codegen/model` maps to C# as follows:

- `cancellable` adds a `CancellationToken cancellationToken` parameter as the last parameter of the callable.
- `failable` does not change the return type. Failure is surfaced through a thrown exception (see [Failable Handling](#failable-handling)); call sites do not catch it so the exception bubbles through the surrounding callable.
- `asyncable` wraps the result in `Task<R>` (or `Task` for `void`) and the callable is declared `async`. A call site whose callee is asyncable is wrapped in `await`.

Every effect is inferred along the call graph: the C# target computes an effective effect set per callable by walking from the declared effects and propagating through every transitive call site to a fixed point.
A function whose effective set differs from its declared set still renders with the inferred shape: a non-asyncable surface declaration whose body calls an asyncable method still becomes an `async Task` method, and a non-cancellable declaration that calls a cancellable method still receives and forwards the `CancellationToken`.

A callable that carries multiple effective effects combines all applicable transformations on its signature. The target reports an explicit diagnostic when a program requires an effect that has no defined mapping rather than emitting code that silently drops the obligation.

## Failable Handling

A `failable` function in C# reports failure through a thrown exception. The Masterbelt surface treats `failable` as transparent (see [language/semantics.md](../language/semantics.md#failable-handling)); the C# target uses native exception propagation so neither the method signature nor the call site needs to surface the failure path:

- `fail "message"` lowers to `throw new System.InvalidOperationException("message");`.
- `fail value` where `value: Error` lowers to `throw new System.InvalidOperationException(value.Message);`.
- A call to a `failable` callable lowers to a plain method invocation; a thrown exception propagates through the surrounding `failable` method because the surrounding method does not catch it.

No `Error` class is emitted by the C# target. Match expressions cannot observe the failure path of a `failable` call subject because the surface type of the call is `R`.

## Using Directives

The C# target's emitter assembles the file's `using` directives from the symbols referenced during rendering. A symbol carries the C# namespace it lives in and the unqualified identifier name. The emitter:

- Aggregates the set of referenced namespaces across every declaration written into a file.
- Emits one `using <namespace>;` directive per distinct referenced namespace, in lexicographic order, before the file's `namespace` directive.
- When two distinct namespaces export the same unqualified name, the second occurrence is emitted as a `using` alias such as `using <alias> = <namespace>.<name>;` so each local identifier is unique.
- Rewrites identifier renderings to use either the unqualified name (when only one namespace exports it) or the chosen alias.

The default local name of an imported symbol is its original name in the namespace. The emitter only assigns aliases when collisions require it.

Targets MUST express external references through this emitter mechanism. Inlining raw namespaces into source bypasses collision handling and is forbidden.

## Determinism

Generated files are deterministic with respect to the input modules and options. Constants appear in source order within each class. Union and wrapper class declarations appear sorted by interface name in `MasterbeltUnions.cs`. Map literal entries appear in lowering order (first-occurrence position, last-wins value). Using directives are emitted in lexicographic order.

# Configuration

Source: https://masterbelt.dev/spec-src/tooling/configuration.md

# Configuration

Project configuration provides user workspace settings to tooling commands.

- The project configuration loader is the single entry point for workspace configuration.
- The loader may load multiple configuration sources over time.

## Project Configuration Files

- Project-local configuration is read from YAML.
- By default, tools look for `masterbelt.yml` and then `masterbelt.yaml` in the working directory.
- When a configuration path is explicitly provided, tools read that file instead of the default paths.
- Relative explicit configuration paths are resolved from the working directory.
- If the explicitly provided configuration file does not exist or cannot be read, configuration loading fails.
- The currently implemented project-local configuration schema is empty.
- Unknown project-local configuration fields are invalid.
- If project-local YAML cannot be parsed, configuration loading fails.
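The lookup order in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the loader's actual code: `resolve_config_path` is a hypothetical name, and returning `None` when neither default file exists is an assumption, since the list does not say what happens in that case.

```python
from pathlib import Path

DEFAULT_NAMES = ("masterbelt.yml", "masterbelt.yaml")


def resolve_config_path(working_dir, explicit=None):
    """Return the configuration file to load, or None when none applies.

    Hypothetical helper: the name and the None-for-missing-default
    behaviour are assumptions made for this sketch.
    """
    working_dir = Path(working_dir)

    if explicit is not None:
        # Relative explicit paths resolve from the working directory;
        # joining with an absolute path yields that absolute path.
        path = working_dir / explicit
        if not path.is_file():
            # A missing explicit file is a loading failure.
            raise FileNotFoundError(f"configuration file not found: {path}")
        return path

    # Default lookup: masterbelt.yml first, then masterbelt.yaml.
    for name in DEFAULT_NAMES:
        candidate = working_dir / name
        if candidate.is_file():
            return candidate
    return None
```

An explicit path short-circuits the default lookup entirely, so a project that has both `masterbelt.yml` and an explicitly selected file only ever reads the explicit one.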
## Code Generation Targets - `targets: LIST` declares the code generation targets configured for the project. - An absent `targets` key, or an empty list, means no targets are configured. - Each list entry is a mapping with the following keys: - `kind: STRING` selects which generation behavior is used. The kind identifier is defined by each target's own specification. - `out: PATH` is the output root directory. Generated file paths are interpreted relative to this directory. Relative `out` paths are resolved from the project root. `out` is required. - `options: MAP` is a free-form mapping interpreted only by the target whose kind it carries. The shape of the options mapping is defined by each target's own specification. `options` is optional. - Multiple entries with the same `kind` are allowed; each is configured and runs independently. - A target entry with an empty or missing `kind` is invalid. - A target entry with an empty or missing `out` is invalid. ## Entrypoint - `entry: PATH` sets the project entrypoint file. - Relative entrypoint paths are resolved from the working directory. - Project configuration loading resolves the entrypoint path but does not validate its existence, file type, or extension. - Source graph construction validates that the resolved entrypoint is an existing Masterbelt source file. - Tools that build a Masterbelt source graph require an entrypoint file. - Tools that operate only on explicit source inputs, such as `fmt`, do not require an entrypoint file. ## Validators - `validators: MAP` configures the failure severity of master [validation](../masterdata/validation.md) rules. - The outer key is the **entrypoint-visible master path** — the master as it is visible from the entry module. A top-level master uses its declared name (for example `Records`); a nested master uses its dotted path (for example `User.Friendships`); an aliased re-export `pub { User as U }` uses the alias (for example `U.Friendships`). 
The flattened codegen name (for example `UserFriendships`) is never used here.
- The inner key is a validator id declared in that master's `validation` section.
- The value is the override severity, either `error` or `warning`; no other value is accepted.

```yaml
validators:
  Records:
    nameRequired: warning
    valuePositive: error
  U.Friendships:
    uniquePair: warning
```

A failed `assert` defaults to `error` severity; a `validators` override can lower it to `warning`. An `error`-severity failure blocks the export and no artifact is written; a `warning`-severity failure is reported but does not block.

Configuration is validated before validators run, so a typo is visible even when a master imported zero records:

- A master path that matches no master is `masterbelt.validation.config_unknown_master`.
- A validator id that matches no rule under a known master is `masterbelt.validation.config_unknown_validator`.
- A severity outside `error` / `warning` is `masterbelt.validation.config_invalid_severity`.

Each of these config diagnostics blocks the export. The `validators` mapping is parsed under the same strict YAML rules as the rest of project configuration, so unknown top-level keys are still rejected.

## Path-Specific Configuration

- The currently implemented path-specific configuration source is EditorConfig.

### Discovery

- Tools discover EditorConfig files by walking from the input file directory toward the filesystem root.
- For standard input, tools walk from the current working directory.
- Discovery stops after reading an EditorConfig file with `root = true`.

### Formatter Indentation

Formatter indentation is resolved from the matching EditorConfig properties for the specific input or output path.

- Configuration is path-specific. Settings for one file pattern, such as `*.mst`, must not be reused for another file pattern, such as `*.ts`.
- If `indent_style = tab`, one formatter indentation level is one tab.
- If `indent_style = space`, one formatter indentation level is `indent_size` spaces. - If `indent_size` is unset, `tab`, or not a positive decimal integer, the formatter default indentation is used. - If no matching EditorConfig indentation is found, the formatter default indentation is used. # CLI Source: https://masterbelt.dev/spec-src/tooling/cli.md # CLI The `masterbelt` command is the user-facing command line entry point. Commands must read project configuration before running behavior that depends on user project settings. Global options: - `-c PATH` and `--config PATH` explicitly select the project configuration file. - `--reporter FORMAT` selects the diagnostic reporter. - Supported formats are `text` and `json`. - `text` is the default and writes diagnostics to standard error. - `json` writes diagnostics to standard output by default. - JSON diagnostics can mix with command output when a command also writes primary output to standard output. - For machine-readable diagnostic output, use `json` with commands or modes that do not also emit primary output to standard output, such as `fmt --check` and `fmt --write`. - Reporter shorthand: - `--text` is shorthand for `--reporter text`. - `--json` is shorthand for `--reporter json`. - `--text` and `--json` are mutually exclusive. - Reporter shorthand options conflict with an explicit `--reporter` value that selects a different format. ### Diagnostic Output - Text diagnostics are written to standard error. - JSON diagnostics are written to standard output. ## `fmt` `masterbelt fmt` formats Masterbelt source files. ### Input - Formatter inputs must be valid Masterbelt source text, including the language UTF-8 source text requirement. - Invalid UTF-8 input fails formatting. - Empty input is considered already formatted. - When file paths are provided, the command formats each file independently. 
- When no file path is provided, the command reads one source file from standard input and writes formatted source to standard output. ### Modes - Without `--check` or `--write`, at most one file path may be provided. - Formatting multiple files to standard output is invalid because the output would not preserve file boundaries. - `--check` reports whether each input is already formatted. - `--check` applies to file inputs and to standard input. - `--check` must not write formatted source to files or standard output. - `--check` fails when any input would change. - `--write` updates file inputs in place. - `--write` cannot be used when reading from standard input. - `--check` and `--write` are mutually exclusive. - The default mode writes formatted source to standard output. - `--check` and `--write` suppress formatted source output. ### Configuration - The formatter indentation option is resolved from project configuration for each file input. - Standard input uses configuration discovered from the current working directory. ## `codegen` `masterbelt codegen` runs every code generation target configured in the project's `masterbelt.yml`. ### Input - The command reads the project configuration to discover the entrypoint and the configured target list. - An entrypoint is required (see `tooling/configuration`); the command fails when `entry:` is missing. - The command analyzes and lowers the entrypoint into the IR program model that targets consume. Future multi-file support follows once imports are implemented. ### Modes - The command runs every configured target by default. - The command does not read from standard input and does not write to standard output. Generated source is written to disk under each target's configured `out` directory. - Each invocation overwrites any existing file at the same output path; the command does not perform incremental updates and does not delete files left over from previous runs. 
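The write behaviour of `codegen` (overwrite in place, create parent directories as the Output rules require, never delete leftovers) can be sketched as follows. This is a hedged Python illustration: `write_generated` and its `files` mapping are hypothetical, and sorting the paths is a choice made for this sketch rather than documented behaviour.

```python
from pathlib import Path


def write_generated(out_root, files):
    """Write generated sources under a target's `out` directory.

    `files` maps output-relative paths to file contents. Hypothetical
    helper for illustration; not the real codegen writer.
    """
    out_root = Path(out_root)
    written = []
    # Sorted only to make this sketch's write order predictable.
    for relative, content in sorted(files.items()):
        path = out_root / relative
        # Parent directories are created as needed.
        path.parent.mkdir(parents=True, exist_ok=True)
        # An existing file at the same path is overwritten in full.
        path.write_text(content, encoding="utf-8")
        written.append(path)
    # Nothing is deleted: files from earlier runs that are not in
    # `files` stay on disk.
    return written
```

Because nothing is removed, renaming a module between runs leaves the previously generated file behind; cleaning the `out` directory is left to the user or the build system.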
### Output - Output files are written into each target's `out` directory, creating parent directories as needed. - The text reporter is silent on success. - The JSON reporter writes an empty diagnostic envelope on success. ### Configuration - A project configured with no targets is a no-op: the command exits with code 0 and writes no files. - A target whose `kind` does not match any registered code generation target produces a diagnostic and the command fails. ## `export` `masterbelt export` imports the project's master data, validates it, and writes the configured export artifacts. ### Validation - Export imports each master's source records and applies each master's [filter](../masterdata/schema.md#filter-section), then runs each master's [validation](../masterdata/validation.md) rules over the post-filter records before writing any artifact. - An `error`-severity validation failure (a failed `assert` whose resolved severity is `error`) blocks the entire export: no file is written. A `warning`-severity failure is reported but does not block the export. - A validator-configuration error (`masterbelt.validation.config_unknown_master`, `masterbelt.validation.config_unknown_validator`, or `masterbelt.validation.config_invalid_severity`) is detected before validators run and likewise blocks the entire export. - An import error blocks the export as well; export proceeds to validation and writing only when import succeeds. ### SQLite Index Inference When the SQLite exporter is configured, a master's [`indexed scope`](../masterdata/schema.md#indexed-scopes) declarations drive secondary-index generation, defined in [export-sqlite.md](../masterdata/export-sqlite.md#secondary-indexes-from-indexed-scopes). Generating an index reports `masterbelt.scope.index_generated` (info) and a scope that cannot be fully inferred reports `masterbelt.scope.index_inference_failed` (warning); neither severity blocks the export. ## Exit Codes - 0: the requested operation succeeds. 
- 1: execution fails after command-line arguments are accepted, including diagnostics, unreadable inputs, unwritable outputs, invalid project configuration, or `fmt --check` detecting inputs that would change. - 2: command-line arguments are invalid. # Formatter Source: https://masterbelt.dev/spec-src/tooling/formatter.md # Formatter This document defines the currently implemented Masterbelt formatting behavior. The formatter is intentionally minimal at this stage. Future syntax additions must extend this document before or together with formatter changes. ## Options The formatter accepts an indentation string option. The indentation string is used for one indentation level inside grouped declarations. Callers should derive this value from the user's editor configuration when available. If no indentation string is provided, the formatter uses two spaces. ## Source Files The formatter parses source text and emits top-level declarations, statements, and comments in source order. Top-level items are separated by one blank line. The formatted output ends with one trailing line feed. ## Comments Line comments and block comments are preserved. Line comments and block comments that appear as top-level items are emitted on their own lines in source order. Consecutive top-level line comments and block comments are kept together without blank lines between them. A top-level line comment or block comment immediately followed by a declaration or statement is kept adjacent to that declaration or statement, with no blank line inserted between the comment and the following item. Line comments that appear after a declaration item or expression statement on the same source line are preserved as trailing line comments on the formatted line. Inside grouped const declarations, line comments and block comments are indented by one indentation level. 
## Documentation Comments

Documentation comments are emitted immediately before the declaration, statement, or grouped const item they document. The formatter preserves the documentation comment text after the leading `///`. Top-level documentation comments are not indented. Grouped const item documentation comments are indented by one indentation level.

## Const Declarations

A single-item const declaration is formatted on one line:

```mst
const A = 1
const A: int = 1
pub const A: bool = true
```

A const declaration with more than one item is formatted as a group:

```mst
pub const (
  A = 1
  B: string = "x"
)
```

Grouped const items are indented by one indentation level. The outer visibility modifier applies to the group and is emitted before `const`.

## Expression Statements

Expression statements are formatted as their expression.

## Master Validation Section

The `validation` section of a master ([validation.md](../masterdata/validation.md)) is formatted as a brace-delimited section. When a master documents a canonical section order, the `validation` section follows `filter` and precedes `static`, matching the grammar order.

- The `each` and `all` scope groups each indent one level deeper than the `validation` section. Consecutive groups are separated by exactly one blank line.
- Within a group, each `validate { ... }` block indents one level deeper than the group. Consecutive `validate` blocks are separated by exactly one blank line.
- An `assert <expression>` statement is formatted like any other statement. Rule bodies reuse the existing statement and expression formatting.
```mst
master Records {
  record {
    primary ID: int,
    Name: string,
    Value: int
  }

  validation {
    each {
      validate nameRequired {
        assert row.Name != ""
      }

      validate valuePositive {
        assert row.Value > 0
      }
    }

    all {
      validate checkValueSum {
        let total = 0
        for row in table {
          total = total + row.Value
        }
        assert total < 1000
      }
    }
  }
}
```

## Master Scope Section

A master [scope](../masterdata/schema.md#scope-section) declaration is formatted as `[pub ][indexed ]scope name(params) body` with canonical spacing. The two modifiers always render in `pub indexed scope` order regardless of their surface spelling.

- A block-body scope renders its body one level deeper than the declaration, reusing the function-block statement and expression formatting; the `return` and any chained method calls follow the existing expression style.
- An arrow-body scope renders `=> expression` on the declaration line.
- Each scope declaration sits at the master-body indent and is separated from neighbouring sections by exactly one blank line, the same as the other sections.

```mst
master Records {
  record {
    primary id: int,
    age: int,
    gender: int
  }

  scope adult() {
    return self.where(fn(row) => row.age.ge(20))
  }

  pub scope genderedAdult(gender: int) {
    return self.adult().gendered(gender)
  }

  indexed scope youngest() => self.orderBy(fn(row) => row.age.asc())
}
```

## Literals

Null and boolean literals are emitted as `null`, `true`, and `false`. Integer literals preserve their parsed source spelling. String literals are emitted from their decoded value using the supported escape sequences.

# Linter

Source: https://masterbelt.dev/spec-src/tooling/linter.md

# Linter

This document will define lint behavior.

# Language Server Protocol

Source: https://masterbelt.dev/spec-src/tooling/lsp.md

# Language Server Protocol

This document will define LSP behavior.
# Syntax Highlighting Source: https://masterbelt.dev/spec-src/tooling/highlighting.md # Syntax Highlighting This document describes the currently implemented Masterbelt syntax highlighting captures. Highlighting covers every keyword and the language's declarations, bindings, match patterns, type expressions, operators, literals, and comments. Future captures must extend this document before or together with query changes. ## Captures Keywords: - `visibility_modifier` is captured as `@keyword`. - The `const` keyword is captured as `@keyword`. - The `type` keyword is captured as `@keyword`. - The `use` keyword is captured as `@keyword`. - The `from` keyword is captured as `@keyword`. - The `as` keyword is captured as `@keyword`. - `field_modifier` (`readonly`, `writable`, `primary`) is captured as `@keyword.modifier`. - `effect_modifier` (`asyncable`, `failable`, `cancellable`) is captured as `@keyword.modifier`. - The `fn` keyword is captured as `@keyword`. - The `enum` keyword is captured as `@keyword`. - The `return` keyword is captured as `@keyword`. - The `master`, `record`, and `source` keywords are captured as `@keyword`. - The `filter`, `include`, and `exclude` keywords are captured as `@keyword`. - The `validation`, `each`, `all`, `validate`, and `assert` keywords are captured as `@keyword`. - The `static` and `select` keywords are captured as `@keyword`. - The `scope` context keyword is captured as `@keyword`; the `indexed` context keyword is captured as `@keyword.modifier`. - The `if`, `else`, `match`, `for`, `in`, `let`, and `fail` keywords are captured as `@keyword`, as are the `break` and `continue` statements. Definitions and identifiers: - Const item names are captured as `@constant`. - Type declaration names are captured as `@type.definition`. - Type parameter names on a generic type declaration are captured as `@type.parameter`. - Product type field names are captured as `@property`. - Product literal field names are captured as `@property`. 
- The type prefix of a typed product literal is captured as `@type`. - Function type parameter names are captured as `@variable.parameter`. - Enum declaration names are captured as `@type.definition`. - Enum variant names are captured as `@constant`. - In a member access expression `Target.Member`, the target identifier is captured as `@type` and the member identifier is captured as `@constant`. - Top-level function declaration names are captured as `@function`. - Method names declared inside a product type are captured as `@function.method`. - Master declaration names are captured as `@type.definition`. - The `validate` rule name of a master validation rule is captured as `@function`. - The scope name of a master scope declaration is captured as `@function`. - The projection name of a master select section is captured as `@type.definition`. - The source-kind identifier of a master source entry is captured as `@type`. - Master source option names are captured as `@property`. - Local `const` binding names are captured as `@constant`; `let` binding names, assignment targets, and `for` loop binding names are captured as `@variable`. Match patterns: - An enum pattern `Target.Variant` captures the target identifier as `@type` and the variant identifier as `@constant`. - A product pattern's type identifier is captured as `@type`, its field names as `@property`, and its `as` binding as `@variable`. - A type pattern's `as` binding is captured as `@variable`. - The wildcard pattern `_` is captured as `@variable.builtin`. Operators: - A unary or binary expression's operator token is captured as `@operator`. The capture uses the expression's `operator` field, so the `<` `>` of a generic type application and the `|` of a union type — which are the same tokens in type position — are not coloured as operators. - The `=>` arrow token and the variadic `*` prefix are captured as `@operator`. Type expressions: - `named_type` is captured as `@type`. 
- `reserved_type_identifier` is captured as `@type.builtin`. - The constructor identifier of `generic_type` is captured as `@type`. Literals and comments: - `null_literal` is captured as `@constant.builtin`. - `bool_literal` is captured as `@boolean`. - `integer_literal` is captured as `@number`. - `string_literal` is captured as `@string`. - `line_comment` is captured as `@comment`. - `block_comment` is captured as `@comment`. - `doc_comment` is captured as `@comment.documentation`. # Symbol Tags Source: https://masterbelt.dev/spec-src/tooling/tags.md # Symbol Tags This document describes the currently implemented Masterbelt symbol tagging behavior for tree-sitter tags. Symbol tagging is intentionally minimal at this stage. Future tag captures must extend this document before or together with query changes. ## Captures Const item names are captured as `@name`. Const items are captured as `@definition.constant`. Type declaration names are captured as `@name`. Type declarations are captured as `@definition.type`. Type parameter names on a generic type declaration are captured as `@name`. Type parameters are captured as `@definition.type.parameter`. Product type field names are captured as `@name`. Product type fields are captured as `@definition.field`. The `readonly`, `writable`, or `primary` modifier on a field, when present, is not part of the tagged span; it remains a syntactic modifier rather than a separately tagged declaration. Enum declaration names are captured as `@name`. Enum declarations are captured as `@definition.type`. Enum variant names are captured as `@name`. Enum variants are captured as `@definition.constant`. Function declaration names are captured as `@name`. Function declarations are captured as `@definition.function`. Method names declared inside a product type are captured as `@name`. Methods are captured as `@definition.method`. Master declaration names are captured as `@name`. Master declarations are captured as `@definition.type`. 
Fields and methods inside the master's record section are tagged through the existing product type field and method rules. Const and function declarations inside the master's `static` section are tagged through the existing const-item and function-declaration rules. Master select-section projection names are captured as `@name`. Select sections are captured as `@definition.type`. Master scope names are captured as `@name`. Scope declarations are captured as `@definition.method`, since a scope surfaces as a method on the master's relation. # Compatibility Source: https://masterbelt.dev/spec-src/compatibility.md # Compatibility This document will define compatibility, versioning, and deprecation policy. ## Reserved Keywords - The master [validation](masterdata/validation.md) feature reserves `validation`, `each`, `all`, `validate`, and `assert` as keywords. A program that used any of these as an identifier must rename it. The implicit validation bindings `row` and `table` are not reserved and remain usable as ordinary identifiers.