This guide shows how to write tests for Strata dialects in the StrataTest/
directory. It focuses on Laurel (the path that received the most recent
ergonomic work) and notes where Boole uses the same primitives. For Python
test wiring, see StrataTest/Languages/Python/README.md.
Quick rule of thumb: put the dialect program inside a
#stratablock and calltestLaurel(ortestLaurelResolution) on it. If the program is expected to fail, annotate the expected diagnostics inline with// ^^^ kind: message. The helper auto-detects whether to expect a clean run or to match annotations.
#strata <dialect>; … #endis a term-elaboration block that parses the inner Strata program at Lean compile time and elaborates to aStrata.SourcedProgram(the parsedProgramplus the snippet text and the byte/line offset of the snippet inside the surrounding.leanfile). Parse errors surface as Lean errors, so the LSP highlights them inside the block while you edit.- A
Coe SourcedProgram Programinstance lets the result of#strataflow into APIs that expect a plainProgram(e.g.Strata.Boole.verify); when consumers needProgramdirectly, just write.program. #guard_msgsis Lean's built-in golden-test command. Used here for positive tests — wrap an#evalin#guard_msgs inand pin the expected output via a/-- info: … -/docstring directly above. Negative tests use inline annotations instead (see recipe 2).
Diagnostic positions in inline annotations are snippet-local: lines are
1-indexed within the #strata block, columns are 0-indexed. The helper
converts file-global pipeline diagnostics back to snippet-local positions
before matching, so annotations don't drift when the surrounding .lean file
is edited.
Use testLaurel. The helper throws if any diagnostics fire, so no
#guard_msgs / docstring is needed:
import StrataTest.Util.TestLaurel
open StrataTest.Util
#eval testLaurel
#strata
program Laurel;
procedure foo() opaque { assert true };
#endSame helper, testLaurel. Annotate each expected diagnostic inline, on the
line directly below the offending source line, with carets pointing at the
column range and the kind + (substring of the) message after the carets:
#eval testLaurel
#strata
program Laurel;
procedure unsafeDivision(x: int)
opaque
{
var z: int := 10 / x
//^^^^^^^^^^^^^^^^^^^^ error: assertion does not hold
};
#endtestLaurel auto-detects: when annotations are present, it requires an
exact match on everything except the message text — every diagnostic
annotated, every annotation fires, positions (line + column range) agree,
kind matches — and the annotation's message must appear as a substring
of the actual diagnostic message. When no annotations are present, it expects
no diagnostics. A mismatch throws with a precise summary of which annotations
went unmatched and which actual diagnostics had no annotation, so
#guard_msgs isn't needed for negative tests.
The message is matched as a substring (not a prefix, and not exact) on purpose: real diagnostic messages often carry volatile detail — variable names with unique-id suffixes, full type descriptions, fix suggestions — that you don't want to pin verbatim. Annotate the stable, identifying fragment of the message and the test stays robust as messages are reworded.
kind is one of error, warning, not-yet-implemented, strata-bug.
When the test is about resolution (kind mismatches, duplicate names, scope
errors), running the verifier on a deliberately-broken program adds
unrelated noise (dbg_trace warnings, vacuous VCs). Use
testLaurelResolution — same auto-detect behavior as testLaurel, but
stops after resolve.
The gap between the two is not just the SMT call: testLaurel runs the full
compilation pipeline, which after resolve applies roughly a dozen
Laurel-to-Laurel lowering passes (type-alias elimination, heap
parameterization, type-hierarchy and modifies-clause transforms, hole-type
inference and elimination, short-circuit desugaring, imperative-expression
lifting, constrained-type elimination, …) before translating to Core and
verifying. Several of those passes re-run resolve on their output. So a
diagnostic that only testLaurel surfaces may originate from a later pass,
not the verifier. The major passes are described in the Translation
Pipeline section of the Laurel language manual — published at
strata-org.github.io/Strata,
source in docs/verso/LaurelDoc.lean — and the full
pass list and exact ordering live in
Strata/Languages/Laurel/LaurelCompilationPipeline.lean
(laurelPipeline).
#eval testLaurelResolution
#strata
program Laurel;
procedure foo() opaque {
var x: int := 1;
var y: x := 2
// ^ error: 'x' resolves to variable, but expected composite type, ...
};
#end/-! Shadowing in nested blocks is OK -/
#eval testLaurelResolution
#strata
program Laurel;
procedure foo() opaque {
var x: int := 1;
{ var x: int := 2 }
};
#endFor tests that exercise a specific transform (e.g. eliminateHoles,
liftExpressionAssignments, constrainedTypeElim), build a small per-file
helper that takes a parsed Strata.Program and runs the stages you care about.
translateLaurel does the parse → translate step so you don't have to repeat
it.
import StrataTest.Util.TestLaurel
import Strata.Languages.Laurel.InferHoleTypes
import Strata.Languages.Laurel.EliminateHoles
open Strata
open StrataTest.Util
namespace Strata.Laurel
private def parseElimAndPrint (program : Strata.Program) : IO Unit := do
let laurelProgram ← translateLaurel program
let result := resolve laurelProgram
let (laurelProgram, _, _) := inferHoleTypes result.model result.program
let (laurelProgram, _) := eliminateHoles laurelProgram
for proc in laurelProgram.staticProcedures do
IO.println (toString (Std.Format.pretty (Std.ToFormat.format proc)))
/--
info: function $hole_0()
returns ($result: int)
opaque;
procedure test()
{ var x: int := 1 + $hole_0() };
-/
#guard_msgs in
#eval parseElimAndPrint
#strata
program Laurel;
procedure test() { var x: int := 1 + <?> };
#end
end Strata.LaurelThe pattern: translateLaurel : Strata.Program → IO Laurel.Program is the
common entry point; everything after is dialect/test-specific.
The rule here is about independence, not about positive-vs-negative. Every negative test still contains plenty of correct code — the procedure that does verify so the broken one can be reached, the declarations the failing statement refers to, and so on. That mix inside a single block is expected and fine.
What matters is the boundary between blocks: give each independently
checkable unit its own #strata block, so a regression in one can't mask a
regression in another and a failure points at the smallest reproducing
snippet.
/-! ### Safe paths verify cleanly -/
#eval testLaurel #strata
program Laurel;
procedure safeDivision() opaque {
var z: int := 10 / 2;
assert z == 5
};
#end
/-! ### Unsafe division: divisor not constrained, fails verification -/
#eval testLaurel #strata
program Laurel;
procedure unsafeDivision(x: int) opaque {
var z: int := 10 / x
//^^^^^^^^^^^^^^^^^^^^ error: assertion does not hold
};
#endThe exception — and the more important point: when the units genuinely
depend on each other (one procedure calls another, or they share a type or
declaration), they must live in the same #strata block, because
resolution needs them together. In that case, list the union of expected
diagnostics across all the contained procedures in one block.
- Write the
#stratablock first, prefaced with#eval testLaurel. - For negative tests, sketch placeholder annotations like
// ^ error:below the offending lines — column positions don't have to be right yet. - Run it:
lake env lean StrataTest/Languages/Laurel/<your_file>.lean. - On failure: the helper prints exactly which annotations went unmatched and which diagnostics had no annotation, including the line/column range actually produced. Copy those into your annotations, save, re-run.
When lake env lean exits silently with no output, every assertion in the
file held.
There is no auto-update ("bless") mode yet that rewrites the inline annotations to match the actual diagnostics. For now step 4 is a manual copy from the failure report; adding a bless mode is a natural follow-up.
- Boole uses
#strata program Boole; … #enddirectly with hand-written helpers (e.g.Strata.Boole.verify); seeStrataTest/Languages/Boole/find_max.leanfor the canonical pattern.#stratais dialect-agnostic — Boole could use inline annotations the same way once a Boole-side helper analogous totestLaurelis written. - Python has its own pipeline-driven harness; see
StrataTest/Languages/Python/README.md.
| Concept | File |
|---|---|
#strata elaborator + SourcedProgram |
Strata/DDM/Integration/Lean/HashCommands.lean |
Diagnostic data type |
Strata/Languages/Core/Verifier.lean |
| Laurel test helpers | StrataTest/Util/TestLaurel.lean |
| Examples | StrataTest/Languages/Laurel/Examples/**/*.lean |