Skip to content

Add -fhoist-locals, -finitialize-locals#721

Open
tahina-pro wants to merge 9 commits intoFStarLang:masterfrom
tahina-pro:_taramana_hoist_locals
Open

Add -fhoist-locals, -finitialize-locals#721
tahina-pro wants to merge 9 commits intoFStarLang:masterfrom
tahina-pro:_taramana_hoist_locals

Conversation

@tahina-pro
Copy link
Copy Markdown
Member

tl;dr: This PR adds two passes, along with corresponding tests:

  • -fhoist-locals: hoist local variables to the beginning of functions
  • -finitialize-locals: add initializers to uninitialized local variables

You can find an example use of -fhoist-locals in my EverParse fork: tahina-pro/quackyducky@61a825f

Please do not rebase this branch to a commit that is not on everparse-fstar1, which is the Karamel branch (an ancestor of master) that EverParse master is currently using.

-fhoist-locals

This option hoists local variables to the beginning of functions:

  • Declarations already present at the beginning of functions are unchanged. Elsewhere:
  • Declarations without initializers are hoisted to the beginning of functions
  • Declarations with initializers are split into two parts: a declaration without initializer hoisted at the beginning of functions, and initializers become assignments:
    • local arrays with non-constant size, or with per-element initializers make hoisting stop: Karamel emits a warning, by default as an error
    • local arrays with a single initializer (EBufCreate not Any) are split into a declaration without initializer and a EBufFill operation

-finitialize-locals

This option adds initializers to uninitialized local variables.

It takes an argument:

  • no: do not initialize uninitialized local variables. This is the default.
  • c23: initialize uninitialized local variables with = {}. This is enough to initialize everything without descending into nested structs, etc.
  • c99: deep-initialize structs, using a field designator (.field =) to initialize each struct field. This option descends into nested structs, and initializes each integer with 0 and each pointer with NULL.
  • c89: same as c99, but without the field designator syntax

c99 and c89 initialize unions by initializing their first field, which is the only possible way to initialize a union in C89.

This option emits a warning, by default an error, if it encounters an unknown (abstract) type.

Other than for integer fields, this option does not rely on the old GCC = {0} behavior, which has been removed in recent (>= 14.1) versions of GCC.

tahina-pro and others added 8 commits April 2, 2026 19:36
New pass HoistLocals.ml, controlled by -fhoist-locals, hoists all local
variable declarations to the beginning of the function body:

- Declarations in the "declaration zone" (before any real statements)
  keep their initializers.
- Declarations after real statements get separated: the declaration
  (with EAny init) goes to the top, and an assignment replaces it at
  the original location.
- Buffer creations (EBufCreate/EBufCreateL) are left in place since
  later passes require them under their let-bindings.
- VLA (non-constant size stack arrays) that would need to cross real
  statements trigger warning 29 (fatal by default) and are left in
  place.

The pass runs before mark_possibly_unused in the pipeline.

Test suite in test/hoist-locals/ with 5 test modules:
- HoistInt: machine integer locals
- HoistArray: constant-size array locals
- HoistScope: local scopes with if/else
- HoistVlaNoWarn: VLAs already at the top (no warning)
- HoistVlaWarn: VLAs after statements (exercises warning 29)

Each test runs with and without -fhoist-locals and verifies identical
output.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Cherry-picked from 40c7257 (without tests).

Rewrite -finitialize-locals to take an argument: no (default), c23,
c99, or c89 (case-insensitive).

- c23: uses = {} (C23 empty initializer)
- c99: uses designated initializers
- c89: uses positional initializers

Warning 30 (InitializerUnknownType) for unknown types, fatal by default.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Test suite in test/initialize-locals/ with standalone Makefile.
InitLocals.fst tests scalar, shadowed, branching, boolean, and struct
locals. Each test is compiled and executed with c23, c99, and c89 modes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
In the -fhoist-locals phase, comments in the declaration zone are now
hoisted alongside local variable declarations so that they maintain
their position relative to initialised declarations.  Previously, a
standalone comment could be swapped past a declaration with an
initializer because comments stayed in the residual body while
declarations were collected into the hoisted list.

Additionally, recurse into EIfThenElse, ESwitch, EWhile, EFor, and
EMatch bodies to hoist local variable declarations from inside those
blocks up to the function level, replacing each initialiser with an
assignment at the original location.

Buffer creations (EBufCreate) with constant size are split: the
declaration is hoisted with EAny as the initializer, and an EBufFill
is emitted at the original location (CStarToC11 renders it as memset
or a for-loop).  Likewise, EBufCreateL is split into a declaration
with all-EAny elements and individual EBufWrite statements at the
original location.  Buffer creations with non-constant size (VLA)
emit warning 29 and are left in place.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When -fhoist-locals is enabled, local variable declarations are hoisted
to the function top. If a function contains #if/#else branches (from
IfDef-flagged externals), some hoisted variables may only be used in one
branch, triggering -Wunused-variable in the other.

Add a C11-level pass (MarkMaybeUnused) that detects variables not used
in all branches of an IfDef and inserts KRML_MAYBE_UNUSED_VAR() calls
after the declarations. This uses the existing KRML_MAYBE_UNUSED_VAR
macro which is available in both minimal and full krmllib headers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instead of inserting KRML_MAYBE_UNUSED_VAR calls outside the IfDef
(between declarations and the #if), insert them at the top of each
branch for the variables that branch does not reference.

Before:
  uint32_t a; uint32_t b;
  KRML_MAYBE_UNUSED_VAR(a);
  KRML_MAYBE_UNUSED_VAR(b);
  #if USE_FEATURE
    a = ...; return a;
  #else
    b = ...; return b;
  #endif

After:
  uint32_t a; uint32_t b;
  #if USE_FEATURE
    KRML_MAYBE_UNUSED_VAR(b);
    a = ...; return a;
  #else
    KRML_MAYBE_UNUSED_VAR(a);
    b = ...; return b;
  #endif

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@tahina-pro tahina-pro requested review from nikswamy and protz April 6, 2026 15:01
@protz
Copy link
Copy Markdown
Collaborator

protz commented Apr 6, 2026

Thanks Tahina, I will take a look later this week.

Please clarify how much AI was used for this PR to help me adjust my level of scrutiny. Thanks.

@tahina-pro
Copy link
Copy Markdown
Member Author

Sorry for my late reply, I am just back from vacation.
This PR is AI co-authored (the commit message is tagged as such).
I have been testing this PR with EverParse and it seems to work well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants