.vas pseudocode -> VAS -> x86-64 NASM assembly (.asm) -> nasm + ld / gcc -> executable
VAS (Virtual Assembler) is a lightweight text-replacement translator. It reads pseudo-instructions that use virtual registers (v0-v12) and outputs standard x86-64 NASM assembly.
It does not perform register allocation, instruction scheduling, or linking. Its sole purpose is to turn educational/prototype pseudo-code into NASM-assemblable source.
; hello.vas -- print "hello world" via Linux write syscall
default rel
section .data
msg db 'Hello, World from VAS!', 10
msglen equ $ - msg
section .text
global _start
_start:
MOVI v5, 1 ; rdi = 1 (stdout)
LEA v4, [msg] ; rsi = msg address
MOVI v3, msglen ; rdx = message length
MOVI v0, 1 ; rax = 1 (sys_write)
SYSCALL
MOVI v5, 0 ; rdi = exit code 0
MOVI v0, 60 ; rax = 60 (sys_exit)
SYSCALL

vas -o hello.asm hello.vas

Generated hello.asm:
; hello.vas -- print "hello world" via Linux write syscall
default rel
section .data
msg db 'Hello, World from VAS!', 10
msglen equ $ - msg
section .text
global _start
_start:
mov rdi, 1
lea rsi, [msg]
mov rdx, msglen
mov rax, 1
syscall
mov rdi, 0
mov rax, 60
syscall

Choose your platform:
Linux / WSL (nasm + ld):
nasm -f elf64 -o hello.o hello.asm
ld -o hello hello.o
./hello

Windows (nasm + ld):
vas -target win64 hello.vas -o hello.asm
nasm -f win64 -o hello.obj hello.asm
ld -e main -o hello.exe hello.obj
hello.exe

Tip: If your code doesn't define its own section .text, VAS automatically wraps it in a runnable skeleton. See Standalone Mode for details.
| Virtual Register | Physical Register |
|---|---|
| v0 | rax |
| v1 | rbx |
| v2 | rcx |
| v3 | rdx |
| v4 | rsi |
| v5 | rdi |
| v6 | r8 |
| v7 | r9 |
| v8 | r11 |
| v9 | r12 |
| v10 | r13 |
| v11 | r14 |
| v12 | r15 |
Virtual registers can be used in any operand position, including memory addressing (e.g. [v0+8], [v1+v2*8]), and are automatically replaced during translation.
For example:
; VAS input
ADD v1, [v0+8], v2
; NASM output
mov rbx, [rax+8]
add rbx, rcx

| Pseudo-instruction | Operands | Expansion | Notes |
|---|---|---|---|
| ADD | dst, src1, src2 | mov dst, src1 then add dst, src2 | Three-operand addition |
| ADD | dst, src | add dst, src | Two-operand addition |
| SUB | dst, src1, src2 | mov dst, src1 then sub dst, src2 | Three-operand subtraction |
| SUB | dst, src | sub dst, src | Two-operand subtraction |
| MUL | dst, src1, src2 | mov dst, src1 then imul dst, src2 | Three-operand multiplication |
| MUL | dst, src | imul dst, src | Two-operand multiplication |
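
The expansion rules in the table above can be modeled as a small rewrite function. This is a Python sketch for illustration only, not VAS's Go implementation; `expand_arith` and `OPCODES` are hypothetical names, and the sketch assumes operands are already split and virtual registers already substituted.

```python
# Hypothetical model of the two- and three-operand arithmetic rules.
OPCODES = {"ADD": "add", "SUB": "sub", "MUL": "imul"}

def expand_arith(op, operands):
    """Return the NASM lines for one arithmetic pseudo-instruction."""
    mnemonic = OPCODES[op.upper()]
    if len(operands) == 2:
        dst, src = operands
        return [f"{mnemonic} {dst}, {src}"]
    if len(operands) == 3:
        dst, src1, src2 = operands
        # Copy the first source into dst, then apply the operation in place.
        return [f"mov {dst}, {src1}", f"{mnemonic} {dst}, {src2}"]
    raise ValueError(f"{op}: expected 2 or 3 operands, got {len(operands)}")

print(expand_arith("ADD", ["rbx", "[rax+8]", "rcx"]))
```

One consequence of the documented expansion: when dst and src2 name the same register, the initial mov overwrites src2 before it is read, so a form like SUB v1, v0, v1 would compute v0 - v0 under this model.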
| Pseudo-instruction | Operands | Expansion | Notes |
|---|---|---|---|
| MOVI | dst, imm | mov dst, imm | Load immediate |
| MOV | dst, src | mov dst, src | Register-to-register move |
| LOAD | dst, [addr] | mov dst, [addr] | Load from memory |
| STORE | src, [addr] | mov [addr], src | Store to memory |
| LEA | dst, [addr] | lea dst, [addr] | Load effective address |
Address expressions (e.g. [v1], [v0+8], [label]) pass through with only virtual register substitution.
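
Register substitution inside address expressions can be pictured as a word-boundary regex replacement. The following is a Python sketch under that assumption, not the actual VAS code; `REG_MAP` copies the register table from this README, and `substitute` is a hypothetical helper. It does not skip quoted strings or comments.

```python
import re

# Mapping copied from the register table in this README.
REG_MAP = {
    "v0": "rax", "v1": "rbx", "v2": "rcx", "v3": "rdx", "v4": "rsi",
    "v5": "rdi", "v6": "r8", "v7": "r9", "v8": "r11", "v9": "r12",
    "v10": "r13", "v11": "r14", "v12": "r15",
}

# The word boundaries keep names such as "rev1" or "v1_count" from matching.
VREG = re.compile(r"\bv(\d+)\b")

def substitute(text):
    """Replace each virtual register in text with its physical name."""
    def repl(match):
        name = match.group(0)
        if name not in REG_MAP:
            raise ValueError(f"virtual register {name} out of range (valid: v0-v12)")
        return REG_MAP[name]
    return VREG.sub(repl, text)

print(substitute("mov v1, [v0+8]"))
print(substitute("lea v4, [v1+v2*8]"))
```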
| Pseudo-instruction | Operands | Expansion |
|---|---|---|
| CMP | a, b | cmp a, b |
| JMP | label | jmp label |
| JE | label | je label |
| JNE | label | jne label |
| JG | label | jg label |
| JL | label | jl label |
| JGE | label | jge label |
| JLE | label | jle label |
| CALL | label | call label |
| RET | - | ret |
| Pseudo-instruction | Operands | Expansion |
|---|---|---|
| PUSH | src | push src |
| POP | dst | pop dst |
| Pseudo-instruction | Operands | Expansion |
|---|---|---|
| SYSCALL | - | syscall |
| INT | n | int n |
| Pseudo-instruction | Expansion |
|---|---|
| NOP | nop |
Any line not recognized as a pseudo-instruction passes through with virtual registers substituted. There is no list of supported instructions to memorize: you can write any x86-64 instruction using v0-v12 as registers. This lets you use raw x86 instructions directly:
| Instruction | Example | Notes |
|---|---|---|
| movzx | movzx v0, byte [v1] | Zero-extend byte load |
| div | div v6 | Unsigned divide rdx:rax by v6 |
| shl / shr | shl v0, 8 | Shift left/right |
| and / or / xor | and v0, 0xFF | Bitwise operations |
| test | test v0, v0 | Set flags without write |
| not / neg | neg v0 | Bitwise NOT / negate |
Virtual register substitution works inside passthrough, so div v6 becomes div r8.
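
The split between known pseudo-instructions and passthrough can be sketched as a dispatch with a fall-through case. This Python model is illustrative only and covers just three pseudo-instructions; `translate_line` and the other names are hypothetical, and unlike VAS it lets out-of-range registers pass silently.

```python
import re

# Physical registers in v0..v12 order, copied from the register table.
PHYSICAL = "rax rbx rcx rdx rsi rdi r8 r9 r11 r12 r13 r14 r15".split()
REG_MAP = {f"v{i}": name for i, name in enumerate(PHYSICAL)}

def substitute(text):
    """Swap v0-v12 for physical registers; other text is left alone."""
    return re.sub(r"\bv(\d+)\b", lambda m: REG_MAP.get(m.group(0), m.group(0)), text)

def translate_line(line):
    """Expand a few known pseudo-instructions; pass everything else through."""
    text = substitute(line.strip())
    mnemonic, _, rest = text.partition(" ")
    operands = [op.strip() for op in rest.split(",")] if rest.strip() else []
    key = mnemonic.upper()
    if key in ("MOVI", "LOAD") and len(operands) == 2:
        return f"mov {operands[0]}, {operands[1]}"
    if key == "STORE" and len(operands) == 2:
        # STORE lists the source first, so the operands swap in the output.
        return f"mov {operands[1]}, {operands[0]}"
    return text  # unknown mnemonic: passthrough with registers substituted

print(translate_line("div v6"))
print(translate_line("STORE v0, [v1]"))
```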
These directives pass through with virtual register substitution applied:
- SECTION .text, SECTION .data, SECTION .bss
- GLOBAL label, EXTERN label
- DB, DW, DD, DQ, BYTE, WORD, DWORD, QWORD
- ALIGN n, TYPE, SIZE, LENGTH
GAS-to-NASM conversion: Dot-prefixed directives (.section, .global, .globl, .data, .text, .bss) are automatically converted to NASM syntax (dot stripped, .globl -> global, .data -> section .data).
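
A per-line rewrite is enough to illustrate the conversion. The sketch below is Python, not the real implementation; `gas_to_nasm` is a hypothetical name, and treating .text and .bss the same way as the documented .data case is an assumption.

```python
def gas_to_nasm(line):
    """Convert one GAS-style dot directive to NASM syntax; return other lines unchanged."""
    parts = line.strip().split(None, 1)
    if not parts:
        return line
    head = parts[0].lower()
    rest = parts[1] if len(parts) > 1 else ""
    if head in (".data", ".text", ".bss") and not rest:
        # Bare section shorthand (assumed to behave like the documented .data case).
        return f"section {head}"
    if head in (".global", ".globl") and rest:
        return f"global {rest}"
    if head == ".section" and rest:
        return f"section {rest}"
    return line

for sample in (".globl _start", ".data", ".section .rodata", "mov rax, 1"):
    print(gas_to_nasm(sample))
```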
Both ; and # start inline comments. Quoted strings preserve ; and # as literals.
MOVI v0, 42 ; this is a comment
ADD v1, v0 # this is also a comment

The VAS preprocessor runs automatically whenever the source contains any preprocessor directive, even if no virtual registers are present. The supported directives are:
.include "utils.vas" ; Include from current directory or search path
.include <std/io> ; Include from package cache or VAS_PATH

Automatic Deduplication: VAS automatically prevents duplicate file inclusion based on absolute paths. The same file will only be expanded once, regardless of how many times it's included. This eliminates One Definition Rule (ODR) issues without requiring manual guards.
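
Path-based deduplication can be modeled with a set of absolute paths shared across the recursion. This Python sketch is an illustration, not VAS's implementation: it handles only quoted includes relative to the including file, ignores search paths and angle-bracket packages, and uses the hypothetical helpers `expand_includes` and `demo`.

```python
import os
import tempfile

def expand_includes(path, seen=None):
    """Expand .include "file" lines recursively, visiting each absolute path once."""
    seen = set() if seen is None else seen
    abs_path = os.path.abspath(path)
    if abs_path in seen:
        return []  # already expanded elsewhere: emit nothing
    seen.add(abs_path)
    out = []
    with open(abs_path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.rstrip("\n")
            stripped = line.strip()
            if stripped.startswith('.include "'):
                name = stripped.split('"')[1]
                child = os.path.join(os.path.dirname(abs_path), name)
                out.extend(expand_includes(child, seen))
            else:
                out.append(line)
    return out

def demo():
    """Write three small files to a temp directory and expand main.vas."""
    with tempfile.TemporaryDirectory() as root:
        files = {
            "utils.vas": ".const SYS_write = 1\n",
            "io.vas": '.include "utils.vas"\nNOP\n',
            "main.vas": '.include "utils.vas"\n.include "io.vas"\nRET\n',
        }
        for name, text in files.items():
            with open(os.path.join(root, name), "w", encoding="utf-8") as handle:
                handle.write(text)
        return expand_includes(os.path.join(root, "main.vas"))

print(demo())
```

Although utils.vas is reached twice (directly and through io.vas), its contents appear once in the result.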
The .once directive (optional) documents that a file is designed for single inclusion:
; utils.vas
.once ; Optional: documents that this file is designed for single inclusion
.const SYS_write = 1
.macro print_str ptr
MOVI v0, SYS_write
SYSCALL
.endm

Note: Since VAS already handles automatic deduplication at the file level, .once has no functional effect. It serves as documentation only.
For fine-grained control over code blocks, use .once begin <name> and .once end [<name>]:
; utils.vas - Block-level deduplication
.once begin constants
.const SYS_write = 1
.const BUFFER_SIZE = 1024
.once end constants
.once begin macros
.macro print_str ptr
MOVI v0, SYS_write
SYSCALL
.endm
.once end macros
; Later in the same file or another included file...
.once begin constants
; This block will be SKIPPED because "constants" was already included
.const SHOULD_NOT_APPEAR = 999
.once end constants

Features:
- Named Blocks: Each block must have a unique name for identification
- Deduplication: Blocks with the same name are only included once (first occurrence)
- Nesting Support: Blocks can be nested; inner blocks maintain their own deduplication state
- Name Validation: Optional name in .once end is checked against the matching .once begin
- Error Detection: Unmatched .once end or missing block names produce clear error messages
Use Cases:
- Organize large header files into logical sections
- Conditionally include different implementations
- Prevent ODR issues for specific code blocks without affecting the entire file
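
The block-level behaviour described above can be approximated with a set of seen names plus a stack of open blocks. This is a Python sketch, not the actual VAS logic; `apply_once_blocks` is a hypothetical name, and the choice not to record block names found inside an already skipped block is an assumption.

```python
def apply_once_blocks(lines):
    """Drop every repeated '.once begin <name>' block, keeping the first occurrence."""
    seen = set()   # block names already emitted
    stack = []     # (name, skipping) for each currently open block
    out = []
    for lineno, line in enumerate(lines, 1):
        words = line.split(";", 1)[0].split()
        if words[:2] == [".once", "begin"]:
            if len(words) < 3:
                raise ValueError(f"line {lineno}: .once begin requires a block name")
            name = words[2]
            parent_skipping = bool(stack) and stack[-1][1]
            skipping = parent_skipping or name in seen
            if not skipping:
                seen.add(name)
            stack.append((name, skipping))
        elif words[:2] == [".once", "end"]:
            if not stack:
                raise ValueError(f"line {lineno}: .once end without .once begin")
            name, _ = stack.pop()
            if len(words) > 2 and words[2] != name:
                raise ValueError(f"line {lineno}: .once end {words[2]} does not match {name}")
        elif not (stack and stack[-1][1]):
            out.append(line)
    if stack:
        raise ValueError(f"unclosed .once begin {stack[-1][0]}")
    return out

source = [
    ".once begin constants",
    ".const SYS_write = 1",
    ".once end constants",
    ".once begin constants",
    ".const SHOULD_NOT_APPEAR = 999",
    ".once end constants",
]
print(apply_once_blocks(source))
```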
Constants are pure text substitutions, defined with .const:
.const SYS_write = 1
.const BUFFER_SIZE = 1024
MOVI v0, SYS_write ; → MOVI v0, 1

Defined constants are automatically available for .ifdef checks. Important: constant replacement only occurs in code regions, not inside quoted strings or comments.
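
One way to restrict replacement to code regions is to tokenize each line into strings, comments and identifiers, and rewrite only the identifiers. The Python sketch below illustrates that idea and is not taken from VAS; `expand_constants` is a hypothetical name, and dropping the .const lines from the output is an assumption.

```python
import re

# .const NAME = value, with an optional trailing comment.
CONST_DEF = re.compile(r"^\s*\.const\s+([A-Za-z_]\w*)\s*=\s*(.+?)\s*(?:[;#].*)?$")
# Group 1: quoted string, group 2: comment to end of line, group 3: identifier.
TOKEN = re.compile(r'''("[^"]*"|'[^']*')|([;#].*)|(\b[A-Za-z_]\w*)''')

def expand_constants(lines):
    """Collect .const definitions and substitute them in later code regions."""
    consts = {}
    out = []

    def replace(match):
        ident = match.group(3)
        if ident is not None and ident in consts:
            return consts[ident]
        return match.group(0)  # strings, comments, unknown names: unchanged

    for line in lines:
        definition = CONST_DEF.match(line)
        if definition:
            consts[definition.group(1)] = definition.group(2)
            continue
        out.append(TOKEN.sub(replace, line))
    return out

source = [
    ".const SYS_write = 1",
    "MOVI v0, SYS_write ; SYS_write syscall",
    'msg db "SYS_write", 10',
]
print(expand_constants(source))
```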
Conditional compilation uses .ifdef / .ifndef / .else / .endif:
.const DEBUG = 1
.ifdef DEBUG
MOVI v0, 1 ; Debug mode code
.else
MOVI v0, 0 ; Release mode code
.endif

.ifdef and .ifndef only check whether a name is defined (via .const); value comparison is not supported. Nested conditionals are supported, with proper handling of .else in true/false branches. An .else is ignored when its block sits inside a skipped false branch.
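
Nested conditionals are commonly handled with a stack that remembers whether the enclosing region was active. The following Python sketch shows that approach under stated assumptions; it is not VAS's code, `apply_conditionals` is a hypothetical name, and .const lines are kept in the output here purely for simplicity.

```python
def apply_conditionals(lines):
    """Keep only the lines in active .ifdef/.ifndef/.else branches."""
    defined = set()
    stack = []      # (parent_active, condition) for each open conditional
    active = True
    out = []
    for lineno, line in enumerate(lines, 1):
        words = line.split(";", 1)[0].split()
        head = words[0] if words else ""
        if head in (".ifdef", ".ifndef"):
            if len(words) < 2:
                raise ValueError(f"line {lineno}: {head} requires a name")
            condition = (words[1] in defined) == (head == ".ifdef")
            stack.append((active, condition))
            active = active and condition
        elif head == ".else":
            if not stack:
                raise ValueError(f"line {lineno}: .else without .ifdef")
            parent_active, condition = stack[-1]
            # Inside a skipped parent this stays False, so the .else has no effect.
            active = parent_active and not condition
        elif head == ".endif":
            if not stack:
                raise ValueError(f"line {lineno}: .endif without .ifdef")
            active, _ = stack.pop()
        elif active:
            if head == ".const" and len(words) > 1:
                defined.add(words[1])
            out.append(line)
    if stack:
        raise ValueError("missing .endif")
    return out

source = [".const DEBUG = 1", ".ifdef DEBUG", "MOVI v0, 1", ".else", "MOVI v0, 0", ".endif"]
print(apply_conditionals(source))
```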
.macro strlen ptr, len
MOVI \len, 0
.loop\@:
CMP byte [\ptr + \len], 0
JE .done\@
ADD \len, \len, 1
JMP .loop\@
.done\@:
.endm
strlen msg, v1 ; Expands with unique labels (.loop_1, .done_1)

- \param - Parameter substitution
- \@ - Unique label generation (auto-incrementing counter)
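
The two substitutions can be sketched as plain text replacement over the macro body. This Python model is illustrative and not VAS's implementation; `expand_macro` is a hypothetical helper, the body is a shortened stand-in for a real macro, and the `_<n>` suffix follows the .loop_1 example above.

```python
import re

def expand_macro(params, body, args, counter):
    """Expand one macro invocation into a list of source lines."""
    if len(args) != len(params):
        raise ValueError(f"expected {len(params)} arguments, got {len(args)}")
    bindings = dict(zip(params, args))
    out = []
    for line in body:
        # \@ becomes a per-invocation suffix so labels never collide.
        line = line.replace("\\@", f"_{counter}")
        # \name becomes the matching argument; unknown names stay untouched.
        line = re.sub(r"\\(\w+)", lambda m: bindings.get(m.group(1), m.group(0)), line)
        out.append(line)
    return out

# Shortened, hypothetical macro body used only for this demonstration.
BODY = [r"MOVI \len, 0", r".loop\@:", r"ADD \len, \len, 1", r"JMP .loop\@"]

print(expand_macro(["ptr", "len"], BODY, ["msg", "v1"], 1))
print(expand_macro(["ptr", "len"], BODY, ["buf", "v2"], 2))
```

Passing a different counter for each invocation is what keeps the generated labels unique.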
Default parameter values (since v0.1.4): parameters can specify a default value using name=value syntax. When a macro is invoked, any omitted argument falls back to its default. Arguments provided by the caller always take precedence.
.macro log msg="info", level=1
db \msg, \level
.endm
log ; Expands to: db "info", 1
log "error", 3 ; Expands to: db "error", 3
log "warning" ; Expands to: db "warning", 1

Parameters without a default are required. If a required argument is missing, VAS reports an error.
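
Argument binding with defaults can be modeled as a positional walk over the parameter list. The Python sketch below is an assumption-laden illustration, not the VAS source; `bind_args` is a hypothetical name, and parameter specs are passed as already split strings.

```python
def bind_args(param_specs, args):
    """Map macro parameters to caller arguments, falling back to defaults."""
    if len(args) > len(param_specs):
        raise ValueError(f"expected at most {len(param_specs)} arguments, got {len(args)}")
    bindings = {}
    for index, spec in enumerate(param_specs):
        name, has_default, default = spec.partition("=")
        name = name.strip()
        if index < len(args):
            bindings[name] = args[index]      # caller-supplied value wins
        elif has_default:
            bindings[name] = default.strip()  # omitted: use the default
        else:
            raise ValueError(f"missing required argument '{name}'")
    return bindings

SPECS = ['msg="info"', "level=1"]
print(bind_args(SPECS, []))
print(bind_args(SPECS, ['"warning"']))
```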
.rept 5
NOP
.endr
; Expands to 5 NOP instructions

Nested .rept blocks are supported. Inner .rept blocks expand correctly:
.rept 2
.rept 3
NOP
.endr
.endr
; Expands to 6 NOPs (2 × 3)

.include_bytes "data.bin"
; Converts to db directives with hex bytes

Embed author or copyright information that appears as a comment in the preprocessed output:
.author "Jane Hacker <jane@example.com>"

This produces ; Author: Jane Hacker <jane@example.com> in the output. It is useful for identifying the origin of a file, especially when sharing libraries via .include.
When including a package with angle brackets (.include <pkg>), VAS processes the package in a separate context. This means:
- Macros and constants defined in the package are not visible to the including file.
- The package cannot see macros/constants from the including file either.
- This isolation enforces modular boundaries – packages must be self-contained.
For sharing definitions across files, use file includes (.include "file.vas"), which process everything in the same context.
Cross-context deduplication still applies: the same package or file is never processed twice, even if included from multiple locations.
Lines ending with : that are not known pseudo-instructions pass through as labels with virtual register substitution:
section .data
result: dq 0
section .text
global _start
_start:

VAS provides friendly, descriptive error messages that explain what went wrong and how to fix it. Errors include source context when reading from a file:
error at line 3:
MOVI v99, 42
^
line 3: "MOVI v99, 42": virtual register v99 out of range (valid: v0-v12)
- Virtual register out of range: reports the invalid name and valid range (v0-v12)
- Operand count mismatch: reports the original line and expected count
- Input file not found: reports the path
- Unknown instruction: passes through silently (only virtual register substitution applied)
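
The out-of-range check in the first bullet can be illustrated with a scan that records the line number and original text. This is a Python sketch that mimics the message format shown above; `check_registers` is a hypothetical name, and the real checker may detect more than this.

```python
import re

def check_registers(source):
    """Return one error string per out-of-range virtual register."""
    errors = []
    for lineno, line in enumerate(source.splitlines(), 1):
        code = line.split(";", 1)[0]  # ignore ';' comments in this sketch
        for match in re.finditer(r"\bv(\d+)\b", code):
            if int(match.group(1)) > 12:
                errors.append(
                    f'line {lineno}: "{line.strip()}": virtual register '
                    f"{match.group(0)} out of range (valid: v0-v12)"
                )
    return errors

for message in check_registers("MOVI v0, 1\nNOP\nMOVI v99, 42"):
    print(message)
```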
vas # Read from stdin, output to stdout
vas input.vas # Translate input.vas, output to stdout
vas -o output.asm input.vas # Write to file
vas input.vas -o output.asm # Same as above
vas -target win64 input.vas # Output Windows x64 skeleton
vas -O1 input.vas # Enable -O1 optimizations
vas -O2 input.vas # Enable -O2 optimizations (includes -O1)
vas diff input.vas # Show VAS source vs NASM output
vas stats input.vas # Show instruction and register statistics
vas check input.vas # Validate syntax (exit: 0=ok, 1=error)
vas check --strict input.vas # Also fail on dangerous instruction patterns
vas list # List all instructions and syntax
vas version # Print version

Options:
- -o <file> - Write output to file instead of stdout
- -target <arch> - Target platform: elf64 (default) or win64
- -O1 - Enable optimizations (constant folding, dead code elimination, peephole)
- -O2 - Enable -O2 optimizations (LICM, CSE, redundant load elimination, PUSH/POP elimination, tail call)
- -v, --version - Print version and exit
- -h, --help - Show help
- --strict - In check mode, treat lint errors as failures
vas prep resolves all preprocessor directives (includes, macros, constants, conditionals) and outputs the fully expanded source. This is useful for debugging complex include chains or macro expansions. The same preprocessing step is performed automatically by vas build before assembly, so you don't need to prep separately unless you want to inspect the intermediate result.
Example:
vas prep app.vas
vas prep -v app.vas # show statistics

When the assembled output does not contain a section .text directive, VAS automatically wraps it in a minimal standalone skeleton that can be assembled and run directly. If the input already defines its own .text section, the skeleton is skipped.
echo "MOVI v0, 42" | vas

Output includes: default rel, section .text, global _start, _start: entry that calls vas_main and then performs exit(eax) via syscall. User code is placed under vas_main:.
If your code does not end with RET, SYSCALL, JMP, or HLT, VAS
automatically inserts a ret so the program exits cleanly.
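
The rule for appending ret can be sketched as a check on the last non-blank, non-comment line. This Python model is an illustration only; `ensure_terminator` is a hypothetical name, and treating a trailing label as needing a ret is an assumption.

```python
TERMINATORS = {"RET", "SYSCALL", "JMP", "HLT"}

def ensure_terminator(lines):
    """Append 'ret' unless the last instruction already ends the routine."""
    last = ""
    for line in reversed(lines):
        code = line.split(";", 1)[0].strip()
        if code:
            last = code
            break
    mnemonic = last.split()[0].upper() if last else ""
    if mnemonic in TERMINATORS:
        return list(lines)
    return list(lines) + ["ret"]

print(ensure_terminator(["MOVI v0, 42"]))
print(ensure_terminator(["MOVI v0, 60", "SYSCALL ; exit"]))
```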
echo "MOVI v0, 42" | vas -target win64

Uses main: as the entry point. Ends with xor eax, eax; ret unless the user's last instruction is already RET.
If the assembled output already defines a .text section, output is passed through as-is without wrapping.
-O1 enables:
- Constant Folding: Computes literal arithmetic at compile time. ADD v1, 1, 2 becomes MOVI v1, 3.
- Dead Code Elimination: Removes register writes that are never read before being overwritten.
- Peephole Optimizations:
  - mov reg, 0 -> xor reg, reg (smaller encoding)
  - cmp reg, 0 -> test reg, reg (smaller encoding)
  - Multi-nop sequences merged into one
  - mov + add fused into lea
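
Three of the peephole rewrites listed above are simple enough to model as line-level pattern matches. The Python sketch below is an illustration, not the VAS optimizer; `peephole` is a hypothetical name, and the mov + add to lea fusion is left out.

```python
import re

ZERO_OPERAND = re.compile(r"\s*(mov|cmp)\s+(\w+)\s*,\s*0\s*")

def peephole(lines):
    """Apply the zeroing, compare-with-zero and nop-merging rewrites."""
    out = []
    for line in lines:
        match = ZERO_OPERAND.fullmatch(line)
        if match:
            op, reg = match.groups()
            if op == "mov":
                out.append(f"xor {reg}, {reg}")   # shorter encoding than mov reg, 0
            else:
                out.append(f"test {reg}, {reg}")  # shorter encoding than cmp reg, 0
        elif line.strip() == "nop" and out and out[-1].strip() == "nop":
            continue  # collapse a run of nops into the first one
        else:
            out.append(line)
    return out

print(peephole(["mov rax, 0", "cmp rbx, 0", "nop", "nop", "mov rcx, 5"]))
```

Note that xor modifies the flags while mov does not, so a production pass has to confirm the flags are not read afterwards; this sketch skips that check.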
-O2 includes all -O1 passes plus:
- Common Subexpression Elimination (CSE): Repeats of (op, arg1, arg2) replaced with MOV from the first result.
- Loop Invariant Code Motion (LICM): LEA with label operand hoisted before loop header.
- Redundant Load Elimination: LOAD from same address replaced with MOV from previous load.
- PUSH/POP Elimination: Balanced push/pop pairs removed when the register is unmodified.
- Tail Call Optimization: CALL label; RET -> JMP label.
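
The tail call rewrite is the easiest of the -O2 passes to picture: a call immediately followed by ret becomes a single jmp. The Python sketch below models only that pattern on already translated lines; `tail_calls` is a hypothetical name and the actual pass may apply further safety checks.

```python
def tail_calls(lines):
    """Rewrite 'call X' directly followed by 'ret' into 'jmp X'."""
    out = []
    index = 0
    while index < len(lines):
        current = lines[index].split()
        following = lines[index + 1].split() if index + 1 < len(lines) else []
        is_call = len(current) == 2 and current[0].lower() == "call"
        is_ret = [word.lower() for word in following] == ["ret"]
        if is_call and is_ret:
            out.append(f"jmp {current[1]}")
            index += 2  # consume both the call and the ret
        else:
            out.append(lines[index])
            index += 1
    return out

print(tail_calls(["mov rdi, 1", "call helper", "ret"]))
```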
| File | Description | Instructions Used | Complexity |
|---|---|---|---|
| hello.vas | Linux write syscall | MOVI, LEA, SYSCALL | ★☆☆ |
| calc.vas | Sum 1..n arithmetic | MOVI, ADD, CMP, JLE, MOV, SYSCALL | ★☆☆ |
| fib.vas | Iterative Fibonacci | MOVI, MOV, ADD, CMP, JGE, JMP | ★★☆ |
| fact.vas | Recursive factorial | CALL, RET, PUSH, POP | ★★☆ |
| sort.vas | Bubble sort of 8 elements | LOAD, STORE, CMP, JMP, PUSH, POP | ★★☆ |
| greet.vas | CLI tool with args | POP, CMP, STORE, LOAD, SYSCALL | ★★☆ |
| win-ops.vas | Win64 arithmetic chain | ADD, MUL, SUB, RET | ★☆☆ |
| win-edge.vas | Win64 edge cases | PUSH, POP, STORE, LOAD, CMP, JE, RET | ★★☆ |
| multitool.vas | Multi-function demo | strlen, Fibonacci, prime, factorial | ★★★ |
Build and run Linux examples:
vas examples/fib.vas -o fib.asm && nasm -f elf64 fib.asm -o fib.o && ld fib.o -o fib && ./fib; echo $?

Build and run Windows examples:
vas -target win64 examples/win-ops.vas -o win-ops.asm
nasm -f win64 win-ops.asm -o win-ops.obj
ld -e main -o win-ops.exe win-ops.obj
win-ops.exe

Prerequisites: Go 1.21+, no third-party dependencies.
# Clone
git clone https://github.com/0xA672/Vas.git
cd Vas
# Build (dev version prints "vas dev")
go build -o vas.exe .
# Build with version string
go build -ldflags "-X main.Version=v0.2.0" -o vas.exe .
# Install to $GOPATH/bin
go install

vas version and vas -v print the embedded version string.
vas/
+-- main.go # CLI entry, argument parsing, subcommands
+-- go.mod # Go module
+-- vas/
| +-- core.go # Core translation: scan -> expand -> wrap (includes regMap)
| +-- prep.go # Preprocessor: includes, macros, constants, conditionals
| +-- lint/
| | +-- lint.go # Linting and static analysis
| +-- opt/
| | +-- opt.go # -O1 / -O2 optimizer passes
| +-- arch/
| +-- arch.go # Architecture-specific target definitions (elf64, win64)
+-- test/
| +-- assembler_test.go # Unit tests for assembler
| +-- invariant_test.go # Property-based invariant tests
+-- testdata/
| +-- golden/ # Golden test outputs
+-- examples/ # Example .vas files
+-- wasm/ # WebAssembly playground support
+-- bin/ # Build artifacts (gitignored)
+-- README.md
+-- LICENSE
VAS explicitly does not perform:
- Register allocation or instruction scheduling
- Instruction selection or optimization (beyond the simple -O1 and -O2 passes)
- Linking or relocation
Generated .asm files must be assembled by NASM and linked by ld to produce an executable. VAS is a thin translation layer; NASM handles the rest.
VAS is designed for learning, prototyping, and small utilities. For production code or performance-critical sections, consider writing NASM directly or using a higher-level language with inline assembly.
MIT - see LICENSE file.