
[RFC]: Develop a Project Test Runner for stdlib #220

@Kartikeya-guthub

Description


[RFC]: Develop a Project Test Runner for stdlib


Your Background

Full name
Kartikeya Sharma

University status
Yes

University name
ABES Engineering College

University program
B.Tech in Computer Science Engineering

Expected graduation
2027

Short biography
I am a Computer Science undergraduate focused on backend systems and developer tooling. My primary stack is JavaScript and Node.js, with experience building systems that emphasize correctness, failure handling, and deterministic behavior.

I am particularly interested in infrastructure-level problems — testing systems, execution semantics, and reliability under edge cases. I have contributed to open-source projects and worked through review cycles while adapting to project conventions.

Timezone
India Standard Time (IST, UTC+5:30)

Contact details
email: kartikeya9917@gmail.com, github: https://github.com/Kartikeya-guthub


Programming Experience

Platform
Windows

Editor
VSCode — because of its rich extension ecosystem, integrated terminal, and excellent JavaScript/Node.js debugging support.

Programming experience
I have experience building backend systems and developer tooling in Node.js, with a focus on correctness, deterministic execution, and failure handling.

My work involves designing event-driven systems with explicit control over execution flow, implementing idempotent processing for at-least-once delivery guarantees, and ensuring consistency under retries and partial failures. I have handled concurrency control, state transitions, and ordering constraints in distributed systems.

I am familiar with managing asynchronous execution in Node.js, including event loop behavior, timers, and callback-based workflows. I have worked with systems that require precise lifecycle control and predictable outcomes despite asynchronous operations.

I have also developed CLI-based tooling and modular codebases with clear separation between execution, processing, and reporting layers. This includes designing structured pipelines, validating inputs, and producing deterministic outputs suitable for automation.

Across my work, I focus on:

  • deterministic execution under asynchronous conditions
  • lifecycle management and completion guarantees
  • failure modeling, retries, and idempotency
  • structured system design with explicit execution flow

These capabilities directly align with building a minimal, deterministic test runner with strict lifecycle control and reliable execution semantics.

JavaScript experience
I primarily use JavaScript for backend and tooling work. My favorite feature is its event-driven, non-blocking I/O model, which makes it well-suited to building test runners and execution queues. My least favorite is implicit type coercion, especially in equality comparisons, which can produce subtle, hard-to-debug bugs.

Node.js experience
I use Node.js for building CLI tools, backend systems, and scripting. I am familiar with the module system, the event loop, streams, process lifecycle, and child process management — all of which are directly relevant to building a test runner.

C/Fortran experience
Limited experience. I have read and understood C-level code within stdlib (such as native add-ons and benchmark files) while navigating the codebase, but my primary language is JavaScript/Node.js.

Interest in stdlib
I am interested in stdlib due to its focus on correctness, minimalism, and well-structured utilities. While exploring the codebase and contributing, I found the testing and benchmarking infrastructure particularly interesting — especially @stdlib/bench/harness. Building a minimal, internally owned test runner that replaces external dependencies aligns directly with stdlib's design philosophy and my interest in systems-level infrastructure.

Version control
Yes

Contributions to stdlib

  • PR — Fix JavaScript lint errors in benchmark (ztest2) — Merged
    chore: fix JavaScript lint errors (issue #11193) stdlib#11199
    This contribution helped me understand repository structure, linting conventions, and the contribution review workflow. It also gave me direct exposure to how stdlib test and benchmark files are structured — which informs my design decisions for this proposal.

stdlib showcase

  • Event Metrics Analyzer
    https://github.com/Kartikeya-guthub/event-metrics-analyzer-stdlib
    A system that simulates backend request latency and failures and computes statistical insights using stdlib modules. Demonstrates real-world usage of stdlib in a system-oriented context, integrating multiple stdlib utilities for statistical computation and data analysis.

Project Description

Goals

Project Abstract:

Design and implement a minimal, stdlib-owned test runner to replace tape — preserving existing test behavior while enabling incremental, automated migration with no additional runtime dependencies.

The proposed system exposes two entry points:

  • @stdlib/test/compat — a compatibility layer for incremental migration from tape
  • @stdlib/test/runner — the final minimal test runner API

This separation enables a smooth transition without requiring immediate refactoring of existing test files.

The runner will provide TAP-compatible output along with structured reporting suitable for CI integration.

Project Size: Large (350 hours)


Understanding of stdlib Test Usage

Before designing the API, I analyzed stdlib test files to understand what is actually used. The findings directly inform the assertion surface.

Assertions observed in practice:

  Assertion    Frequency
  -----------  -----------
  t.equal      Very common
  t.deepEqual  Common
  t.ok         Common
  t.throws     Occasional
  t.end        Every test

Key observations:

  • The assertion surface is intentionally small. Based on auditing representative stdlib packages, most tests rely on only 3–4 assertion types.
  • t.strictEqual is redundant given how t.equal behaves in stdlib usage — it will not be included.
  • Messages are often repetitive ("returns expected value"). Making msg optional with a sensible default reduces boilerplate without changing semantics.
  • String construction in assertions often uses manual concatenation — this should be normalized to @stdlib/string/format for consistency.

Proposed Assertion Surface

Supported:

t.ok( value, msg )
t.equal( actual, expected, msg )
t.deepEqual( actual, expected, msg )
t.throws( fn, msg )
t.end()

Explicitly excluded:

  • t.strictEqual — redundant, not aligned with stdlib usage patterns
  • t.plan — adds unnecessary coupling between test intent and count
  • Nested tests — not present in stdlib test files
  • Hooks (beforeEach / afterEach) — out of scope for minimal runner
  • Parallel execution — determinism is a hard requirement
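The supported surface above could be sketched as a non-throwing assertion recorder. The following is a minimal illustration, not a proposed implementation; names such as `Assert` and `_record` are hypothetical, and `deepEqual` is omitted for brevity:

```javascript
// Illustrative non-throwing assertion layer: each assertion records a
// { ok, msg } result instead of throwing, and `msg` is optional with
// an internally generated default.
function Assert() {
    this.results = []; // recorded { ok, msg } entries
    this.count = 0;    // running assertion counter
}

Assert.prototype._record = function ( ok, msg ) {
    this.count += 1;
    this.results.push({
        'ok': ok,
        'msg': msg || ( 'assertion ' + this.count + ' passed' )
    });
};

Assert.prototype.ok = function ( value, msg ) {
    this._record( !!value, msg );
};

Assert.prototype.equal = function ( actual, expected, msg ) {
    this._record( actual === expected, msg );
};

Assert.prototype.throws = function ( fn, msg ) {
    var threw = false;
    try {
        fn();
    } catch ( err ) {
        threw = true;
    }
    this._record( threw, msg );
};

// deepEqual omitted for brevity in this sketch...

var t = new Assert();
t.equal( 2 + 3, 5 );       // no message: default is applied
t.ok( true, 'is truthy' );
t.throws( function () { throw new Error( 'boom' ); }, 'throws as expected' );
```

Because assertions only record results, a failing assertion never aborts the test callback; the reporter decides the final pass/fail status.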

Developer Experience Improvements

  1. Optional Default Messages — Most stdlib tests pass "returns expected value" as the message. The runner will make msg optional and supply a default internally:
// Current (tape)
t.equal( actual, expected, 'returns expected value' );

// With runner — message is optional
t.equal( actual, expected );
// internally defaults to: 'assertion <N> passed'
  2. String Formatting via @stdlib/string/format — Assertion error messages currently use manual concatenation. The runner's internal error formatting will use @stdlib/string/format for consistency:
format( 'expected %s, got %s', expected, actual )
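To make the message style concrete, here is a minimal stand-in for the formatting call (handling only `%s`); the real runner would use @stdlib/string/format itself, so this `format` function is purely illustrative:

```javascript
// Hypothetical minimal `%s`-only substitute for @stdlib/string/format,
// shown only to illustrate the message style the runner would emit.
function format( str ) {
    var args = Array.prototype.slice.call( arguments, 1 );
    var i = 0;
    return str.replace( /%s/g, function () {
        return String( args[ i ] );
        // eslint-disable-next-line no-plusplus
    }.bind( null ) ).replace( /%s/g, '' ) || str; // placeholder; see below
}

// Simpler, correct version used for the demo:
function fmt( str ) {
    var args = Array.prototype.slice.call( arguments, 1 );
    var i = 0;
    return str.replace( /%s/g, function () {
        var out = String( args[ i ] );
        i += 1;
        return out;
    });
}

console.log( fmt( 'expected %s, got %s', 5, 7 ) );
// → expected 5, got 7
```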

Before / After Example

BEFORE — using tape:

var tape = require( 'tape' );

tape( 'returns expected value', function test( t ) {
    var actual = add( 2, 3 );
    t.equal( actual, 5, 'returns expected value' );
    t.end();
});

AFTER — using stdlib runner:

var test = require( '@stdlib/test/runner' );

test( 'returns expected value', function test( t ) {
    var actual = add( 2, 3 );
    t.equal( actual, 5 );
    t.end();
});

Migration impact:

  • Replace require( 'tape' ) with require( '@stdlib/test/runner' )
  • Message argument on assertions becomes optional (can be removed or left as-is)
  • No structural changes required to test logic or lifecycle

Technical Design

Execution Model — The runner maintains an internal test queue. Execution is strictly sequential and deterministic by design.

┌──────────────┐
│  test files  │  ← test() calls register entries
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  test queue  │  ← FIFO, push on register
└──────┬───────┘
       │
       ▼
┌──────────────────────────────────┐
│  executor                        │
│  - dequeue one test              │
│  - invoke callback               │
│  - await t.end() or timeout      │
│  - collect pass/fail results     │
│  - move to next                  │
└──────┬───────────────────────────┘
       │
       ▼
┌──────────────┐
│  reporter    │  ← TAP-compatible output
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  exit(code)  │  ← 0 if all pass, 1 if any fail
└──────────────┘

Lifecycle Per Test:

  1. Register — test( name, fn ) pushes to queue
  2. Start — executor dequeues and calls fn( t )
  3. Assert — assertions record pass/fail, do not throw by default
  4. End — t.end() marks test complete; executor moves to next
  5. Timeout — if t.end() is never called within threshold, test is marked failed
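The lifecycle above can be sketched as a sequential executor over a FIFO queue. This is an illustrative skeleton only (names such as `queue`, `run`, and the 1000 ms threshold are assumptions, not the proposed implementation), and assertions are omitted so that only the queue/end/timeout mechanics show:

```javascript
// Illustrative sequential executor: tests are dequeued one at a time,
// and the queue advances only when `t.end()` fires or a timeout trips.
var queue = [];
var results = [];

function test( name, fn ) {
    queue.push({ 'name': name, 'fn': fn });
}

function run( done ) {
    next();
    function next() {
        var entry = queue.shift();
        if ( !entry ) {
            return done( results );
        }
        var finished = false;
        var timer = setTimeout( onTimeout, 1000 ); // assumed threshold
        entry.fn({
            'end': onEnd
        });
        function onEnd() {
            if ( finished ) { return; }
            finished = true;
            clearTimeout( timer );
            results.push({ 'name': entry.name, 'ok': true });
            next();
        }
        function onTimeout() {
            if ( finished ) { return; }
            finished = true;
            results.push({ 'name': entry.name, 'ok': false });
            next();
        }
    }
}

test( 'sync test', function ( t ) {
    t.end();
});
test( 'async test', function ( t ) {
    setTimeout( function () { t.end(); }, 10 );
});
run( function ( out ) {
    console.log( out.length + ' tests completed' );
});
```

The `finished` guard makes `end()` idempotent with respect to the timeout, so a late `t.end()` after a timeout cannot advance the queue twice.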

Async Handling — The runner wraps each test callback in a controlled executor. Async tests work naturally since execution does not advance until t.end() is called:

test( 'async test', function test( t ) {
    setTimeout( function() {
        t.equal( result, expected );
        t.end(); // runner only proceeds after this
    }, 100 );
});

Runner Components:

  1. Core Engine (runner/lib/runner.js) — Maintains test queue, enforces sequential execution, manages test lifecycle
  2. Assertion Layer (runner/lib/assert.js) — Minimal API surface, non-throwing by default, optional message with internal default
  3. Reporter (runner/lib/reporter.js) — TAP-compatible output, CI-friendly exit codes, deterministic line ordering
  4. CLI (runner/bin/cli.js) — runner test.js or runner 'test/**/*.js'
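As an illustration of the reporter component, a TAP-style emitter over collected results might look like the following sketch (the `reportTAP` name and result shape are assumptions for this example):

```javascript
// Illustrative TAP-style reporter: deterministic line ordering and a
// CI-friendly exit code derived from the failure count.
function reportTAP( results ) {
    var lines = [ 'TAP version 13' ];
    var failed = 0;
    lines.push( '1..' + results.length );
    results.forEach( function ( r, i ) {
        if ( !r.ok ) {
            failed += 1;
        }
        lines.push( ( ( r.ok ) ? 'ok' : 'not ok' ) + ' ' + ( i + 1 ) + ' ' + r.msg );
    });
    lines.push( '# pass ' + ( results.length - failed ) );
    lines.push( '# fail ' + failed );
    return {
        'output': lines.join( '\n' ),
        'code': ( failed ) ? 1 : 0 // exit code: 0 if all pass, 1 otherwise
    };
}

var out = reportTAP([
    { 'ok': true, 'msg': 'returns expected value' },
    { 'ok': false, 'msg': 'handles edge case' }
]);
console.log( out.output );
// in the real CLI: process.exit( out.code )
```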

Migration Strategy

Step 1 — Compatibility Shim: A thin shim maps the tape API to the runner API, allowing files to run under the new runner without modification.
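The shim idea can be sketched as a wrapper that patches tape-only methods onto the runner's smaller `t` object. Everything here is hypothetical (`shim`, `stubRunner`); the stub merely stands in for @stdlib/test/runner so the mapping is demonstrable:

```javascript
// Illustrative compatibility shim: maps the tape surface onto the
// runner's smaller assertion API before invoking the user's test.
function shim( runnerTest ) {
    return function tapeCompat( name, fn ) {
        runnerTest( name, function ( t ) {
            t.strictEqual = t.equal; // identical semantics in stdlib usage
            t.plan = function () {}; // accepted but ignored (no plan counting)
            fn( t );
        });
    };
}

// Demo with a stub runner standing in for @stdlib/test/runner:
var calls = [];
function stubRunner( name, fn ) {
    fn({
        'equal': function ( a, b ) { calls.push( a === b ); },
        'end': function () { calls.push( 'end' ); }
    });
}

var tape = shim( stubRunner );
tape( 'compat test', function ( t ) {
    t.plan( 2 );           // ignored by the shim
    t.strictEqual( 1, 1 ); // routed to t.equal
    t.end();
});
```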

Step 2 — Codemod Automation: AST-based codemod script automates transformations:

  • require( 'tape' ) → require( '@stdlib/test/runner' )
  • t.strictEqual( a, b, m ) → t.equal( a, b, m )
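For illustration only, the two transforms reduce to something like the string rewrite below; the actual codemod would operate on an AST (for example via a parser such as acorn) rather than on raw strings, which this sketch does not attempt:

```javascript
// Simplified string-based illustration of the two codemod transforms.
// A real codemod would parse to an AST to avoid rewriting matches
// inside strings or comments.
function transform( src ) {
    return src
        .replace( /require\(\s*'tape'\s*\)/g, 'require( \'@stdlib/test/runner\' )' )
        .replace( /\bt\.strictEqual\(/g, 't.equal(' );
}

var before = 'var tape = require( \'tape\' );\n' +
    't.strictEqual( actual, expected, msg );';
console.log( transform( before ) );
```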

Step 3 — Dual-Run Validation: Each migrated package is validated by running its test suite under both tape and the new runner and comparing pass/fail results, assertion counts, and output format.
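A summary-level comparison for dual-run validation might look like the sketch below; the function names are hypothetical, and the real validation would also diff per-test assertion counts, which this example does not:

```javascript
// Illustrative dual-run check: count ok / not ok lines in each TAP
// stream and require the totals to match.
function summary( tap ) {
    var pass = 0;
    var fail = 0;
    tap.split( '\n' ).forEach( function ( line ) {
        if ( /^ok /.test( line ) ) { pass += 1; }
        if ( /^not ok /.test( line ) ) { fail += 1; }
    });
    return { 'pass': pass, 'fail': fail };
}

function matches( tapeOut, runnerOut ) {
    var a = summary( tapeOut );
    var b = summary( runnerOut );
    return ( a.pass === b.pass && a.fail === b.fail );
}

// Line ordering may legitimately differ; pass/fail totals must not:
var tapeOut = 'TAP version 13\nok 1 returns expected value\n1..1\n# pass 1';
var runnerOut = 'TAP version 13\n1..1\nok 1 returns expected value\n# pass 1\n# fail 0';
console.log( matches( tapeOut, runnerOut ) );
// → true
```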

Step 4 — Incremental Package Migration: Low-risk packages first, high-complexity packages last.

Step 5 — Tape Removal: Once full parity is confirmed, tape is removed from package.json dependencies.

Risk Mitigation:

  Risk                                         Mitigation
  -------------------------------------------  -----------------------------------------------------
  Behavioral mismatch between tape and runner  Dual-run validation before each package migration
  Migration scale (many test files)            Codemod automation reduces manual effort
  CI breakage                                  TAP-compatible output preserved; exit codes unchanged
  Async edge cases                             Timeout handling catches missing t.end() calls

Why this project?

While exploring stdlib and working on contributions, I observed that reliance on tape and external scripts introduces friction when enforcing consistent testing patterns across the repository.

Building an in-house runner aligns with stdlib's design philosophy and improves long-term maintainability. The project is interesting because it combines systems design, testing infrastructure, and real-world migration at scale — all areas I actively work in.


Qualifications

  • Strong JavaScript and Node.js background with focus on backend systems and developer tooling
  • Hands-on experience with stdlib's codebase through contributions and exploration
  • Familiarity with test runner internals: execution queues, assertion semantics, lifecycle management, TAP output format
  • Experience with AST-based code transformation tooling (relevant for codemod automation)
  • Direct exposure to stdlib test and benchmark file structure through my open PR

Prior art

  Runner                 Style                     Notes
  ---------------------  ------------------------  --------------------------------------
  tape                   Minimal, stream-based     Current dependency; being replaced
  jest                   Integrated, feature-rich  Too heavy for stdlib's philosophy
  mocha                  Flexible, plugin-based    External dependencies, not minimal
  @stdlib/bench/harness  Internal, minimal         Direct inspiration for design approach

The stdlib runner follows tape's minimal philosophy while being internally owned, convention-aligned, and free of external runtime dependencies — similar in spirit to @stdlib/bench/harness.


Commitment

  • Availability: 30–40 hours/week throughout the GSoC period
  • No conflicting commitments during the program
  • Active communication on Zulip and GitHub
  • Available before the coding period for design review and early feedback

Schedule

Assuming a 12-week schedule:

  • Community Bonding Period: Deep-dive into stdlib test file patterns across packages. Finalize runner API design with mentor feedback. Set up repository scaffold and project structure. Read relevant documentation and review @stdlib/bench/harness implementation.

  • Week 1: Analyze test patterns across stdlib packages. Finalize runner API design and project structure. Set up repository scaffold.

  • Week 2: Implement core execution engine — test queue, sequential executor, lifecycle management (t.end() tracking, timeout handling).

  • Week 3: Implement assertion layer (ok, equal, deepEqual, throws, end). Implement compatibility shim. Write unit tests for the runner itself.

  • Week 4: Implement CLI and test discovery. Validate runner against representative packages. Compare output with tape execution.

  • Week 5: Stabilize TAP output format and exit semantics. Begin dual-run validation across selected test suites.

  • Week 6 (midterm): Midterm deliverables — working test runner with complete execution lifecycle, full assertion surface implemented, compatibility shim for drop-in tape replacement, CLI-based execution working, verified behavioral parity on sample packages.

  • Week 7: Expand compatibility coverage. Resolve behavioral differences found during dual-run. Begin migration of low-risk packages.

  • Week 8: Continue package-by-package migration. Integrate runner into repository scripts and CI. Validate in CI environments.

  • Week 9: Continue migration. Handle edge cases and async test patterns discovered during migration.

  • Week 10: Complete migration of remaining packages. Remove tape from standard unit testing workflows.

  • Week 11: Code freeze — focus on completing tests and documentation. Finalize migration notes and codemod tooling guide.

  • Week 12: Final cleanup, documentation review, and buffer for any remaining issues.

  • Final Week: Submit project. Write final report. Ensure all PRs are merged or in review.


Related issues


Checklist

  • I have read and understood the Code of Conduct.
  • I have read and understood the application materials found in this repository.
  • I understand that plagiarism will not be tolerated, and I have authored this application in my own words.
  • I have read and understood the patch requirement which is necessary for my application to be considered for acceptance.
  • I have read and understood the stdlib showcase requirement which is necessary for my application to be considered for acceptance.
  • The issue name begins with [RFC]: and succinctly describes my proposal.
  • I understand that, in order to apply to be a GSoC contributor, I must submit my final application to https://summerofcode.withgoogle.com/ before the submission deadline.

Metadata


Labels

  • 2026: 2026 GSoC proposal
  • received feedback: A proposal which has received feedback
  • rfc: Project proposal
