Releases: mverab/Reposcale
v0.1.0-alpha — public alpha for repo continuation evaluation
RepoScale is now public as an alpha benchmark for repo continuation intelligence.
This release packages a runnable evaluation pipeline that measures how well models understand and continue existing software projects, rather than how well they solve isolated coding tasks.
What ships in this alpha
- 12 curated cases across Diagnose, Intent, and Plan
- CLI workflow: validate, batch, score, summary, compare
- Structural, heuristic, and LLM-judge scoring
- Judge stability support with repeated scoring
- Corpus manifest, case authoring guide, and baseline scripts
- GitHub Actions CI
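To illustrate what repeated scoring can tell you about judge stability, here is a minimal sketch in Python. It is not RepoScale's implementation: the function name `judge_stability`, the `tolerance` threshold, and the sample scores are all hypothetical, chosen only to show the idea of scoring one case several times and summarizing the spread.

```python
from statistics import mean, pstdev


def judge_stability(scores, tolerance=0.5):
    """Summarize repeated judge scores for a single case.

    `scores` holds the scores one judge gave to the same output across
    repeated runs. `tolerance` is a hypothetical cutoff on the population
    standard deviation below which the judge is treated as stable.
    """
    if not scores:
        raise ValueError("need at least one score")
    spread = pstdev(scores)
    return {
        "mean": mean(scores),
        "stdev": spread,
        "range": max(scores) - min(scores),
        "stable": spread <= tolerance,
    }


# Hypothetical scores from five repeated judge runs on one case.
summary = judge_stability([4.0, 4.0, 3.0, 4.0, 5.0])
print(summary)
```

A wide spread on a single case suggests that the judge score, not the model output, is the main source of variance, so comparisons between models on that case deserve extra caution.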
Good use cases right now
- Evaluate models on repo understanding and continuation-style tasks
- Run small reproducible baselines across providers
- Contribute new cases and scoring improvements
Important alpha caveats
- Methodology is still evolving
- Judge neutrality has not yet reached benchmark grade
- The corpus is large enough to be useful but is still small
Start here: