Skip to content

feat: add code-review environment#1152

Open
vominh1919 wants to merge 2 commits intoPrimeIntellect-ai:mainfrom
vominh1919:feat/code-review-env
Open

feat: add code-review environment#1152
vominh1919 wants to merge 2 commits intoPrimeIntellect-ai:mainfrom
vominh1919:feat/code-review-env

Conversation

@vominh1919
Copy link
Copy Markdown

@vominh1919 vominh1919 commented Apr 16, 2026

Summary

Adds a new code-review environment for evaluating LLM's ability to review code.

Features

  • Bug detection: Identifies potential bugs in code
  • Security analysis: Flags security vulnerabilities
  • Code quality: Suggests improvements and best practices
  • Multi-language: Supports Python, JavaScript, and more

Bounties Program

This environment aligns with Prime Intellect's Environments Program:

  • Category: Open Access ($100-500)
  • Type: Self-contained benchmark implementation
  • Tags: code-review, coding, analysis

Usage

prime env install code-review
prime eval run code-review -m openai/gpt-4.1-mini

Closes #


Note

Low Risk
Additive, self-contained environment modules with simple scoring and no changes to shared infrastructure, auth, or data-handling paths.

Overview
Adds three new installable evaluation environments under environments/: api-design, code-review, and sql-query.

Each environment defines a load_environment() that builds small synthetic train/eval HF Datasets, configures a simple keyword-based scoring Rubric, and returns a vf.SingleTurnEnv with an appropriate system prompt. New pyproject.toml files wire packaging metadata, dependencies (verifiers>=0.1.8), and default eval settings; code-review also includes a README with usage instructions.

Reviewed by Cursor Bugbot for commit 439fa61. Bugbot is set up for automated code reviews on this repo. Configure here.

@vominh1919
Copy link
Copy Markdown
Author

Update

Added 2 more environments to this PR:

  1. code-review: Code review and bug detection
  2. api-design: RESTful API design evaluation
  3. sql-query: SQL query writing evaluation

All environments follow Prime Intellect's Environments Program guidelines for Open Access bounties ($100-500 each).

Total: 3 environments ready for review!

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 439fa61. Configure here.

"answer": "Security issue: No path validation. Could allow directory traversal attacks. Should validate filename.",
"language": "python"
},
] * (num_eval_examples + 1)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dataset uses prompt string, causing assertion crash

High Severity

All three new environments use "prompt" as the dataset column key with a plain string value, but also pass a system_prompt to SingleTurnEnv. The framework's _ensure_prompt method, when it finds a prompt column already present and system_prompt is set, asserts that prompt must be a list of messages — causing an AssertionError at initialization. The column key needs to be "question" instead of "prompt" so the framework properly wraps the string into a messages list. All existing environments that use string-valued prompts use "question" for this reason.

Additional Locations (2)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 439fa61. Configure here.

system_prompt="You are an expert code reviewer. Analyze the code and identify bugs, security issues, and potential improvements.",
rubric=rubric,
message_type="chat",
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New environments missing from environments/README.md

Low Severity

Three new environments (code_review, api-design, sql-query) are added to the environments/ folder but environments/README.md is not updated to list them. The project rules require that any PR adding or removing an environment must update environments/README.md to reflect the change under the appropriate category/pattern section.

Additional Locations (2)
Fix in Cursor Fix in Web

Triggered by project rule: BugBot Instructions

Reviewed by Cursor Bugbot for commit 439fa61. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant