Skip to content

Fix/img gen format support OpenAI#1223

Open
anhvanhoa wants to merge 6 commits into
nextlevelbuilder:devfrom
anhvanhoa:fix/img_gen_format_support_openai
Open

Fix/img gen format support OpenAI#1223
anhvanhoa wants to merge 6 commits into
nextlevelbuilder:devfrom
anhvanhoa:fix/img_gen_format_support_openai

Conversation

@anhvanhoa

@anhvanhoa anhvanhoa commented Jun 13, 2026

Copy link
Copy Markdown

Summary

Type

  • Feature
  • Bug fix
  • Hotfix (targeting main)
  • Refactor
  • Docs
  • CI/CD

Target Branch

Checklist

  • go build ./... passes
  • go build -tags sqliteonly ./... passes (if Go changes)
  • go vet ./... passes
  • Tests pass: go test -race ./...
  • Web UI builds: cd ui/web && pnpm build (if UI changes)
  • No hardcoded secrets or credentials
  • SQL queries use parameterized $1, $2 (no string concat)
  • New user-facing strings added to all 3 locales (en/vi/zh)
  • Migration version bumped in internal/upgrade/version.go (if new migration)

Test Plan

anhvn and others added 6 commits June 12, 2026 14:33
- Introduced a new `custom_tools` table to support user-defined shell commands with multi-tenant capabilities.
- Implemented CRUD operations for custom tools in the backend.
- Added UI components for creating, editing, and listing custom tools.
- Integrated custom tools into the sidebar and routing.
- Enhanced form validation for custom tool parameters using Zod.
- Added necessary migrations for the new schema.
- Updated the application version to reflect the new features.
- Updated the custom tools data model to replace the single agent ID with an array of agent IDs, allowing tools to be assigned to multiple agents.
- Modified API endpoints and request/response structures to accommodate the new agent IDs array.
- Enhanced the UI components to support multi-agent selection, including a new MultiAgentPicker component for better user experience.
- Updated localization files to reflect changes in agent assignment terminology.
- Added database migrations to transition from the old agent ID structure to the new agent IDs array.
@mrgoonie

Copy link
Copy Markdown
Contributor

PR Review

Verdict: REQUEST CHANGES

Summary

  • The PR title says image generation format support for OpenAI, but the diff also introduces a large unrelated Custom Tools feature: DB migrations, HTTP/RPC CRUD, UI, runtime registration, and shell-command execution.
  • On the specific question: yes, ChatGPT subscription / Codex-backed image generation already exists on dev. create_image can already route through OAuth/Codex native image generation with parent models such as gpt-5.5 / gpt-5.4 and the image_generation tool. This PR does not introduce that capability; it mostly changes output_format handling and removes response_format for standard OpenAI image generation.

Findings

  1. [blocking] PR scope/title mismatch — this mixes a small image-generation compatibility change with ~2.7k lines of Custom Tools feature work. Please split this into separate PRs:

    • one small PR for create_image OpenAI output format / response parsing behavior
    • one dedicated PR for Custom Tools, with its own product/security review and tests
  2. [blocking/security] internal/http/custom_tools.go exposes decrypted custom-tool environment variables via GET /v1/tools/custom/{id}/env. Even though this is admin/master scoped, this turns write-only secrets into readable API output. Existing comments say env vars are never populated on reads / write-only encrypted at rest, but this route violates that model. Please remove this endpoint or return only redacted keys/metadata.

  3. [blocking/security] internal/tools/custom_tool.go executes user-defined commands through sh -c / cmd /C, with template-expanded arguments inserted directly into the command string. That is a high-risk command injection surface by design. If this feature is kept, it needs a much tighter execution model: no shell by default, explicit binary + args array, path allowlist, permission/grant flow, audit trail, and negative tests for injection.

  4. [blocking/correctness] custom tools are registered into the in-memory global tool registry by name, but agent scoping via agentIds is not enforced at execution/listing time in the tool registry path. A tool assigned to one agent can become visible/executable as a normal registered tool unless another layer filters it perfectly. Please enforce assignment at tool-resolution/execution time, not only in UI metadata.

  5. [blocking/correctness] registry unregister/update can leave stale tools behind after rename. handleUpdate fetches the updated record and unregisters updated.Name only when disabled; if a tool is renamed from old_name to new_name, the old registry entry can remain registered. Same class of issue applies to re-registering without removing the previous name.

  6. [medium] create_image now accepts output_format, but saved files and MIME are still hardcoded as png: mediaFileName(..., "png"), MimeType: "image/png", and prompt metadata embedding assumes PNG. If output_format is webp or jpg, the returned file can have the wrong extension/MIME and PNG metadata embedding will be a no-op. Either keep persisted output normalized to PNG or propagate actual format/MIME correctly.

Existing capability check

Confirmed on dev:

  • internal/providers/codex_native_image.go already implements GenerateImage for Codex/OAuth-backed providers using the ChatGPT Responses API image_generation tool.
  • internal/providers/native_image.go already documents parent LLM model examples like gpt-5.5, and defaults the image model to gpt-image-2.
  • internal/tools/create_image.go already routes native providers before the API-key credential path via _native_provider / providers.NativeImageProvider.

So if the intent is "use ChatGPT subscription to generate images", this PR is not the right unit of work. That feature is already present; we should not merge a large custom-tools feature under an image-format PR.

Anti-AI-Slop Check

  • Intent clear: no. PR body is empty and title does not match most of the diff.
  • Diff scope justified: no. ~2.7k LOC and migrations for an unrelated feature.
  • Refactor/feature mix justified: no.
  • Author explanation needed: yes.

Tests / CI

  • GitHub reports no checks for this branch.
  • Local targeted tests run:
    • go test ./internal/tools ./internal/providers -run 'CreateImage|CodexGenerateImage|ChatGPTOAuthRouter' -count=1 passed.
    • go test ./internal/http ./internal/gateway/methods ./internal/store/pg ./internal/store/sqlitestore -run CustomTool -count=1 could not fully run because internal/store/sqlitestore has build constraints excluding files without the proper tag; the other listed packages had no relevant tests.
  • Missing tests for Custom Tools security boundaries, env redaction, command injection, agent assignment enforcement, registry rename/unregister behavior, and tenant isolation.

Final Notes

Please split the PR and keep the image-format fix minimal. For the Custom Tools feature, it needs a separate design/security review before it should be mergeable.

@mrgoonie mrgoonie added agent:github-maintain Processed by github-maintain automation status:blocked Blocked by external dependency or decision maintain:triaged Triaged by maintain workflow labels Jun 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent:github-maintain Processed by github-maintain automation maintain:triaged Triaged by maintain workflow status:blocked Blocked by external dependency or decision

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants