feat(agent): add shell MCP smoke artifact command #919

@hansjm10

Description

Sequence

Order: 11 of 12
Phase: Phase 4 - Validation Loop Automation
Design: docs/agent-first-workflow-design.md
Tracker: #908
Depends on: #918

Agent Role

Shell Validation Agent

Scope

  • Add pnpm agent:shell-smoke to run a bounded shell-desktop MCP validation loop and write artifacts.
  • Capture MCP health, sim status, pause/step behavior, renderer status, bounded logs, WebGPU health, screenshot metadata, and a summary JSON.
  • Manage only processes started by the command and leave existing user-managed daemons alone.
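
The ownership rule in the last bullet can be sketched as a small tracker that refuses to signal any pid it did not record itself. This is a minimal illustration, not the repository's implementation: `OwnedProcesses`, `adopt`, `stop`, `stopAll`, and the injected `killFn` are all hypothetical names, and a real script would probably pass `process.kill` as `killFn`.

```javascript
// Hypothetical sketch: class and method names are illustrative assumptions,
// not taken from tools/scripts/shell-desktop-mcp-smoke.mjs.
class OwnedProcesses {
  constructor(killFn) {
    // killFn(pid, signal) is injected so the logic can be unit tested
    // without spawning anything; process.kill would be the likely real value.
    this.killFn = killFn;
    this.owned = new Map(); // pid -> human-readable label
  }

  // Record a process that this command started itself.
  adopt(pid, label) {
    this.owned.set(pid, label);
  }

  // Stop one pid. Anything that was never adopted is left alone.
  stop(pid, signal = 'SIGTERM') {
    if (!this.owned.has(pid)) {
      return { pid, stopped: false, reason: 'not-owned' };
    }
    const label = this.owned.get(pid);
    this.killFn(pid, signal);
    this.owned.delete(pid);
    return { pid, stopped: true, label };
  }

  // Stop everything this command started, most recently adopted first.
  stopAll(signal = 'SIGTERM') {
    const pids = [...this.owned.keys()].reverse();
    return pids.map((pid) => this.stop(pid, signal));
  }
}
```

With this shape, a daemon the user started earlier never enters the map, so cleanup cannot reach it even if its pid is passed to `stop` by mistake.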

Context Packet

  • docs/shell-desktop-mcp.md
  • docs/agent-first-workflow-design.md
  • packages/shell-desktop/README.md
  • tools/scripts/shell-desktop-mcp-smoke.mjs
  • tools/scripts/start-shell-desktop-headless.sh
  • tools/scripts/start-shell-desktop-mcp-gateway-daemon.sh
  • tools/scripts/status-shell-desktop-mcp-gateway-daemon.sh
  • tools/scripts/stop-shell-desktop-mcp-gateway-daemon.sh

Acceptance Criteria

  • pnpm agent:shell-smoke exists.
  • The command writes artifacts under artifacts/agent-runs/<run-id>/ or another documented ignored path.
  • The command writes summary.json with command outcomes, duration, branch/SHA, and validation status.
  • The command captures health, sim status, renderer status, bounded logs, WebGPU health, and screenshot bytes/path when available.
  • Timeouts and backend-unavailable states are reported clearly.
  • The command does not kill processes it did not start.
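
One possible shape for `summary.json`, with timeouts and backend-unavailable states kept distinct from ordinary failures, is sketched below. Every field name and status string here (`validationStatus`, `outcome`, `timedOut`, `backendUnavailable`, and so on) is an assumption made for illustration; the issue only requires that the file record command outcomes, duration, branch/SHA, and validation status.

```javascript
// Hypothetical sketch: field names and status strings are assumptions.
// Maps a raw step record to one clearly named outcome.
function classifyStep(step) {
  if (step.timedOut) return 'timeout';
  if (step.backendUnavailable) return 'backend-unavailable';
  return step.error ? 'failed' : 'passed';
}

// Builds the object that would be serialized to summary.json.
function buildSummary({ runId, branch, sha, startedAtMs, finishedAtMs, steps }) {
  const commands = steps.map((step) => ({
    name: step.name,
    outcome: classifyStep(step),
    durationMs: step.durationMs,
    detail: step.error ?? null,
  }));
  // A run with no steps is not treated as a pass.
  const allPassed =
    commands.length > 0 && commands.every((c) => c.outcome === 'passed');
  return {
    runId,
    branch,
    sha,
    durationMs: finishedAtMs - startedAtMs,
    validationStatus: allPassed ? 'passed' : 'failed',
    commands,
  };
}
```

Keeping `timeout` and `backend-unavailable` as separate outcomes lets a reader of the summary tell an environment problem apart from a genuine regression without opening the logs.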

Validation

  • pnpm --filter @idle-engine/shell-desktop test:ci
  • pnpm shell:desktop:mcp:smoke
  • pnpm agent:shell-smoke

Notes

  • If the host cannot run Electron/xpra, document the exact blocker and keep unit tests around artifact generation and lifecycle ownership.
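
The artifact-generation side of the note above can stay testable on hosts without Electron or xpra if the path logic is a pure function. The sketch below assumes a timestamp-plus-short-SHA run id and the file names `screenshot.png` and `logs.txt`; only `summary.json` and the `artifacts/agent-runs/<run-id>/` layout come from this issue, and `makeRunId` and `artifactPaths` are hypothetical helpers.

```javascript
// Hypothetical sketch: the run-id format and most file names are assumptions.
// Builds a filesystem-safe id such as 20250102T030405Z-abc1234.
function makeRunId(date, shortSha) {
  const stamp = date
    .toISOString()
    .replace(/[-:]/g, '')
    .replace(/\.\d{3}Z$/, 'Z');
  return `${stamp}-${shortSha}`;
}

// Returns the paths a run would write to, rejecting ids that could
// escape the artifact root (path separators or a leading dot).
function artifactPaths(runId, root = 'artifacts/agent-runs') {
  if (!/^[A-Za-z0-9._-]+$/.test(runId) || runId.startsWith('.')) {
    throw new Error(`unsafe run id: ${runId}`);
  }
  const dir = `${root}/${runId}`;
  return {
    dir,
    summary: `${dir}/summary.json`,
    screenshot: `${dir}/screenshot.png`,
    logs: `${dir}/logs.txt`,
  };
}
```

Because neither function touches the filesystem or spawns a process, both can be covered by ordinary unit tests even when the smoke run itself is blocked.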

Refs #906

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request
