🤖 Cross-Engine Agent Team — Orchestrate Multi-Engine Collaborative Workflows #90

## 💡 Vision

CodeMux already unifies multiple AI coding engines (OpenCode, Copilot, Claude Code) under a single gateway. The next leap is enabling these engines to **work together as a team** — automatically decomposing complex tasks and distributing subtasks across engines/sessions in parallel, each working in an isolated environment, then aggregating results back to the user.

This is not about wrapping a single engine's sub-agent system (like Claude Code's internal Agent tool). Instead, it's about building a **cross-engine orchestration layer** that is unique to CodeMux's multi-engine architecture — something no single-engine tool can achieve.

## 🎯 Core Concept

```
User sends a complex task (e.g., "Refactor auth module and add tests")
      ↓
┌──────────────────────────────────────────────────────┐
│            Orchestrator (new layer)                   │
│  1. Analyze & decompose task                         │
│  2. Assign subtasks to engines/sessions              │
│  3. Monitor progress                                 │
│  4. Collect & synthesize results                     │
└──┬──────────────┬──────────────┬─────────────────────┘
   │              │              │
   ▼              ▼              ▼
┌────────┐   ┌────────┐   ┌────────┐
│ Engine │   │ Engine │   │ Engine │   ← Same or different engines
│ Claude │   │Copilot │   │OpenCode│
│ wt-1   │   │ wt-2   │   │ wt-3   │   ← Isolated worktrees
│"search │   │"write  │   │"run    │
│ & plan"│   │ tests" │   │ build" │
└────┬───┘   └────┬───┘   └────┬───┘
     │            │            │
     └────────────┼────────────┘
                  ▼
       Orchestrator merges results
                  ↓
         Unified response to user
```

## 🧭 Claude Agent Team vs CodeMux Orchestration

Claude Code already has built-in Agent Team capabilities (`AgentInput.team_name`, `agents`, `TeammateIdle` hook) for sub-agent parallelism **within a single session**. This is fundamentally different from what CodeMux orchestration provides:

| | Claude Agent Team | CodeMux Orchestration |
|---|---|---|
| Orchestration layer | **Inside** Claude session | CodeMux **application layer** |
| Engine scope | Claude only | **Cross-engine** (Claude + Copilot + OpenCode) |
| Context | Sub-agents share parent session context | Sessions are **fully independent**; the Orchestrator injects context |
| Isolation | Optional worktree | Each subtask gets an **independent worktree** |
| User control | Claude decides internally | Users **review and edit** the decomposition plan in UI |

**Conclusion**: Intra-engine parallelism is deferred to Agent Team's own mechanism. CodeMux focuses on **cross-engine orchestration** and **user-controlled task decomposition**. Even with a single engine, CodeMux provides true file-level worktree isolation that Agent Team cannot achieve.

## ✨ Roadmap

### Phase 0: Sub-agent Visibility ✅ (PR #99, merged)

Capture previously dropped SDK messages (`task_started`, `task_progress`, `task_notification`, `tool_progress`) in `ClaudeCodeAdapter` and surface real-time sub-agent activity in the UI:

- `RunningToolCard` shows current subtool name and tool-use count during execution
- `TaskTool` completed state displays AI-generated summary and tool-use stats
- Status bar appends active subtool name (e.g., "Delegating work · Fix the bug (Bash)")

### Phase 1: Cross-Engine Task Orchestration 🚧

#### Core Architecture: Hub-and-Spoke

Sessions **do not communicate directly**. All information flows through the Orchestrator (an LLM session):

```
                    ┌─────────────────────────────┐
                    │       Orchestrator           │
                    │       (LLM session)          │
                    │                              │
                    │  • Holds global context       │
                    │  • Manages DAG dependencies   │
                    │  • Injects upstream results   │
                    │  • Decides next-round actions │
                    └──┬──────┬──────┬─────────────┘
                       │      │      │
          ↑ results up │      │      │ ↓ dispatch + inject context
                       │      │      │
                 ┌─────┴┐ ┌───┴───┐ ┌┴──────┐
                 │ S1   │ │ S2    │ │ S3    │
                 │Claude│ │Copilot│ │Claude │
                 │wt-1  │ │wt-2   │ │wt-3   │
                 └──────┘ └───────┘ └───────┘
```

#### Execution Model: DAG + Multi-Round Iteration

Rather than a simple "parallel dispatch → wait for all → aggregate" flow, execution follows a **dependency graph** with iterative rounds:

```
Example: "Refactor auth module, add tests, then verify everything passes"

LLM decomposes into a DAG:

  [Analyze arch] ──→ [Write tests] ──→ [Verify integration]
  (Claude)           (Copilot)         (Claude)
       └──────────→ [Refactor impl] ─↗
                    (Claude)

  Round 1            Round 2            Round 3
  (no dependencies)  (parallel, deps    (waits for
                      satisfied)         all above)
```

**Execution loop**:
```
while (unfinished subtasks exist) {
  1. Find all subtasks whose dependencies are satisfied but not yet started
  2. Collect upstream result summaries, inject into downstream prompts
  3. Dispatch these subtasks in parallel (create worktree + session + send message)
  4. Wait for any running subtask to complete
  5. Extract completed subtask's result summary
  6. Go back to step 1
}
Aggregate final results
```
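The loop above could be sketched in TypeScript roughly as follows. This is an illustration, not the planned `OrchestratorService` code: `LoopSubtask`, `RunSubtask`, and `runDag` are hypothetical names, and the `run` callback stands in for "create worktree + session + send message". The pseudocode does not say what happens when a subtask fails, so the sketch assumes that dependents of a failed subtask are marked failed as well, which keeps the loop from waiting forever.

```typescript
type SubtaskStatus = "blocked" | "pending" | "running" | "completed" | "failed";

// Hypothetical, reduced subtask shape for the scheduling loop.
interface LoopSubtask {
  id: string;
  dependsOn: string[];
  status: SubtaskStatus;
  resultSummary?: string;
}

// Hypothetical callback: runs one subtask and resolves with its result summary.
// `upstream` holds the summaries of the subtask's dependencies, for injection.
type RunSubtask = (task: LoopSubtask, upstream: string[]) => Promise<string>;

async function runDag(tasks: LoopSubtask[], run: RunSubtask): Promise<LoopSubtask[]> {
  const byId = new Map<string, LoopSubtask>();
  for (const t of tasks) byId.set(t.id, t);
  const inFlight = new Map<string, Promise<string>>();

  const notStarted = (t: LoopSubtask) => t.status === "blocked" || t.status === "pending";
  const depsDone = (t: LoopSubtask) =>
    t.dependsOn.every((d) => byId.get(d)?.status === "completed");
  const depsFailed = (t: LoopSubtask) =>
    t.dependsOn.some((d) => byId.get(d)?.status === "failed");

  for (;;) {
    // Assumption: a failed dependency fails everything downstream of it.
    let changed = true;
    while (changed) {
      changed = false;
      for (const t of tasks) {
        if (notStarted(t) && depsFailed(t)) {
          t.status = "failed";
          changed = true;
        }
      }
    }

    // Steps 1-3: dispatch every not-yet-started subtask whose dependencies completed.
    for (const t of tasks) {
      if (!notStarted(t) || !depsDone(t)) continue;
      const upstream = t.dependsOn.map((d) => byId.get(d)?.resultSummary ?? "");
      t.status = "running";
      inFlight.set(
        t.id,
        run(t, upstream).then(
          (summary) => { t.status = "completed"; t.resultSummary = summary; return t.id; },
          () => { t.status = "failed"; return t.id; },
        ),
      );
    }

    // Nothing is running and nothing could be dispatched: the run is over.
    if (inFlight.size === 0) break;

    // Steps 4-6: wait for any running subtask to settle, then re-evaluate.
    const finishedId = await Promise.race(Array.from(inFlight.values()));
    inFlight.delete(finishedId);
  }
  return tasks;
}
```

Subtasks that can never become ready (for example because of a cycle or an unknown dependency ID) are left in their initial state, so a caller would still need to check for leftovers before aggregating.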

**Context injection** (the core mechanism for inter-session communication):

When a downstream subtask starts, the Orchestrator injects upstream results into its prompt:

```
Message sent to the "Write tests" session:

Your task: Write unit tests for the auth module

## Upstream Task Results

### [Analyze architecture] output:
- Core modules: jwt/, session/, oauth/
- Entry point: src/auth/index.ts
- Current test coverage: 12%
- Key finding: token refresh logic has zero test coverage

Based on the analysis above, write comprehensive unit tests.
```
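A message like the one above could be assembled by a small pure function. The sketch below is an assumption about how that might look; `UpstreamResult` and `buildSubtaskPrompt` are hypothetical names, and the section headings simply mirror the example message.

```typescript
// Hypothetical shape: one completed upstream subtask and its extracted summary.
interface UpstreamResult {
  title: string;
  summary: string;
}

// Builds the message for a downstream session. When there are no upstream
// results, the "Upstream Task Results" section is omitted entirely.
function buildSubtaskPrompt(
  task: string,
  instruction: string,
  upstream: UpstreamResult[],
): string {
  const lines: string[] = [`Your task: ${task}`, ""];
  if (upstream.length > 0) {
    lines.push("## Upstream Task Results", "");
    for (const u of upstream) {
      lines.push(`### [${u.title}] output:`, u.summary.trim(), "");
    }
  }
  lines.push(instruction);
  return lines.join("\n");
}

const writeTestsPrompt = buildSubtaskPrompt(
  "Write unit tests for the auth module",
  "Based on the analysis above, write comprehensive unit tests.",
  [
    {
      title: "Analyze architecture",
      summary: "- Core modules: jwt/, session/, oauth/\n- Entry point: src/auth/index.ts",
    },
  ],
);
```

Keeping this step free of side effects makes it easy to unit test and to preview the exact prompt in the UI before dispatch.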

#### UX Design: Explicit UI, No Slash Commands

**Entry point**: A new "Team Tasks" collapsible section in the sidebar (between Active Sessions and Scheduled Tasks):

```
┌─ Sidebar ─────────────────────┐
│ 🔵 Active Sessions (2)       │
│   ├ Session A     ● Running   │
│   └ Session B     ✓ Done      │
│                               │
│ 👥 Team Tasks (1)       [+]  │  ← NEW section
│   └ Refactor auth ● Running  │
│                               │
│ ⏰ Scheduled Tasks (3)       │
│   ...                         │
│                               │
│ 📁 Projects                  │
│   ...                         │
└───────────────────────────────┘
```

**4-phase view** (main content area switches to Orchestration Dashboard):

1. **Setup View** — Task description textarea + engine cards (multi-select with running status) + project selector + [Analyze Task] button
2. **Task Plan View** — Editable subtask cards from LLM decomposition (description, engine, worktree toggle, dependencies), add/remove subtasks + [Execute] button
3. **Execution Dashboard** — Subtask cards with dependency topology (showing Round progress), real-time status updates, each card has [View Session] to navigate to the child session chat view
4. **Result View** — LLM-aggregated summary + per-subtask results (collapsible) + worktree merge/delete actions

**Execution Dashboard example with DAG visualization**:

```
┌─────────────────────────────────────────────────────┐
│  👥 Refactor auth module        ● Running  Round 2/3│
├─────────────────────────────────────────────────────┤
│                                                     │
│  ┌─ 1. Analyze arch ─── Claude ──────────────────┐ │
│  │  ✅ Completed · 45s                            │ │
│  └──────────────────────┬────────────────────────┘ │
│                    ┌────┴────┐                      │
│                    ▼         ▼                      │
│  ┌─ 2. Write tests ─ Copilot┐ ┌─ 3. Refactor ── Claude ─┐│
│  │  🔵 Running · Bash       │ │  🔵 Running · Edit      ││
│  └──────────────┬───────────┘ └────────┬────────────┘│
│                 └────────┬─────────────┘              │
│                          ▼                            │
│  ┌─ 4. Verify integration ─── Claude ────────────┐  │
│  │  ○ Blocked · waiting for #2, #3               │  │
│  └───────────────────────────────────────────────┘  │
│                                                     │
│  [Cancel All]                                       │
└─────────────────────────────────────────────────────┘
```

**Child session navigation**: Child sessions appear naturally in the Active Sessions sidebar section. The child session titlebar shows a breadcrumb "← Back to Team Task" to return to the Dashboard.

#### Task Decomposition Strategy

- **LLM proposes + user confirms**: The Orchestrator sends a structured prompt; the LLM returns a JSON subtask array (with `dependsOn` dependencies)
- **Engine recommendation**: The prompt describes each engine's strengths; the LLM recommends engine assignments; users can freely change these in the confirmation UI
- **Same-engine multi-session**: A single engine can run multiple parallel sessions (each in its own worktree) — even single-engine setups benefit from orchestration
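Because the plan comes from an LLM, it is probably worth validating the JSON before showing it in the confirmation UI. The sketch below assumes each entry carries `id`, `description`, `engineType`, and `dependsOn`; the exact reply format is not specified in this proposal, and `PlannedSubtask` and `parsePlan` are hypothetical names.

```typescript
// Hypothetical shape of one entry in the LLM's JSON reply.
interface PlannedSubtask {
  id: string;
  description: string;
  engineType: string;
  dependsOn: string[];
}

// Parses the reply and collects problems instead of throwing, so the UI can
// list them. An empty `problems` array means the plan is structurally sound.
function parsePlan(
  raw: string,
  allowedEngines: string[],
): { plan: PlannedSubtask[]; problems: string[] } {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { plan: [], problems: ["Reply is not valid JSON"] };
  }
  if (!Array.isArray(data)) return { plan: [], problems: ["Reply is not a JSON array"] };

  const plan: PlannedSubtask[] = [];
  const problems: string[] = [];
  for (const item of data as unknown[]) {
    const o = (item ?? {}) as Record<string, unknown>;
    const id = o["id"];
    const description = o["description"];
    const engineType = o["engineType"];
    const rawDeps = o["dependsOn"];
    if (typeof id !== "string" || typeof description !== "string" || typeof engineType !== "string") {
      problems.push("Subtask is missing id, description, or engineType");
      continue;
    }
    const dependsOn = Array.isArray(rawDeps)
      ? rawDeps.filter((d): d is string => typeof d === "string")
      : [];
    plan.push({ id, description, engineType, dependsOn });
  }

  const ids = new Set(plan.map((p) => p.id));
  if (ids.size !== plan.length) problems.push("Duplicate subtask id");
  for (const p of plan) {
    if (allowedEngines.indexOf(p.engineType) === -1) {
      problems.push(`Unknown engine "${p.engineType}" for ${p.id}`);
    }
    for (const d of p.dependsOn) {
      if (!ids.has(d)) problems.push(`${p.id} depends on unknown subtask ${d}`);
    }
  }

  // Cycle check: repeatedly remove subtasks whose dependencies are all removed.
  // Anything left over is part of (or blocked behind) a cycle.
  const remaining = new Set(plan.map((p) => p.id));
  let progressed = true;
  while (progressed) {
    progressed = false;
    for (const p of plan) {
      if (remaining.has(p.id) && p.dependsOn.every((d) => !remaining.has(d))) {
        remaining.delete(p.id);
        progressed = true;
      }
    }
  }
  if (remaining.size > 0) problems.push("Dependency cycle detected");

  return { plan, problems };
}
```

The same checks could be re-run after the user edits the plan, since manual changes to dependencies can introduce cycles too.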

#### Data Model

```typescript
interface OrchestrationRun {
  id: string;
  parentSessionId: string;  // orchestrator session
  directory: string;
  status: "setup" | "decomposing" | "confirming" | "dispatching"
        | "running" | "aggregating" | "completed" | "failed" | "cancelled";
  prompt: string;
  engineTypes: EngineType[];
  subtasks: OrchestrationSubtask[];
  resultSummary?: string;
  createdAt: number;
  completedAt?: number;
}

interface OrchestrationSubtask {
  id: string;
  description: string;
  engineType: EngineType;
  dependsOn: string[];  // subtask IDs
  sessionId?: string;
  worktreeId?: string;
  needsWorktree: boolean;
  status: "blocked" | "pending" | "running" | "completed" | "failed";
  resultSummary?: string;
  error?: string;
  duration?: number;
  toolUses?: number;
}
```

#### Implementation Steps

| Step | File | Operation | Description |
|------|------|-----------|-------------|
| 1 | `src/types/unified.ts` | Modify | Add OrchestrationRun/Subtask types and Gateway message types |
| 2 | `electron/main/services/orchestrator-service.ts` | **New** | Core orchestration service: DAG execution, context injection, result aggregation |
| 3 | `electron/main/gateway/ws-server.ts` | Modify | Route orchestration requests + broadcast orchestration.updated events |
| 4 | `electron/main/index.ts` | Modify | Initialize OrchestratorService |
| 5 | `src/lib/gateway-api.ts` | Modify | Frontend API methods + notification handler |
| 6 | `src/stores/orchestration.ts` | **New** | Orchestration state store |
| 7 | `src/components/TeamTaskSection.tsx` | **New** | Sidebar Team Tasks section |
| 8 | `src/components/SessionSidebar.tsx` | Modify | Insert TeamTaskSection |
| 9 | `src/components/OrchestrationDashboard.tsx` | **New** | Main Dashboard component (4-phase views) |
| 10 | `src/pages/Chat.tsx` | Modify | View switching + child session breadcrumb navigation |

#### Reused Existing Infrastructure

- `WorktreeManager.create/remove/merge` — worktree lifecycle management
- `EngineManager.createSession/sendMessage` — session lifecycle management
- `ScheduledTaskService` permission auto-approve pattern — unattended subtask execution
- `computeActiveSessions` — child sessions automatically appear in Active Sessions
- `getEngineBadge()` / `StatusIndicator` — engine badges and status icon reuse
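To show how the first two pieces might combine during dispatch, here is a rough sketch. The method signatures below are assumptions made for illustration; the real `WorktreeManager` and `EngineManager` APIs may take different parameters and return different shapes. `WorktreeApi`, `EngineApi`, `DispatchInput`, and `dispatchSubtask` are hypothetical names.

```typescript
// Assumed, simplified signatures. The actual CodeMux managers may differ.
interface WorktreeApi {
  create(directory: string, name: string): Promise<{ id: string; path: string }>;
}
interface EngineApi {
  createSession(engineType: string, directory: string): Promise<{ id: string }>;
  sendMessage(sessionId: string, text: string): Promise<void>;
}

interface DispatchInput {
  subtaskId: string;
  engineType: string;
  needsWorktree: boolean;
  directory: string; // project directory of the orchestration run
  prompt: string;    // already contains the injected upstream results
}

// Creates the worktree when requested, then the session, then sends the prompt.
// The session runs inside the worktree path if one was created.
async function dispatchSubtask(
  input: DispatchInput,
  worktrees: WorktreeApi,
  engines: EngineApi,
): Promise<{ sessionId: string; worktreeId: string | undefined }> {
  let cwd = input.directory;
  let worktreeId: string | undefined;
  if (input.needsWorktree) {
    const wt = await worktrees.create(input.directory, `subtask-${input.subtaskId}`);
    cwd = wt.path;
    worktreeId = wt.id;
  }
  const session = await engines.createSession(input.engineType, cwd);
  await engines.sendMessage(session.id, input.prompt);
  return { sessionId: session.id, worktreeId };
}
```

Passing the managers in as interfaces keeps the orchestration logic testable with in-memory fakes, without starting any engine.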

### Phase 2: Intelligence

- [ ] **Engine-Aware Routing** — Smart assignment of subtasks to the best-suited engine (e.g., Claude for reasoning-heavy tasks, Copilot for code generation)
- [ ] **Conflict Resolution** — Detect and resolve merge conflicts when multiple worktrees modify overlapping files
- [ ] **Retry & Fallback** — If one engine fails a subtask, auto-retry on a different engine
- [ ] **Cost Management** — Per-subtask token/cost tracking with budget controls
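As a starting point for conflict detection, overlapping files can be found before any merge is attempted. The sketch below only flags files touched by more than one subtask; it does not detect actual merge conflicts, since two subtasks may edit different parts of the same file. It assumes the changed-file lists are gathered elsewhere (for example with `git diff --name-only` in each worktree). `ChangedFiles` and `findOverlaps` are hypothetical names.

```typescript
// Hypothetical input: the files each subtask changed in its worktree.
interface ChangedFiles {
  subtaskId: string;
  files: string[];
}

interface FileOverlap {
  file: string;
  subtaskIds: string[];
}

// Returns every file touched by more than one subtask, sorted by path,
// together with the subtasks that touched it.
function findOverlaps(changes: ChangedFiles[]): FileOverlap[] {
  const owners = new Map<string, string[]>();
  for (const c of changes) {
    // De-duplicate so a subtask listing a file twice is counted once.
    for (const f of Array.from(new Set(c.files))) {
      const list = owners.get(f) ?? [];
      list.push(c.subtaskId);
      owners.set(f, list);
    }
  }
  const overlaps: FileOverlap[] = [];
  owners.forEach((subtaskIds, file) => {
    if (subtaskIds.length > 1) overlaps.push({ file, subtaskIds });
  });
  return overlaps.sort((a, b) => a.file.localeCompare(b.file));
}

const sampleOverlaps = findOverlaps([
  { subtaskId: "tests", files: ["src/auth/jwt.ts", "test/jwt.test.ts"] },
  { subtaskId: "refactor", files: ["src/auth/jwt.ts", "src/auth/index.ts"] },
]);
```

In this sample only `src/auth/jwt.ts` is reported, attributed to both `tests` and `refactor`.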

### Phase 3: Advanced

- [ ] **Human-in-the-Loop Checkpoints** — Pause orchestration at defined points for user review before proceeding
- [ ] **Team Presets** — User-defined orchestration templates (e.g., "Code Review Team" = Claude analyzes + Copilot suggests fixes + OpenCode runs tests)
- [ ] **Dynamic Re-planning** — Orchestrator can modify the DAG mid-execution based on intermediate results

## 🏗️ Why CodeMux Is Uniquely Positioned

| Existing Infrastructure | How It Enables Agent Team |
|------------------------|--------------------------|
| **EngineManager + multi-session routing** | Natural foundation for parallel session dispatch |
| **WorktreeManager** | Code isolation between parallel agents is already built |
| **Unified Type System** | Results from any engine are already normalized — aggregation is straightforward |
| **WebSocket Gateway** | Real-time progress streaming to all clients (desktop, browser, IM) |
| **IM Channel Adapters** | Agent Team results are accessible from Feishu/DingTalk/Telegram |
| **ScheduledTaskService** | Permission auto-approve pattern is reusable for unattended subtask execution |
| **Permission/Question System** | Supports human-in-the-loop approval during orchestration |

## 🤔 Design Decisions (Resolved)

| Question | Decision |
|----------|----------|
| Orchestrator implementation | `OrchestratorService` as a layer above EngineManager, using LLM for task decomposition |
| Task decomposition strategy | LLM-driven decomposition with user confirmation/editing in UI |
| Scope of first iteration | Jump directly to cross-engine orchestration (same-engine internal parallelism deferred to Agent Team) |
| Inter-session communication | Hub-and-Spoke: Orchestrator extracts upstream results → injects into downstream prompts |
| Execution model | DAG-based with multi-round iteration, not simple parallel-then-aggregate |
| UX approach | Explicit UI (Sidebar section + Dashboard), no slash commands |

## 📎 Related

- Phase 0 PR: #99 (merged)
- Roadmap: #71 (specifically "Engine Duel Mode" and "Subtask Hierarchy Visualization")
- Existing worktree management in `electron/main/services/worktree-manager.ts`
- Engine adapter pattern in `electron/main/engines/`

---

> This feature would make CodeMux the first multi-engine AI coding client with **cross-engine collaborative agent workflows** — a capability that no single-engine tool (Claude Code, Copilot, Cursor, etc.) can offer on its own.
