rexchengm/maestro

Maestro

Stronger hands for your AI: the RPA it can actually drive.

Most RPA tools can't be driven by an AI. You hand-build every step, hand-write the glue code, and the resulting "automation" still can't adapt on its own, as anyone who has used one knows. Maestro flips this: you demonstrate a task once, and from then on an AI takes over. It can optimize the flow, combine several flows, pull out a reusable sub-flow, and compose the pieces into a full automation program. Writing automations stops being manual labor.

Under the hood it's an MCP server, so any MCP-capable AI gets real hands on your computer: it can see the screen, drive your real apps and browser, and build new automations on its own. And because it runs deterministic, pre-recorded or composed workflows rather than an agent improvising clicks live, it stays controllable and repeatable. The smarter the AI gets, the stronger Maestro gets.
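
For a feel of what driving Maestro over MCP looks like on the wire: MCP clients invoke a server's tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds one with the standard library only. The tool name `run_flow` and its `flow` argument are hypothetical, since this README does not list Maestro's actual tool names.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids; any unique value per request works


def tool_call_request(tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Hypothetical tool and argument names, for illustration only.
request = tool_call_request("run_flow", {"flow": "daily-post"})
print(request)
```

In practice an MCP client library handles this framing for you; the sketch only shows that a tool call is a small, inspectable message.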

How it's different

  • vs. traditional RPA — there a human hand-builds and hand-codes every step. Here you record once and AI authors the rest: optimize / combine / extract / compose.
  • vs. free-roaming AI agents — those improvise live and wander off-task. Maestro runs deterministic workflows you or the AI defined — controllable, repeatable, auditable.
  • vs. cloud record-and-replay — those are sandboxed, subscription-gated, region-locked. Maestro is local, open, and drives your real, logged-in apps.

The loop

  1. Demonstrate once. Do the task by hand; Maestro captures it as a flow, anchored on on-screen text/visuals rather than brittle coordinates.
  2. Hand it to an AI. Over MCP the AI can optimize the flow, combine flows, extract a reusable sub-flow, and compose them into a larger automation — no hand-coding.
  3. Run it deterministically. The composed workflow replays exactly, as often as you want.

Every step is editable on an infinite node-graph canvas; reusable blocks and sub-flows snap together like Lego. Nothing is a black box.
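
To make the loop concrete, here is a minimal sketch that treats flows as plain data. The `Step` and `Flow` classes and the `combine` and `extract` helpers are illustrative names, not Maestro's actual schema or API. The point is only that steps anchored on visible text can be combined and sliced like ordinary sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One recorded action, anchored on visible text instead of coordinates."""
    action: str        # e.g. "click" or "type"
    anchor_text: str   # on-screen text that locates the target
    payload: str = ""  # text to type, if any


@dataclass(frozen=True)
class Flow:
    name: str
    steps: tuple[Step, ...] = ()


def combine(name: str, *flows: Flow) -> Flow:
    """Concatenate several flows into one larger flow, preserving order."""
    return Flow(name, tuple(step for flow in flows for step in flow.steps))


def extract(flow: Flow, name: str, start: int, stop: int) -> Flow:
    """Pull steps[start:stop] out of a flow as a reusable sub-flow."""
    return Flow(name, flow.steps[start:stop])


login = Flow("login", (
    Step("click", "Sign in"),
    Step("type", "Email", "me@example.com"),
    Step("click", "Continue"),
))
publish = Flow("publish", (
    Step("click", "New post"),
    Step("click", "Publish"),
))

daily = combine("daily-post", login, publish)        # 3 + 2 steps
open_editor = extract(publish, "open-editor", 0, 1)  # just "New post"
```

Because a flow here is immutable data, replaying it twice walks the same steps in the same order, which is the property the "run it deterministically" step relies on.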

What you can build

The author runs Maestro to:

  • Auto-submit job applications — read each posting, score fit, draft a tailored opener, send on approval.
  • Auto-reply to messages — context-aware replies from per-contact memory, with a whitelist and cooldowns.
  • Analyze content-platform data — OCR the numbers straight off the screen, no API needed.
  • Auto-publish posts — and other repetitive desktop work.

All on one primitive: record once, let AI compose the rest.
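
As an illustration of the auto-reply guardrails mentioned above (a whitelist plus cooldowns), the following sketch shows one way such a gate could be written. `ReplyPolicy`, its fields, and the 300-second default are assumptions made for this example, not Maestro's real implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ReplyPolicy:
    """Hypothetical auto-reply gate: whitelist plus per-contact cooldown."""
    whitelist: set[str] = field(default_factory=set)  # empty until you add contacts
    cooldown_s: float = 300.0                         # assumed default
    _last_reply: dict[str, float] = field(default_factory=dict)

    def may_reply(self, contact: str, is_group: bool, now: float) -> bool:
        if is_group:                       # group and broadcast messages are skipped
            return False
        if contact not in self.whitelist:  # only explicitly allowed contacts
            return False
        last = self._last_reply.get(contact)
        if last is not None and now - last < self.cooldown_s:
            return False                   # still cooling down
        return True

    def record_reply(self, contact: str, now: float) -> None:
        """Remember when we last replied so the cooldown can be enforced."""
        self._last_reply[contact] = now


policy = ReplyPolicy(cooldown_s=60.0)
blocked = policy.may_reply("alice", is_group=False, now=0.0)  # not whitelisted yet
policy.whitelist.add("alice")
allowed = policy.may_reply("alice", is_group=False, now=0.0)
policy.record_reply("alice", now=0.0)
too_soon = policy.may_reply("alice", is_group=False, now=30.0)
```

Passing `now` in explicitly, rather than reading the clock inside the method, keeps the check deterministic and easy to test.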

Powerful, so safety matters

Maestro can fully control your computer and browser. Two things keep that in check:

  • It runs deterministic workflows, not an improvising agent — it does exactly what was recorded/composed, nothing decided on a whim.
  • Human-in-the-loop on anything outward — sends are approval-gated by default, the auto-reply whitelist ships empty, group/broadcast messages are skipped. While a flow runs, keep hands off the keyboard/mouse; emergency-stop by flinging the cursor to the top-right corner or pressing ESC.
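
The two safeguards in the last bullet, approval-gated sends and the top-right-corner emergency stop, can be sketched as small pure functions. The function names, the callback signatures, and the 5-pixel margin are assumptions for illustration; how Maestro actually reads the cursor position or asks for approval is not described here.

```python
from typing import Callable

CORNER_MARGIN_PX = 5  # assumed tolerance around the corner


def in_top_right_corner(x: int, y: int, screen_width: int,
                        margin: int = CORNER_MARGIN_PX) -> bool:
    """True when the cursor is within `margin` pixels of the top-right corner."""
    return x >= screen_width - 1 - margin and y <= margin


def send_if_approved(message: str,
                     approve: Callable[[str], bool],
                     send: Callable[[str], None]) -> bool:
    """Call `send` only when the human-supplied `approve` callback says yes."""
    if not approve(message):
        return False
    send(message)
    return True


sent: list[str] = []
send_if_approved("draft A", approve=lambda m: False, send=sent.append)  # rejected
send_if_approved("draft B", approve=lambda m: True, send=sent.append)   # approved
```

A runner would check `in_top_right_corner` between steps and abort the flow when it returns true, so a stop request never has to wait for the whole flow to finish.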

Run it on your own accounts, and follow the terms of any service you automate.

Under the hood

  • Pure-vision engine — full-screen capture → on-device OCR → mouse & keyboard. No app APIs, no accessibility-tree hacks; if you can see it, Maestro can drive it.
  • MCP-native — the whole engine is exposed as MCP tools, so any compatible AI can see, act, record, and compose.
  • Node-graph + blocks — atomic actions, reusable blocks, and sub-flows compose on an infinite canvas, with conditions and loops.

Quick start

Requires macOS, Python 3.12+, and Node 18+.

# 1) Backend (the engine) — http://127.0.0.1:8000
cd backend
python3 -m venv .venv && .venv/bin/pip install -r requirements.txt
.venv/bin/python -m uvicorn app.main:app --port 8000

# 2) Frontend (the studio), in a second terminal from the repo root: http://127.0.0.1:5173
cd frontend
npm install && npm run dev

Then grant the terminal Screen Recording + Accessibility in System Settings → Privacy & Security, open http://127.0.0.1:5173, and record your first workflow.
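
If the studio page does not load, it can help to confirm that both servers are actually listening. The helper below is a small standard-library sketch, not part of Maestro; it only assumes the two addresses given in the quick start.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from the quick start: backend 8000, frontend 5173.
    for name, port in (("backend", 8000), ("frontend", 5173)):
        state = "up" if is_listening("127.0.0.1", port) else "not reachable"
        print(f"{name} on 127.0.0.1:{port}: {state}")
```

A successful connection only shows that something is bound to the port, not that it is Maestro, so treat it as a first check rather than a health probe.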

License

Apache License 2.0 — free to use, modify, and distribute, including commercially. See LICENSE. Copyright © 2026 rexchengm.

Contributing

Issues and PRs welcome. By contributing, you agree your work is licensed under Apache-2.0.

