Give your AI agent the ability to record its screen and share video via Mux.
An MCP server that creates narrated screen recordings using a two-pass approach, followed by post-production:
- Research pass - Visits each page, analyzes the content, and generates contextual narration using Claude
- Performance pass - Records smooth scroll animations timed to the generated audio
- Post-production - Merges audio with video and uploads to Mux
The narration is based on what the tool actually sees on each page, so the commentary stays relevant to the content on screen.
Use cases:
- Proof of work - agent shows exactly what it did
- Narrated demos - agent explores your product with commentary
- Persona-based reviews - "roast mode", "interested prospect", "caveman", etc.
- Video bug reports - agent reproduces and records issues
- Async handoffs - agent records context for humans
cd mcp-server
npm install
npx playwright install chromium
Create a .env file in the project root:
ANTHROPIC_API_KEY="your-api-key"
ELEVENLABS_API_KEY="your-api-key"
MUX_TOKEN_ID="your-token-id"
MUX_TOKEN_SECRET="your-token-secret"
Get credentials from:
- Anthropic Console > API Keys (for narration generation)
- ElevenLabs > API Keys (for text-to-speech)
- Mux Dashboard > Settings > API Access Tokens (for video hosting)
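A quick way to confirm that all four credentials are set before starting a recording is a small check like the one below. This is a minimal sketch, not part of the project: the helper name `missingEnvVars` is hypothetical, and it assumes the variables from .env have already been loaded into the process environment (how the server loads them is not stated here).

```javascript
// Hypothetical helper: lists which of the required credentials are missing.
// The key names come from the .env example above.
const REQUIRED_KEYS = [
  "ANTHROPIC_API_KEY",
  "ELEVENLABS_API_KEY",
  "MUX_TOKEN_ID",
  "MUX_TOKEN_SECRET",
];

function missingEnvVars(env) {
  // A key counts as missing if it is unset or only whitespace.
  return REQUIRED_KEYS.filter((key) => !env[key] || env[key].trim() === "");
}

// Usage (assumes the environment is already populated):
// const missing = missingEnvVars(process.env);
// if (missing.length > 0) console.warn("Missing credentials:", missing.join(", "));
```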
Add this MCP server to your Claude Code configuration. Edit ~/.claude/settings.json (global) or .claude/settings.local.json (project):
{
  "mcpServers": {
    "narrator": {
      "command": "node",
      "args": ["/path/to/agent-video/mcp-server/index.js"],
      "env": {}
    }
  }
}
Replace /path/to/agent-video with the actual path to this project.
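If you would rather not type the absolute path by hand, a short Node script can print the configuration with the path resolved. This is a sketch under assumptions: `buildMcpConfig` is a hypothetical helper, and it assumes the server entry point lives at mcp-server/index.js inside the project directory, as in the example above.

```javascript
const path = require("node:path");

// Hypothetical helper: builds the settings entry with an absolute path to index.js.
function buildMcpConfig(projectDir) {
  const entry = path.join(path.resolve(projectDir), "mcp-server", "index.js");
  return {
    mcpServers: {
      narrator: { command: "node", args: [entry], env: {} },
    },
  };
}

// Run from the project root and paste the output into your settings file.
console.log(JSON.stringify(buildMcpConfig("."), null, 2));
```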
mkdir -p ~/Movies/agent-recordings
Creates a narrated screen recording of web pages.
- persona (string, required): The character/style for narration. Can be anything you describe:
  - "a sarcastic tech reviewer who's seen it all"
  - "Gordon Ramsay reviewing websites"
  - "a confused grandparent trying to understand the internet"
  - "an overenthusiastic startup founder"
- pages (array, required): Pages to visit
  - url (string, required): The URL to visit
  - narration (string, optional): Custom narration. If omitted, auto-generated based on page content.
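The parameter rules above can be checked before a recording starts. The following is a minimal sketch, not the server's actual validation: `validateInput` is a hypothetical name, and treating an empty pages array as an error is an assumption that the parameter list does not state.

```javascript
// Hypothetical validator for the tool input described above.
// Returns a list of human-readable problems; an empty list means the input looks valid.
function validateInput(input) {
  if (!input || typeof input !== "object") return ["input must be an object"];
  const errors = [];
  if (typeof input.persona !== "string" || input.persona.trim() === "") {
    errors.push("persona must be a non-empty string");
  }
  // Assumption: at least one page is needed for a recording to make sense.
  if (!Array.isArray(input.pages) || input.pages.length === 0) {
    errors.push("pages must be a non-empty array");
    return errors;
  }
  input.pages.forEach((page, i) => {
    if (!page || typeof page.url !== "string") {
      errors.push(`pages[${i}].url must be a string`);
      return;
    }
    try {
      new URL(page.url); // throws on strings that are not absolute URLs
    } catch {
      errors.push(`pages[${i}].url is not a valid URL`);
    }
    if (page.narration !== undefined && typeof page.narration !== "string") {
      errors.push(`pages[${i}].narration must be a string when provided`);
    }
  });
  return errors;
}
```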
Example input with auto-generated narration:
{
  "persona": "a jaded Silicon Valley investor who's seen a thousand pitch decks",
  "pages": [
    { "url": "https://example.com" },
    { "url": "https://example.com/about" },
    { "url": "https://example.com/pricing" }
  ]
}
Example input with custom narration:
{
  "persona": "documentary narrator",
  "pages": [
    {
      "url": "https://example.com",
      "narration": "Here we observe the landing page in its natural habitat."
    }
  ]
}
Example response:
{
  "success": true,
  "playbackUrl": "https://stream.mux.com/abc123",
  "sessionDir": "/Users/.../session-123456",
  "pagesRecorded": 3
}
- Research pass: Opens browser, visits each page, takes snapshots
- Narration generation: Sends snapshots to Claude API to generate contextual narration in the specified persona
- Audio generation: Converts narration to speech via ElevenLabs
- Performance pass: Opens browser again, records smooth scrolling timed to audio duration
- Post-production: Extracts segments, merges audio with precise timing via ffmpeg
- Upload: Sends final video to Mux, returns playback URL
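The timing in the performance and post-production steps can be illustrated with a little arithmetic: each page scrolls for as long as its narration lasts, and each page's audio starts where the previous one ended. The sketch below is an illustration under assumptions, not the project's code. The function names are hypothetical, and the ease-in-out curve is one plausible way to get smooth scrolling.

```javascript
// Ease-in-out curve: maps linear progress 0..1 to eased progress 0..1.
function easeInOut(p) {
  return p < 0.5 ? 2 * p * p : 1 - Math.pow(-2 * p + 2, 2) / 2;
}

// Scroll offset in pixels at time t (seconds), for a page whose narration lasts
// `duration` seconds and whose scrollable distance is `maxScroll` pixels.
function scrollOffsetAt(t, duration, maxScroll) {
  if (duration <= 0) return 0;
  const progress = Math.min(Math.max(t / duration, 0), 1);
  return Math.round(easeInOut(progress) * maxScroll);
}

// Start time (seconds) of each page segment in the final video,
// given the per-page audio durations in seconds.
function segmentStarts(durations) {
  const starts = [];
  let elapsed = 0;
  for (const d of durations) {
    starts.push(elapsed);
    elapsed += d;
  }
  return starts;
}
```

With audio durations of 4, 6.5, and 3 seconds, `segmentStarts` places the three segments at 0, 4, and 10.5 seconds, which is where each narration clip would be merged onto the video.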
For a single recording you watch yourself, Mux isn't necessary. But Mux adds value when:
- Sharing with others - instant playback URL that works everywhere
- Analytics - know if the recipient watched, how much, what they rewatched
- Scale - managing many recordings over time
- Professional delivery - adaptive streaming, works on any device
The agent doesn't care about Mux. But the human receiving the video gets a polished, trackable experience.