Selenium CLI tool #17326

@AutomatedTester

Product Requirements Document: Sel-CLI (v1.2 - 2026 Edition)

Project Focus: Hybrid Rust/TypeScript CLI for Agentic & Industrial-Grade Browser Automation.


1. Competitive Analysis & Gap Summary

  • Playwright (2026): Dominates with the Speedboard Timeline (visualizing slow test bottlenecks) and AI Test Agents (autonomous healer loops).
  • Cypress (2026): Leading in Quality Intelligence (AI identifying untested code gaps) and simplified Studio recording.
  • Selenium (2026): Moving to WebDriver BiDi, allowing two-way communication (streaming console/network events natively).
  • Sel-CLI Advantage: Uses Rust (Selenium Manager core) for raw speed and TypeScript (Oclif) for the best AI-agent integration (MCP support), while supporting legacy and cross-browser grids that competitors ignore.

2. Updated Functional Requirements

Module 1: Infrastructure & Performance (Rust Core)

ID | Feature | Description | Priority
FR-1.1 | Universal Binary | Single Rust binary that replaces the need for local Java or Python environments. | P0
FR-1.2 | Install Browsers and Grid | Downloads browsers and Selenium Grid and prepares them to run. | P0
FR-1.3 | Speedboard Logs | Native profiling of every Selenium command to generate a Timeline of "Time to Interactive" (Playwright parity). | P1
FR-1.4 | BiDi Streamer | Direct pipe to browser console and network events using WebDriver BiDi for real-time debugging. | P1
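
As a rough illustration of FR-1.3, the sketch below turns recorded command timings into a slowest-first timeline. The `CommandSpan` and `TimelineEntry` shapes and the `buildTimeline` name are assumptions made for this example; the PRD does not define a log format, and a real "Time to Interactive" view would need more signals than per-command durations.

```typescript
// Hypothetical record of one Selenium command; field names are assumptions.
interface CommandSpan {
  command: string; // e.g. "get", "findElement", "click"
  startMs: number; // start time relative to session start
  endMs: number;   // end time relative to session start
}

interface TimelineEntry {
  command: string;
  durationMs: number;
  shareOfTotal: number; // fraction of the summed command time, 0..1
}

// Builds a slowest-first timeline so bottlenecks appear at the top.
function buildTimeline(spans: CommandSpan[]): TimelineEntry[] {
  const total = spans.reduce((sum, s) => sum + (s.endMs - s.startMs), 0);
  return spans
    .map((s) => {
      const durationMs = s.endMs - s.startMs;
      return {
        command: s.command,
        durationMs,
        shareOfTotal: total === 0 ? 0 : durationMs / total,
      };
    })
    .sort((a, b) => b.durationMs - a.durationMs);
}
```

Keeping the aggregation a pure function means the Rust core could emit raw spans and any front end (terminal table, dashboard) could reuse the same ordering logic.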

Module 2: Agentic Automation (AI-First)

ID | Feature | Description | Priority
FR-2.1 | MCP Server Support | Integration with Model Context Protocol (MCP) to allow Cursor/Claude to control browsers natively. | P1
FR-2.2 | Healer Loop | If a test fails due to a changed locator, the CLI autonomously attempts to find a new locator and suggests a PR. | P1
FR-2.3 | YAML Snapshots | Converts the DOM to ultra-compact YAML. Targets a 70% reduction in LLM token costs (Firecrawl/Agent-native). | P1
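
To make FR-2.3 more concrete, here is a minimal sketch of a snapshot serializer. It assumes a simplified `SnapNode` tree instead of a live DOM, a hypothetical set of "interactive" roles, and a flat one-line-per-element output modeled on the `button @e1: 'Login'` example used later in this document. It makes no claim about the actual token savings.

```typescript
// Simplified stand-in for a DOM node; a real implementation would walk the live DOM.
interface SnapNode {
  role: string;   // e.g. "button", "textbox", "generic"
  name?: string;  // accessible name or visible text
  children?: SnapNode[];
}

// Roles that receive an @eN reference (an assumption for this sketch).
const INTERACTIVE = new Set(["button", "link", "textbox", "checkbox"]);

// Emits one YAML list item per named or interactive node, in document order.
// Unnamed wrapper nodes are dropped, which is where most of the size reduction comes from.
function toYamlSnapshot(root: SnapNode): string {
  const lines: string[] = [];
  let nextRef = 1;

  const visit = (node: SnapNode): void => {
    const interactive = INTERACTIVE.has(node.role);
    if (interactive || node.name !== undefined) {
      const ref = interactive ? ` @e${nextRef++}` : "";
      // YAML single-quoted scalars escape a quote by doubling it.
      const name = (node.name ?? "").replace(/'/g, "''");
      lines.push(`- ${node.role}${ref}: '${name}'`);
    }
    (node.children ?? []).forEach((child) => visit(child));
  };

  visit(root);
  return lines.join("\n");
}
```

For a page with a "Sign in" heading, an "Email" textbox, and a "Login" button nested inside unnamed wrappers, this produces three lines: `- heading: 'Sign in'`, `- textbox @e1: 'Email'`, and `- button @e2: 'Login'`.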

Module 3: Advanced Observability

ID | Feature | Description | Priority
FR-3.1 | Trace-Travel UI | A visual dashboard where you can scroll through every step of a test and see the DOM state (Time-traveling). | P1
FR-3.2 | Shadow-Piercer | Automatically detects and interacts with Shadow DOMs and Iframes without manual "context switching." | P1
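
One way to picture FR-3.2 is a depth-first search that crosses shadow and iframe boundaries on the caller's behalf and reports which boundaries it crossed. The sketch below runs on a hypothetical in-memory `PierceNode` tree; in a browser the equivalent hooks would be `Element.shadowRoot` (open roots only) and `HTMLIFrameElement.contentDocument`. All names here are illustrative, not a proposed API.

```typescript
// Minimal stand-in for a DOM tree with shadow roots and embedded frames.
interface PierceNode {
  tag: string;
  id?: string;
  children?: PierceNode[];
  shadowRoot?: PierceNode[];    // nodes attached via an open shadow root
  frameDocument?: PierceNode[]; // top-level nodes of an embedded iframe document
}

interface PierceResult {
  node: PierceNode;
  hops: string[]; // boundaries crossed, e.g. ["shadow:my-card", "frame:iframe"]
}

// Depth-first search that treats shadow roots and frames as ordinary subtrees.
function findDeep(
  root: PierceNode,
  id: string,
  hops: string[] = []
): PierceResult | undefined {
  if (root.id === id) return { node: root, hops };
  const branches: Array<[PierceNode[] | undefined, string[]]> = [
    [root.children, hops],
    [root.shadowRoot, [...hops, `shadow:${root.tag}`]],
    [root.frameDocument, [...hops, `frame:${root.tag}`]],
  ];
  for (const [nodes, nextHops] of branches) {
    for (const child of nodes ?? []) {
      const hit = findDeep(child, id, nextHops);
      if (hit) return hit;
    }
  }
  return undefined;
}
```

The recorded `hops` list is what a driver would replay as the shadow-root lookups and frame switches that users currently perform by hand.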

3. Technical Architecture Update

The "Core-to-Shell" Model

  • Kernel (Rust): Extends selenium-manager. Handles browser/driver installation, YAML snapshotting logic, and high-speed BiDi event processing.
  • Shell (TypeScript/Oclif): Handles the command-line interface, user prompts, and formatting JSON/YAML for AI agents.
  • Bridge: Uses a high-performance IPC (Inter-Process Communication) channel or shared-library bindings to keep latency between the TS CLI and the Rust automation core to a minimum.
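
If the Bridge uses IPC over a pipe or socket, one simple wire format is newline-delimited JSON. The sketch below shows the TypeScript side of such a framing layer. The `BridgeRequest` envelope, the method name, and the choice of newline-delimited JSON are all assumptions for illustration; the PRD leaves the transport open, and shared-library bindings would not need framing at all.

```typescript
// Hypothetical request envelope passed from the TS shell to the Rust core.
interface BridgeRequest {
  id: number;
  method: string; // e.g. "session.open"; method names are assumptions
  params: Record<string, unknown>;
}

// One JSON document per line. JSON.stringify escapes newlines inside strings,
// so a raw "\n" in the stream can only be a frame boundary.
function encodeFrame(req: BridgeRequest): string {
  return JSON.stringify(req) + "\n";
}

// Reassembles frames from arbitrary chunk boundaries, as a pipe or socket delivers them.
class FrameDecoder {
  private buffer = "";

  push(chunk: string): BridgeRequest[] {
    this.buffer += chunk;
    const parts = this.buffer.split("\n");
    // The last part is an incomplete frame (or empty); keep it for the next chunk.
    this.buffer = parts.pop() ?? "";
    return parts
      .filter((p) => p.length > 0)
      .map((p) => JSON.parse(p) as BridgeRequest);
  }
}
```

The `id` field lets the shell match responses to requests when several commands are in flight at once.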

4. New "Google Workspace CLI" Style Command Patterns

Inspired by the gws (Google Workspace CLI) patterns of 2026:

  • Command: sel list sessions --params '{"status": "active"}'
  • Command: sel run ./tests --agent --json (Outputs purely machine-readable context for AI).
  • Command: sel install --skills (Installs a specific capability profile like "Mobile Safari" or "Legacy IE11").
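
The `--params` pattern above implies that the shell parses a JSON string and applies it as a filter. A minimal sketch of that step follows, assuming a hypothetical `SessionInfo` shape and a flat object of string values; neither is specified in this PRD.

```typescript
// Hypothetical session record; the PRD does not define this shape.
type SessionInfo = { id: string; browser: string; status: string };

// Parses the value of --params and rejects anything that is not a flat JSON object of strings.
function parseParams(raw: string): Record<string, string> {
  const value: unknown = JSON.parse(raw);
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    throw new Error("--params must be a JSON object");
  }
  const out: Record<string, string> = {};
  for (const [key, v] of Object.entries(value)) {
    if (typeof v !== "string") {
      throw new Error(`--params.${key} must be a string`);
    }
    out[key] = v;
  }
  return out;
}

// Keeps sessions whose fields equal every key/value pair in the filter.
function filterSessions(
  sessions: SessionInfo[],
  filter: Record<string, string>
): SessionInfo[] {
  return sessions.filter((s) => {
    const fields: Record<string, string> = s; // widen for lookup by arbitrary key
    return Object.entries(filter).every(([key, want]) => fields[key] === want);
  });
}
```

Validating the payload before use gives an AI agent a precise error message to react to, instead of a silent empty result.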

5. Deployment & Distribution

  1. Direct Installer: curl -fsSL https://sel.cli.dev/install.sh | bash.
  2. NPM Wrapper: npm install -g @sel-cli/core.
    • Note: The NPM package will detect the OS and download the specific Rust binary for that platform.
  3. Docker-Ready: Native binary supports Alpine/Debian-slim for lightweight CI/CD runners.
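
The NPM wrapper's platform detection could look roughly like the sketch below, which maps the values Node reports in `process.platform` and `process.arch` to Rust target triples. The `sel-<triple>` asset naming and the `resolveBinaryName` helper are assumptions for this example; the supported platform list is also illustrative.

```typescript
// Maps "<process.platform>-<process.arch>" to a Rust target triple.
const TARGETS: Record<string, string> = {
  "linux-x64": "x86_64-unknown-linux-gnu",
  "linux-arm64": "aarch64-unknown-linux-gnu",
  "darwin-x64": "x86_64-apple-darwin",
  "darwin-arm64": "aarch64-apple-darwin",
  "win32-x64": "x86_64-pc-windows-msvc",
};

// Returns a hypothetical release asset name such as "sel-aarch64-apple-darwin".
// `musl` selects the musl build for Alpine, which does not ship glibc.
function resolveBinaryName(platform: string, arch: string, musl = false): string {
  const triple = TARGETS[`${platform}-${arch}`];
  if (triple === undefined) {
    throw new Error(`Unsupported platform: ${platform}-${arch}`);
  }
  const target = musl ? triple.replace(/-gnu$/, "-musl") : triple;
  const extension = platform === "win32" ? ".exe" : "";
  return `sel-${target}${extension}`;
}
```

A postinstall script would call `resolveBinaryName(process.platform, process.arch)` and download the matching asset; failing loudly on an unknown combination is preferable to fetching a binary that cannot run.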

6. User Journey: The AI Agent Flow

  1. AI: "I need to automate the login for this site."
  2. CLI: Agent runs sel open https://example.com --agent.
  3. CLI: Returns a YAML Snapshot (e.g., button @e1: 'Login').
  4. AI: Commands sel click @e1.
  5. CLI: Executes via Rust-core, verifies success via BiDi event, and returns a confirmation.
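
Steps 3 to 5 depend on the CLI remembering which `@eN` reference stands for which element. A minimal sketch of that bookkeeping follows. `RefRegistry`, `AgentResult`, and the `perform` callback (standing in for the Rust core executing the click and confirming it through a BiDi event) are hypothetical names introduced for this example.

```typescript
// Hypothetical result envelope returned to the agent after each command.
type AgentResult =
  | { ok: true; action: string; ref: string; locator: string }
  | { ok: false; error: string };

// Tracks which @eN ref stands for which locator in the latest snapshot.
class RefRegistry {
  private readonly locators = new Map<string, string>();

  // Called while building a snapshot: registers a locator and returns its ref.
  assign(locator: string): string {
    const ref = `@e${this.locators.size + 1}`;
    this.locators.set(ref, locator);
    return ref;
  }

  // Called for `sel click @e1`: resolves the ref, runs the action, and reports back.
  // `perform` returns true only when the click was confirmed.
  click(ref: string, perform: (locator: string) => boolean): AgentResult {
    const locator = this.locators.get(ref);
    if (locator === undefined) {
      return { ok: false, error: `Unknown ref ${ref}; request a fresh snapshot` };
    }
    return perform(locator)
      ? { ok: true, action: "click", ref, locator }
      : { ok: false, error: `Click on ${ref} was not confirmed` };
  }
}
```

Returning a structured error for a stale or unknown ref gives the agent an explicit cue to request a new snapshot instead of retrying blindly.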
