Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
270 changes: 270 additions & 0 deletions .github/workflows/visual-tests.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,270 @@
name: Visual Tests

on:
# Manual trigger with options
workflow_dispatch:
inputs:
run_claude_validation:
description: 'Run Claude vision validation'
required: false
default: true
type: boolean
python_version:
description: 'Python version'
required: false
default: '3.11'
type: string

# Also run on PRs that touch overlay code
pull_request:
paths:
- 'sage/overlay.py'
- 'scripts/visual_test_overlay.py'
- 'scripts/validate_screenshots.py'
- 'tests/**/test_overlay*.py'

jobs:
visual-test:
name: Capture & Validate Screenshots
runs-on: ubuntu-latest

steps:
- name: Checkout code
uses: actions/checkout@v4

- name: Set up Python ${{ inputs.python_version || '3.11' }}
uses: actions/setup-python@v5
with:
python-version: ${{ inputs.python_version || '3.11' }}
cache: 'pip'

- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y \
libdbus-1-dev \
libxcb-cursor0 \
libxcb-icccm4 \
libxcb-image0 \
libxcb-keysyms1 \
libxcb-randr0 \
libxcb-render-util0 \
libxcb-shape0 \
libxcb-xinerama0 \
libxcb-xfixes0 \
libxkbcommon-x11-0 \
x11-utils \
xvfb \
libegl1 \
libgl1 \
libglib2.0-0 \
scrot \
imagemagick

- name: Install Python dependencies
run: |
python -m pip install --upgrade pip
# Install without dbus extra to avoid build issues in CI
pip install pydantic pyyaml PySide6 watchdog
pip install pytest pytest-cov pytest-qt ruff mypy types-PyYAML
pip install anthropic
pip install -e . --no-deps

- name: Run visual tests (capture screenshots)
id: capture
run: |
# Create screenshots directory
mkdir -p screenshots

# Run visual test script under xvfb with xcb platform
xvfb-run -a --server-args="-screen 0 1920x1080x24" \
python scripts/visual_test_overlay.py

# List captured screenshots
echo "Captured screenshots:"
ls -la screenshots/

# Count screenshots for summary
SCREENSHOT_COUNT=$(ls -1 screenshots/overlay_test_*.png 2>/dev/null | wc -l)
echo "screenshot_count=$SCREENSHOT_COUNT" >> $GITHUB_OUTPUT

# Require all 5 screenshots
if [ "$SCREENSHOT_COUNT" -lt 5 ]; then
echo "::error::Expected 5 screenshots, only captured $SCREENSHOT_COUNT"
exit 1
fi
env:
QT_QPA_PLATFORM: xcb
DISPLAY: ':99'

- name: Generate screenshot montage
if: success()
run: |
# Create a montage of all screenshots for easy review
if ls screenshots/overlay_test_*.png 1> /dev/null 2>&1; then
montage screenshots/overlay_test_*.png -tile 2x3 -geometry +5+5 \
-background '#1a1a1a' -title 'Overlay Visual Tests' \
screenshots/montage.png || echo "Montage creation skipped"
fi

- name: Validate screenshots with Claude
id: validate
if: ${{ (github.event_name == 'workflow_dispatch' && inputs.run_claude_validation == true) || (github.event_name == 'pull_request' && secrets.ANTHROPIC_API_KEY != '') }}
Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition checking secrets.ANTHROPIC_API_KEY != '' will always evaluate to false when the secret doesn't exist, as GitHub Actions treats non-existent secrets as empty strings. However, the expression secrets.ANTHROPIC_API_KEY != '' in a conditional context actually checks if the secret has a non-empty value, which is the intended behavior. This is correct, but for clarity and better practice, consider using secrets.ANTHROPIC_API_KEY alone, which evaluates to true if the secret exists and has a value.

Suggested change
if: ${{ (github.event_name == 'workflow_dispatch' && inputs.run_claude_validation == true) || (github.event_name == 'pull_request' && secrets.ANTHROPIC_API_KEY != '') }}
if: ${{ (github.event_name == 'workflow_dispatch' && inputs.run_claude_validation == true) || (github.event_name == 'pull_request' && secrets.ANTHROPIC_API_KEY) }}

Copilot uses AI. Check for mistakes.
run: |
Comment on lines +110 to +113
Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation step’s if: condition relies on inputs.run_claude_validation == '' to mean “default enabled”, but on pull_request runs inputs.* is not provided (it evaluates as null/undefined, not an empty string), so this condition can evaluate to false and skip validation even when the API key is configured. Consider making the default explicit for non-workflow_dispatch events (e.g., github.event_name != 'workflow_dispatch' || inputs.run_claude_validation) and check the secret directly (secrets.ANTHROPIC_API_KEY != '') instead of piping through env.

Copilot uses AI. Check for mistakes.
Comment on lines +110 to +113
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

2. Validation skipped on prs 🐞 Bug ✓ Correctness

The Claude validation step is gated on inputs.run_claude_validation, but inputs is only defined
for workflow_dispatch, so on pull_request runs the condition evaluates false and validation is
skipped even when an API key is configured.
Agent Prompt
### Issue description
Claude validation is unintentionally skipped on `pull_request` runs because the step condition depends on `inputs.run_claude_validation`, which is only populated for `workflow_dispatch`. This defeats the stated goal of running visual validation on PRs.

### Issue Context
- The workflow triggers on `pull_request`.
- The validation step’s `if:` condition uses `inputs.run_claude_validation == true || inputs.run_claude_validation == ''`.

### Fix Focus Areas
- .github/workflows/visual-tests.yml[3-25]
- .github/workflows/visual-tests.yml[100-130]

### Suggested change (direction)
- Update the validation step `if:` to something like:
  - `if: ${{ (github.event_name != 'workflow_dispatch' || inputs.run_claude_validation) && secrets.ANTHROPIC_API_KEY != '' }}`
- Update the “Skip validation notice” step condition to mirror the new logic (skip when validation disabled on dispatch OR when the secret is missing).
- Prefer checking `secrets.ANTHROPIC_API_KEY != ''` in the `if:` rather than relying on step-local env variables.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

echo "Running Claude vision validation..."

# Run validation and capture exit code
set +e
python scripts/validate_screenshots.py screenshots/ \
--output screenshots/validation_report.json
VALIDATION_EXIT_CODE=$?
set -e

# Store result for later steps
if [ $VALIDATION_EXIT_CODE -eq 0 ]; then
echo "validation_passed=true" >> $GITHUB_OUTPUT
echo "✅ All visual validations passed!"
elif [ $VALIDATION_EXIT_CODE -eq 1 ]; then
echo "validation_passed=false" >> $GITHUB_OUTPUT
echo "❌ Some visual validations failed"
else
echo "validation_passed=error" >> $GITHUB_OUTPUT
echo "⚠️ Validation encountered an error"
fi

exit 0 # Don't fail here, we'll check in a later step
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}

- name: Skip validation notice
if: ${{ !((github.event_name == 'workflow_dispatch' && inputs.run_claude_validation == true) || (github.event_name == 'pull_request' && secrets.ANTHROPIC_API_KEY != '')) }}
run: |
Comment on lines +110 to +141
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

3. Has_api_key used in if 🐞 Bug ⛯ Reliability

The workflow’s if: conditions reference env.HAS_API_KEY, but HAS_API_KEY is only defined
inside step env: blocks; this is fragile and can cause unexpected skipping depending on evaluation
timing. Gate directly on secrets.ANTHROPIC_API_KEY != '' or set HAS_API_KEY at job level.
Agent Prompt
### Issue description
`env.HAS_API_KEY` is referenced in `if:` expressions, but `HAS_API_KEY` is only defined inside the same step’s `env:` block. This is fragile and can lead to the validation step unexpectedly skipping.

### Issue Context
The intention is: run validation only when an API key exists (and optionally when enabled by workflow input).

### Fix Focus Areas
- .github/workflows/visual-tests.yml[100-131]

### Suggested change (direction)
- Prefer:
  - `if: ${{ secrets.ANTHROPIC_API_KEY != '' && (github.event_name != 'workflow_dispatch' || inputs.run_claude_validation) }}`
- If you still want `HAS_API_KEY`, define it at the job level (`jobs.visual-test.env`) so it is available in `if:` contexts consistently.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

echo "⚠️ Claude validation skipped"
echo ""
echo "To enable automated visual validation:"
echo "1. Go to Settings → Secrets and variables → Actions"
echo "2. Click 'New repository secret'"
echo "3. Name: ANTHROPIC_API_KEY"
echo "4. Value: Your Anthropic API key"

- name: Upload screenshots
if: always()
uses: actions/upload-artifact@v4
with:
name: overlay-screenshots-${{ github.sha }}
Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The artifact name uses ${{ github.sha }} which will create a very long artifact name (e.g., "overlay-screenshots-abc123def456..."). GitHub Actions allows artifact names up to 256 characters, so this should work, but consider using a shorter suffix like ${{ github.run_number }} or ${{ github.run_id }} for better readability in the artifacts list, or simply "overlay-screenshots" if uniqueness per run isn't critical (newer uploads will overwrite older ones with the same name).

Suggested change
name: overlay-screenshots-${{ github.sha }}
name: overlay-screenshots-${{ github.run_number }}

Copilot uses AI. Check for mistakes.
path: screenshots/
retention-days: 30

- name: Generate summary
if: always()
run: |
echo "## 🖼️ Visual Test Results" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY

# Screenshot capture results
echo "### 📸 Screenshot Capture" >> $GITHUB_STEP_SUMMARY
echo "**Screenshots captured:** ${{ steps.capture.outputs.screenshot_count }}" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY

# Validation results
echo "### 🤖 Claude Vision Validation" >> $GITHUB_STEP_SUMMARY
if [ -f screenshots/validation_report.json ]; then
# Parse and display results nicely
PASSED=$(cat screenshots/validation_report.json | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['summary']['passed'])")
FAILED=$(cat screenshots/validation_report.json | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['summary']['failed'])")
RATE=$(cat screenshots/validation_report.json | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['summary']['pass_rate'])")

if [ "$FAILED" = "0" ]; then
echo "✅ **All validations passed!**" >> $GITHUB_STEP_SUMMARY
else
echo "❌ **Some validations failed**" >> $GITHUB_STEP_SUMMARY
fi
echo "" >> $GITHUB_STEP_SUMMARY
echo "| Metric | Value |" >> $GITHUB_STEP_SUMMARY
echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY
echo "| Passed | $PASSED |" >> $GITHUB_STEP_SUMMARY
echo "| Failed | $FAILED |" >> $GITHUB_STEP_SUMMARY
echo "| Pass Rate | $RATE |" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY

echo "<details><summary>Full Validation Report</summary>" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo '```json' >> $GITHUB_STEP_SUMMARY
cat screenshots/validation_report.json >> $GITHUB_STEP_SUMMARY
echo '```' >> $GITHUB_STEP_SUMMARY
echo "</details>" >> $GITHUB_STEP_SUMMARY
else
echo "⚠️ Validation was skipped (ANTHROPIC_API_KEY not configured)" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "To enable:" >> $GITHUB_STEP_SUMMARY
echo "1. Go to **Settings → Secrets → Actions**" >> $GITHUB_STEP_SUMMARY
echo "2. Add secret: \`ANTHROPIC_API_KEY\`" >> $GITHUB_STEP_SUMMARY
fi
echo "" >> $GITHUB_STEP_SUMMARY

# Test scenarios table
echo "### 📋 Test Scenarios" >> $GITHUB_STEP_SUMMARY
echo "| Test | Description |" >> $GITHUB_STEP_SUMMARY
echo "|------|-------------|" >> $GITHUB_STEP_SUMMARY
echo "| 01_empty | Empty overlay (no suggestions) |" >> $GITHUB_STEP_SUMMARY
echo "| 02_suggestions | Demo suggestions displayed |" >> $GITHUB_STEP_SUMMARY
echo "| 03_single | Single suggestion chip |" >> $GITHUB_STEP_SUMMARY
echo "| 04_max_three | Maximum 3 suggestions (from 4 input) |" >> $GITHUB_STEP_SUMMARY
echo "| 05_cleared | Suggestions cleared |" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "📦 Download the **overlay-screenshots** artifact to review images manually." >> $GITHUB_STEP_SUMMARY

- name: Check validation result
if: ${{ steps.validate.outputs.validation_passed == 'false' }}
run: |
echo "❌ Claude validation found issues with the screenshots"
echo "Check the validation report in the artifacts for details"
exit 1

# Run standard overlay unit tests
overlay-unit-tests:
name: Overlay Unit Tests
runs-on: ubuntu-latest

steps:
- name: Checkout code
uses: actions/checkout@v4

- name: Set up Python 3.11
uses: actions/setup-python@v5
with:
python-version: '3.11'
cache: 'pip'

- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y \
libdbus-1-dev \
libxcb-cursor0 \
libxcb-icccm4 \
libxcb-image0 \
libxcb-keysyms1 \
libxcb-randr0 \
libxcb-render-util0 \
libxcb-shape0 \
libxcb-xinerama0 \
libxcb-xfixes0 \
libxkbcommon-x11-0 \
xvfb \
libegl1 \
libgl1

- name: Install Python dependencies
run: |
python -m pip install --upgrade pip
# Install without dbus extra to avoid build issues
pip install pydantic pyyaml PySide6 watchdog
pip install pytest pytest-cov pytest-qt ruff mypy types-PyYAML
pip install -e . --no-deps

- name: Run overlay tests
run: |
xvfb-run -a pytest tests/unit/test_overlay.py tests/e2e/test_overlay_signal.py -v --no-cov
env:
QT_QPA_PLATFORM: xcb
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -62,3 +62,6 @@ config/*.local.yaml

# DBus temp files
*.pyc

# Screenshots (generated by visual tests)
screenshots/
49 changes: 48 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
> Context-aware keyboard shortcut suggestions for KDE Plasma (Wayland)

[![CI](https://github.com/Coldaine/ShortcutSage/actions/workflows/ci.yml/badge.svg)](https://github.com/Coldaine/ShortcutSage/actions/workflows/ci.yml)
[![Visual Tests](https://github.com/Coldaine/ShortcutSage/actions/workflows/visual-tests.yml/badge.svg)](https://github.com/Coldaine/ShortcutSage/actions/workflows/visual-tests.yml)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![Code style: ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)

Expand Down Expand Up @@ -107,9 +108,48 @@ shortcut-sage overlay --demo
- Listens for DBus `Suggestions` signals from the daemon; `--demo` fills placeholder data without DBus
- Honors `Qt.WindowDoesNotAcceptFocus` so it never steals focus while you work

## Automated Visual Testing

The overlay UI is validated using automated screenshot testing with Claude vision:

- **GitHub Actions**: Captures screenshots under xvfb in CI
- **Claude Vision**: Validates screenshots against specific criteria
- **5 test scenarios**: Empty, 2 suggestions, single, max 3 (truncation), cleared
- **Artifacts**: Screenshots available for manual review

See [docs/plans/visual-test-checklist.md](docs/plans/visual-test-checklist.md) for details.

## Development

### Running Tests
### Quick Commands (using justfile)

```bash
# Setup environment
just setup

# Run all tests
just test

# Run tests without Qt/DBus (headless CI compatible)
just test-headless

# Run visual tests (requires graphical environment)
just test-visual

# Lint and format
just lint
just format

# Simulate CI locally
just ci

# Run daemon/overlay
just daemon
just overlay
just demo
```

### Running Tests Manually

```bash
# All tests with coverage
Expand All @@ -123,6 +163,13 @@ pytest tests/integration

# End-to-end tests (requires KDE)
pytest tests/e2e

# Visual tests (captures screenshots)
python scripts/visual_test_overlay.py

# Validate screenshots with Claude
export ANTHROPIC_API_KEY='your-key'
python scripts/validate_screenshots.py screenshots/
```

### Code Quality
Expand Down
Loading
Loading