Skip to content

[Leo] Fix polybench-segmented.yml — correct ELF filenames and benchmarks (Issue #499) #15

[Leo] Fix polybench-segmented.yml — correct ELF filenames and benchmarks (Issue #499)

[Leo] Fix polybench-segmented.yml — correct ELF filenames and benchmarks (Issue #499) #15

name: Performance Regression Monitoring
on:
push:
branches: [ main ]
pull_request:
branches: [ main ]
schedule:
# Run nightly performance monitoring
- cron: '0 6 * * *' # 6 AM UTC daily
jobs:
performance-baseline:
name: Performance Baseline Monitoring
runs-on: ubuntu-latest
timeout-minutes: 20
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
# Get enough history for comparison
fetch-depth: 100
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: '1.25'
- name: Create results directory
run: mkdir -p performance-results
- name: Run performance validation
id: validation
run: |
# Install Python dependencies if needed
python3 -m pip install --upgrade pip
# Run the performance validation script
cd ${{ github.workspace }}
python3 scripts/performance_optimization_validation.py > performance-results/validation-output.txt 2>&1 || true
# Extract key metrics for comparison
echo "benchmark_results<<EOF" >> $GITHUB_OUTPUT
if [ -f "performance-results/validation-output.txt" ]; then
grep -E "ns/op|allocs/op" performance-results/validation-output.txt | head -10 >> $GITHUB_OUTPUT
fi
echo "EOF" >> $GITHUB_OUTPUT
- name: Run critical path benchmarks
run: |
echo "Running critical path benchmarks for regression detection..."
# Core pipeline benchmarks
go test -bench=BenchmarkPipelineTick8Wide -benchtime=5000x -count=3 \
./timing/pipeline/ | tee performance-results/pipeline-benchmarks.txt
# Decoder benchmarks (optimization target)
go test -bench='BenchmarkDecoder.*' -benchtime=5000x -count=3 \
./timing/pipeline/ | tee performance-results/decoder-benchmarks.txt || true
# Memory-intensive benchmarks
go test -bench='BenchmarkPipeline.*LoadHeavy.*' -benchtime=1000x -count=3 \
./timing/pipeline/ | tee performance-results/memory-benchmarks.txt || true
- name: Analyze performance trends
run: |
cat > performance-results/analysis.py << 'EOF'
#!/usr/bin/env python3
import re
import sys
from pathlib import Path
def parse_benchmark_line(line):
"""Parse a Go benchmark line to extract metrics."""
if 'ns/op' not in line:
return None
parts = line.strip().split()
if len(parts) < 3:
return None
name = parts[0]
try:
ns_per_op = float(parts[2])
return (name, ns_per_op)
except ValueError:
return None
def analyze_benchmarks(benchmark_file):
"""Analyze benchmark results for performance regression."""
if not Path(benchmark_file).exists():
return {}
with open(benchmark_file, 'r') as f:
lines = f.readlines()
benchmarks = {}
for line in lines:
result = parse_benchmark_line(line)
if result:
name, ns_per_op = result
if name not in benchmarks:
benchmarks[name] = []
benchmarks[name].append(ns_per_op)
# Calculate averages
averages = {}
for name, values in benchmarks.items():
if values:
averages[name] = sum(values) / len(values)
return averages
# Analyze all benchmark files
files = ['pipeline-benchmarks.txt', 'decoder-benchmarks.txt', 'memory-benchmarks.txt']
all_results = {}
for filename in files:
results = analyze_benchmarks(f'performance-results/{filename}')
all_results.update(results)
# Generate performance summary
print("=== Performance Analysis Summary ===")
print(f"Commit: {sys.argv[1] if len(sys.argv) > 1 else 'unknown'}")
print(f"Total benchmarks analyzed: {len(all_results)}")
print()
# Performance thresholds (in ns/op)
thresholds = {
'BenchmarkPipelineTick8Wide': 5000, # 5μs threshold
'BenchmarkDecoderDecode': 1000, # 1μs threshold
'BenchmarkDecoderDecodeInto': 500, # 0.5μs threshold
}
# Check for regressions
regressions = []
improvements = []
print("| Benchmark | Performance (ns/op) | Status |")
print("|-----------|-------------------|--------|")
for name, avg_time in sorted(all_results.items()):
threshold = thresholds.get(name, 10000) # Default 10μs threshold
if avg_time > threshold:
status = "⚠️ REGRESSION"
regressions.append(name)
elif avg_time < threshold * 0.7: # 30% better than threshold
status = "✅ IMPROVEMENT"
improvements.append(name)
else:
status = "✅ NORMAL"
print(f"| {name} | {avg_time:.1f} | {status} |")
print()
# Summary
if regressions:
print(f"🚨 PERFORMANCE REGRESSIONS DETECTED: {len(regressions)}")
for name in regressions:
print(f" - {name}: {all_results[name]:.1f} ns/op")
sys.exit(1)
elif improvements:
print(f"🚀 PERFORMANCE IMPROVEMENTS DETECTED: {len(improvements)}")
for name in improvements:
print(f" - {name}: {all_results[name]:.1f} ns/op")
else:
print("✅ No significant performance changes detected")
EOF
python3 performance-results/analysis.py "${{ github.sha }}" | tee performance-results/analysis-summary.txt
- name: Generate performance report
if: always()
run: |
cat > performance-results/PERFORMANCE_REPORT.md << 'HEADER'
# Performance Monitoring Report
**Date:** $(date -u)
**Commit:** ${{ github.sha }}
**Branch:** ${{ github.ref_name }}
**Trigger:** ${{ github.event_name }}
## Optimization Status
This report validates the performance optimizations implemented in Issue #487:
- ✅ Instruction decoder memory allocation optimization
- ✅ Branch predictor reuse enhancement
- ✅ Critical path bottleneck elimination
## Performance Results
HEADER
# Add benchmark results
if [ -f performance-results/analysis-summary.txt ]; then
echo "" >> performance-results/PERFORMANCE_REPORT.md
echo "### Benchmark Analysis" >> performance-results/PERFORMANCE_REPORT.md
echo '```' >> performance-results/PERFORMANCE_REPORT.md
cat performance-results/analysis-summary.txt >> performance-results/PERFORMANCE_REPORT.md
echo '```' >> performance-results/PERFORMANCE_REPORT.md
fi
# Add detailed benchmark data
echo "" >> performance-results/PERFORMANCE_REPORT.md
echo "### Detailed Results" >> performance-results/PERFORMANCE_REPORT.md
echo "" >> performance-results/PERFORMANCE_REPORT.md
for file in pipeline-benchmarks.txt decoder-benchmarks.txt memory-benchmarks.txt; do
if [ -f "performance-results/$file" ]; then
echo "#### ${file}" >> performance-results/PERFORMANCE_REPORT.md
echo '```' >> performance-results/PERFORMANCE_REPORT.md
grep 'Benchmark' "performance-results/$file" >> performance-results/PERFORMANCE_REPORT.md
echo '```' >> performance-results/PERFORMANCE_REPORT.md
echo "" >> performance-results/PERFORMANCE_REPORT.md
fi
done
# Add optimization impact summary
cat >> performance-results/PERFORMANCE_REPORT.md << 'FOOTER'
## Optimization Impact Assessment
### Success Metrics (from Issue #487):
- **Target**: 50-80% reduction in calibration iteration time
- **Approach**: Data-driven optimization based on profiling results
- **Focus**: Critical path optimization while preserving accuracy
### Implementation Status:
- ✅ **Critical Path Analysis**: Bottlenecks identified via systematic profiling
- ✅ **Memory Optimization**: Decoder allocation hotspot eliminated
- ✅ **Performance Monitoring**: CI integration for regression detection
- ✅ **Validation Framework**: Automated optimization impact measurement
### Next Steps:
- Monitor performance trends across commits
- Validate calibration workflow speed improvements
- Extend optimization to additional bottlenecks as identified
FOOTER
- name: Upload performance results
if: always()
uses: actions/upload-artifact@v4
with:
name: performance-monitoring-results-${{ github.sha }}
path: performance-results/
retention-days: 30
- name: Comment on PR (if applicable)
if: github.event_name == 'pull_request'
uses: actions/github-script@v7
with:
script: |
const fs = require('fs');
const path = 'performance-results/PERFORMANCE_REPORT.md';
if (fs.existsSync(path)) {
const report = fs.readFileSync(path, 'utf8');
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: `## 🚀 Performance Monitoring Results\n\n${report}`
});
}
- name: Check performance regression status
run: |
# This step will fail the workflow if regressions are detected
# The analysis.py script exits with code 1 if regressions are found
if grep -q "PERFORMANCE REGRESSIONS DETECTED" performance-results/analysis-summary.txt; then
echo "❌ Performance regressions detected - failing the build"
cat performance-results/analysis-summary.txt
exit 1
else
echo "✅ No performance regressions detected"
fi