Conventions for agent loops running in this workspace.
Mirrors the style of ffc-agent-workspace/AGENTS.md and redis-agent-workspace/AGENTS.md.
Push libevent's hot paths beyond the current baseline: the evbuffer data plane
(buffer.c), socket I/O batching (evbuffer_read/evbuffer_write_atmost), and
event-loop dispatch / epoll_ctl churn (event.c, evmap.c, epoll.c).
This is a pure OSS effort against upstream libevent. There is no downstream-consumer
patch in scope — work against redis-performance/libevent (a fork of OSS upstream).
Every code change must pass both steps before being accepted:
-
Benchmark — run
scripts/run-bench.sh; compare events/sec, ns/op, syscall count- Must improve at least one workload without regressing others
- Never report MB/s — libevent is an event loop, not a parser
- Results go in the experiment log entry in
experiments/EXPERIMENTS.md
-
Profile — run
scripts/run-profile.sh(4-phase); compare hot symbols- Confirm the expected bottleneck shifted
- Capture key
perf reportlines + the cycle-budget kernel% share
An approach that improves benchmark numbers but reveals a new surprising bottleneck is a partial win — document the new bottleneck and continue.
Single env var AGENT selects the backend:
AGENT=claude(default) — usesclaudeCLI in non-interactive modeAGENT=codex(planned)AGENT=aider(planned)
Skills under .claude/skills/*.md are plain markdown prompts.
All memory files live in .workspace-memory/ so every agent backend shares context.
MEMORY.md is the index; one file per memory entry.
When running autonomously: commit any .workspace-memory/ updates back to main
in the same commit as the experiment results. Git log is the audit trail.
- Edit
libevent/*.c/libevent/*.h— never editlibevent/test/bench*.c(the benchmark harness is immutable — never modify it to make libevent look better) - Each implementer variant runs in its own git worktree with an isolated CMake build
dir — never let 3 variants build in the same tree (filesystem races,
-marchcache corruption) - Always run
scripts/verify-correctness.shbefore logging a benchmark result — correctness first, no exceptions. A bad edit can deadlock the loop; per-test timeouts guard it. - Log every experiment in
experiments/EXPERIMENTS.md— failures are valuable - Keep
experiments/SUMMARY.mdandREADME.mdcounts in sync after each decision - Accept → commit to the
libeventsubmodule (tip = best accepted state). Reject → discard the worktree. Never force-push tomain. - After a permanent dead end: add to "Known Non-Starters" in
.claude/program.md
This workspace runs entirely locally — no remote runners required.
clang/gcc,cmake≥ 3.5,makeperf(Linux kernel tools) —sudo apt install linux-tools-generictaskset(util-linux) for core pinningpython3≥ 3.10 (forllm-call.pytoken accounting)claudeCLI —npm i -g @anthropic-ai/claude-code
For perf counter access (IPC, branch mispredicts, cache misses, syscall tracepoints):
echo -1 | sudo tee /proc/sys/kernel/perf_event_paranoidNone — this workspace is fully OSS and runs locally.
Set ANTHROPIC_API_KEY or use CLAUDE_CODE_OAUTH_TOKEN for the claude CLI / llm-call.py.
- The benchmark uses libevent's shipped OSS benchmarks (
bench,bench_cascade,bench_http), built via CMake with-DEVENT__DISABLE_BENCHMARK=OFF - Build with
RelWithDebInfo -march=native -fno-omit-frame-pointerso perf has frames sudois needed forperfhardware counters and syscall tracepoints- Pin to fixed cores (
taskset) and take the median of ≥ 3 runs — event-loop benchmarks are noisier than parser benchmarks; kernel scheduling and frequency scaling add variance - Validate every delta with two runs:
perf-on for direction,perf-off for the true number