Fix MySQL profiling issues: double delays, stale counters, and custom sync API #271
Merged
emeryberger merged 1 commit into master on Feb 25, 2026
Address four issues reported when profiling MySQL after the LIEF/macOS port:
1a. Guard progress-point delay check with __APPLE__. On Linux, delays are
already applied in the SIGPROF handler; calling add_delays() again at
every COZ_PROGRESS hit caused double application and TPS collapse.
1b. Remove per-experiment local_delay sync block added in the macOS port.
Forcing all threads to the same baseline caused a thundering herd of
nanosleep calls under high concurrency. The existing cool-off period
between experiments already naturally syncs threads via add_delays().
4. Call process_samples() (not just add_delays()) in catch_up() and
post_block() on Linux. Samples accumulate in per-thread perf_event
buffers between 10ms timer signals; processing them ensures delay
counters are current before unblocking other threads (BCOZ fix).
3. Add COZ_PRE_BLOCK, COZ_CATCH_UP, COZ_POST_BLOCK(skip_delays) macros
and corresponding _coz_pre_block/_coz_post_block exports for programs
using custom synchronization not intercepted by Coz (e.g., MySQL
mutexes, RocksDB internal locks).
2. Add linear regression slope and R-squared columns to `coz plot --text`
output to help users assess result reliability and optimization impact.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
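
As a rough illustration of item 3, the sketch below shows how an application-level lock might call the new hooks. Only the macro names come from this PR. Everything else is an assumption: the stand-in macro definitions merely count calls so the example compiles without Coz (a real build would include coz.h instead), `SpinLock` and `run_guarded_increments` are hypothetical names, and the placement of the hooks follows what Coz's own pthread mutex wrappers are understood to do (pre-block before waiting, post-block with delays skipped after being woken, catch-up before releasing).

```cpp
#include <atomic>
#include <cassert>

// Stand-ins so the sketch builds without Coz. In a real program, delete these
// and include coz.h. Their shape (two object-like macros, one taking
// skip_delays) follows the spelling in the PR and is otherwise an assumption.
static int g_pre_block_calls = 0;
static int g_catch_up_calls = 0;
static int g_post_block_calls = 0;
#define COZ_PRE_BLOCK (++g_pre_block_calls)
#define COZ_CATCH_UP (++g_catch_up_calls)
#define COZ_POST_BLOCK(skip_delays) ((void)(skip_delays), ++g_post_block_calls)

// Hypothetical custom lock that Coz cannot intercept on its own.
class SpinLock {
 public:
  void lock() {
    COZ_PRE_BLOCK;  // about to wait on something the profiler cannot see
    while (flag_.test_and_set(std::memory_order_acquire)) {
      // spin until the holder releases the lock
    }
    COZ_POST_BLOCK(1);  // woken by another thread: skip delays added while waiting
  }

  void unlock() {
    COZ_CATCH_UP;  // settle outstanding delays before unblocking a waiter
    flag_.clear(std::memory_order_release);
  }

 private:
  std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Takes and releases the lock n times around a trivial critical section.
int run_guarded_increments(int n) {
  assert(n >= 0);
  SpinLock lock;
  int counter = 0;
  for (int i = 0; i < n; ++i) {
    lock.lock();
    ++counter;
    lock.unlock();
  }
  return counter;
}
```

Each acquire/release pair issues one call to each hook. Whether `skip_delays` should be nonzero for a given wait (for example, a timed wait that expires on its own) depends on who woke the thread, so treat the constant `1` here as specific to this lock.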
Summary
Addresses four issues reported when profiling MySQL with Coz after the LIEF/macOS port (PR #265):
- Fix double delay application on Linux (Issue 1a): Guard the progress-point `_call_coz_add_delays()` call with `__APPLE__`. On Linux, delays are already applied in the SIGPROF handler; the unconditional call at every `COZ_PROGRESS` hit caused double application and TPS collapse under high concurrency.
- Remove per-experiment delay sync block (Issue 1b): The forced sync of all threads' `local_delay` to `global_delay` before each experiment (added in the macOS port) caused a thundering herd of nanosleep calls. The existing cool-off period between experiments already naturally syncs threads.
- Process pending samples in `catch_up()` and `post_block()` on Linux (Issue 4, BCOZ fix): Samples accumulate in per-thread perf_event buffers between 10ms timer signals. Now calls `process_samples()` (not just `add_delays()`) to ensure delay counters are current before unblocking other threads.
- Add custom synchronization API (Issue 3): New `COZ_PRE_BLOCK`, `COZ_CATCH_UP`, `COZ_POST_BLOCK(skip_delays)` macros for programs using synchronization not intercepted by Coz (e.g., MySQL mutexes, RocksDB internal locks).
- Add slope and R² to `coz plot --text` (Issue 2): Linear regression slope and R-squared columns help users assess result reliability.

Test plan

- `cmake . && make`: clean build on macOS (no new warnings)
- `coz run --- ./benchmarks/toy/toy`: profile output produced with experiments, new Slope/R² columns display correctly
- `coz run --- ./benchmarks/lock_test/lock_test`: mutex-heavy benchmark profiles correctly (exercises catch_up/post_block changes and validates removed delay sync doesn't break results)

🤖 Generated with Claude Code
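
For readers unfamiliar with the Slope and R² columns, here is a minimal sketch of an ordinary least-squares fit over one line's data points. The PR does not show how `coz plot --text` computes these values, so this is only the textbook formula. `LinearFit` and `fit_line` are hypothetical names, and treating x as the virtual speedup of a line and y as the resulting program speedup is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Result of a simple (one-variable) least-squares fit y = intercept + slope * x.
struct LinearFit {
  double slope;
  double intercept;
  double r_squared;
};

// Fits a line through the points (x[i], y[i]).
LinearFit fit_line(const std::vector<double>& x, const std::vector<double>& y) {
  assert(x.size() == y.size() && x.size() >= 2);
  const double n = static_cast<double>(x.size());

  double sum_x = 0.0, sum_y = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    sum_x += x[i];
    sum_y += y[i];
  }
  const double mean_x = sum_x / n;
  const double mean_y = sum_y / n;

  // Centered sums of squares and cross-products.
  double sxx = 0.0, sxy = 0.0, syy = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double dx = x[i] - mean_x;
    const double dy = y[i] - mean_y;
    sxx += dx * dx;
    sxy += dx * dy;
    syy += dy * dy;
  }
  assert(sxx > 0.0);  // all-equal x values leave the slope undefined

  LinearFit fit;
  fit.slope = sxy / sxx;
  fit.intercept = mean_y - fit.slope * mean_x;
  // For a one-variable fit, R^2 equals the squared correlation coefficient.
  // Perfectly flat y data is fitted exactly, so report 1.0 in that case
  // (a convention chosen for this sketch).
  fit.r_squared = (syy > 0.0) ? (sxy * sxy) / (sxx * syy) : 1.0;
  return fit;
}
```

Under these assumptions, a slope near zero would suggest that speeding up the line barely moves overall progress, while a low R² would suggest the points are too noisy to trust the slope.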