-
It is fairly rare that software needs fixes for such low-level performance issues (cache layout, etc.). In my experience, when people investigate perf issues this way, they almost always fail to make major improvements. Those come instead from rewriting the bad code from scratch. But that takes a lot of effort, so people rarely try, and what could have been a 10x improvement ends up being a 10% micro-optimization. I say this because Windows Terminal is one such case: according to my investigation, rewriting the text buffer and VT parser would yield a ~10x perf improvement. I do not believe that this can be achieved any other way.
-
I’ve come across this tool: lshaz (GitHub: abokhalill/lshaz), described as "Find the microarchitectural performance bugs hiding in your C++ code".
It claims to detect latency hazards at compile-time: struct layouts that span cache lines, atomics that trigger cross-core invalidation storms, heap allocations in hot loops, virtual dispatch that defeats branch prediction, etc.
Granted, it seems oriented towards x64, but it claims to also support arm64. In addition, it ships GitHub workflow files for running the tool on PRs.
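To make two of the hazard categories above concrete (struct layouts that span cache lines, and atomics that cause cross-core invalidation), here is a minimal sketch in C++. It is not lshaz's implementation or output; the 64-byte line size and all names (`kCacheLine`, `PackedCounters`, `PaddedCounters`, `straddles_line`, `same_line`) are assumptions made up for illustration.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines (common on x64; some arm64 parts differ).
constexpr std::size_t kCacheLine = 64;

// Hazard: two counters written by different threads sit on the same cache
// line, so every write by one core invalidates the other core's copy.
struct PackedCounters {
    std::atomic<std::uint64_t> produced{0};
    std::atomic<std::uint64_t> consumed{0};
};

// One possible fix: give each counter its own cache line.
struct PaddedCounters {
    alignas(kCacheLine) std::atomic<std::uint64_t> produced{0};
    alignas(kCacheLine) std::atomic<std::uint64_t> consumed{0};
};

// True if a field occupying [offset, offset + size) crosses a cache-line
// boundary, assuming the enclosing object itself starts on a boundary.
constexpr bool straddles_line(std::size_t offset, std::size_t size) {
    return offset / kCacheLine != (offset + size - 1) / kCacheLine;
}

// True if two byte offsets within such an object land on the same line.
constexpr bool same_line(std::size_t a, std::size_t b) {
    return a / kCacheLine == b / kCacheLine;
}

// Compile-time checks in the spirit of "detect it before it runs".
static_assert(sizeof(PackedCounters) <= kCacheLine,
              "both packed counters fit in (and share) one line");
static_assert(alignof(PaddedCounters) == kCacheLine,
              "padded struct is line-aligned");
static_assert(sizeof(PaddedCounters) == 2 * kCacheLine,
              "each padded counter occupies its own line");
static_assert(straddles_line(60, 8),
              "an 8-byte field at offset 60 spans two lines");
```

Whether padding like this actually pays off depends on how hot the fields are, which is the kind of thing only a profile of the real workload can confirm.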