Experiment With Amortized Freeing

https://dl.acm.org/doi/pdf/10.1145/3627535.3638491 suggests that batch freeing bypasses thread-local allocator buffers and ends up freeing from a remote thread, which is extremely expensive (note that mimalloc avoids these problems, but nearly every other allocator is affected). Amortized freeing, where a few objects are released per operation instead of a whole batch at once, can improve both latency and throughput.
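
As a starting point for the experiment, here is a minimal sketch of what amortized freeing could look like. It is not taken from the paper: the class name `AmortizedFreer`, the `per_op` budget, and the `retire`/`tick` methods are all hypothetical. It assumes each thread owns one instance, that blocks come from `std::malloc`, and that a retired block is already safe to free (any grace-period tracking is out of scope here).

```cpp
#include <cstddef>
#include <cstdlib>
#include <deque>

// Hypothetical helper (not from the paper): queues retired blocks and
// frees only a bounded number of them per call instead of one big batch.
// Assumes one instance per thread; it is not thread-safe.
class AmortizedFreer {
public:
    // per_op is an assumed tuning knob: max blocks freed per tick().
    explicit AmortizedFreer(std::size_t per_op) : per_op_(per_op) {}

    // Free whatever is still queued so nothing leaks on teardown.
    ~AmortizedFreer() { drain(pending_.size()); }

    AmortizedFreer(const AmortizedFreer&) = delete;
    AmortizedFreer& operator=(const AmortizedFreer&) = delete;

    // Queue a block obtained from std::malloc for later release.
    // The caller must guarantee the block is already safe to free.
    void retire(void* p) { pending_.push_back(p); }

    // Call once per data-structure operation.
    // Frees at most per_op blocks and returns how many were freed.
    std::size_t tick() { return drain(per_op_); }

    // Number of blocks still waiting to be freed.
    std::size_t pending() const { return pending_.size(); }

private:
    // Free up to `limit` blocks in FIFO order.
    std::size_t drain(std::size_t limit) {
        std::size_t freed = 0;
        while (freed < limit && !pending_.empty()) {
            std::free(pending_.front());
            pending_.pop_front();
            ++freed;
        }
        return freed;
    }

    std::size_t per_op_;
    std::deque<void*> pending_;
};
```

The intent is to call `tick()` on every insert/remove/lookup so that the free calls are spread out and stay within the allocator's thread-local buffers. A suitable `per_op` value would have to be found by measurement.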