It's generally expected that when std collections return a capacity of zero, that they are not backed by any allocation. I believe this was also true for the HashMap prior to hashbrown, although I could be wrong.
In any case, whilst HashMap::capacity() is documented as a lower bound I still find it extremely surprising that it reports a capacity of zero without freeing its memory.
You can reproduce the issue with the following program (using mockalloc to detect the leaked memory):
use std::alloc::System;
use std::collections::hash_map::{DefaultHasher, HashMap};
use std::hash::BuildHasherDefault;
use mockalloc::Mockalloc;
#[global_allocator]
static ALLOCATOR: Mockalloc<System> = Mockalloc(System);
const NUMBERS: &[usize] = &[
25, 38, 41, 42, 89, 115, 184, 237, 273, 286, 300, 326, 377, 413, 482, 536, 602, 650, 702, 746,
750, 807, 810, 836, 960, 979, 982, 1007,
];
fn main() {
let mut guard =
HashMap::<String, (), _>::with_hasher(BuildHasherDefault::<DefaultHasher>::default());
mockalloc::assert_allocs(|| {
for &n in NUMBERS {
let k = n.to_string();
guard.insert(k, ());
}
for &n in NUMBERS {
let k = n.to_string();
guard.remove(&k);
if guard.len() * 3 < guard.capacity() {
guard.shrink_to_fit();
}
}
eprintln!("Capacity: {}", guard.capacity());
//guard.shrink_to_fit();
});
}
Given these lines:
if guard.len() * 3 < guard.capacity() {
guard.shrink_to_fit();
}
I expected all the memory to be reclaimed by the end of the loop. However, this is not the case: the memory is only reclaimed if shrink_to_fit() is called an additional time.
IMO, the fix should be two-fold:
-
When the last element of a HashMap is removed, it should automatically clear out any tombstones so that the capacity() method is accurate again.
-
A new method should be introduced to obtain the upper bound on the capacity. At the moment code which attempts to shrink under-utilized hashmaps is broken, and it's not possible to fix this code without a new method.
It's generally expected that when std collections return a capacity of zero, that they are not backed by any allocation. I believe this was also true for the HashMap prior to hashbrown, although I could be wrong.
In any case, whilst
HashMap::capacity()is documented as a lower bound I still find it extremely surprising that it reports a capacity of zero without freeing its memory.You can reproduce the issue with the following program (using mockalloc to detect the leaked memory):
Given these lines:
I expected all the memory to be reclaimed by the end of the loop. However, this is not the case: the memory is only reclaimed if
shrink_to_fit()is called an additional time.IMO, the fix should be two-fold:
When the last element of a HashMap is removed, it should automatically clear out any tombstones so that the
capacity()method is accurate again.A new method should be introduced to obtain the upper bound on the capacity. At the moment code which attempts to shrink under-utilized hashmaps is broken, and it's not possible to fix this code without a new method.