
Replace BTreeMap usage with hashbrown hashmap #221

Merged
alexggh merged 1 commit into master from alexggh/switch_memory_db_to_hashbrown
May 23, 2025

alexggh (Contributor) commented May 22, 2025

Profiling paritytech/polkadot-sdk#6131 (comment) with the benchmark from paritytech/polkadot-sdk#8069 showed that, when running in validation, a big chunk of the time is spent inserting into and retrieving data from the BTreeMap/BTreeSet.

This PR, together with paritytech/polkadot-sdk#8606, improves read costs by about 40% and write costs by about 20%.

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
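
To see how this kind of difference can be observed locally, here is a rough micro-benchmark sketch. It is not the benchmark from paritytech/polkadot-sdk#8069. The `Key` alias, `make_keys`, `time_btree`, and `time_hash` are hypothetical names invented for illustration, and it uses the `std` `HashMap` (which is built on hashbrown) so that no external crate is needed. Timings depend heavily on machine, key count, and build profile, so no numbers are claimed here.

```rust
use std::collections::{BTreeMap, HashMap};
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a 32-byte trie node hash.
type Key = [u8; 32];

/// Produce `n` distinct pseudo-random keys with a small xorshift generator,
/// so the sketch does not depend on any external crate.
fn make_keys(n: usize) -> Vec<Key> {
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    (0..n)
        .map(|i| {
            let mut key = [0u8; 32];
            for chunk in key.chunks_mut(8) {
                state ^= state << 13;
                state ^= state >> 7;
                state ^= state << 17;
                chunk.copy_from_slice(&state.to_le_bytes());
            }
            // Overwrite the last 8 bytes with the index so keys are distinct.
            key[24..].copy_from_slice(&(i as u64).to_le_bytes());
            key
        })
        .collect()
}

/// Returns (write time, read time, number of successful lookups) for BTreeMap.
fn time_btree(keys: &[Key]) -> (Duration, Duration, usize) {
    let mut map = BTreeMap::new();
    let start = Instant::now();
    for (i, k) in keys.iter().enumerate() {
        map.insert(*k, i);
    }
    let write = start.elapsed();
    let start = Instant::now();
    let hits = keys.iter().filter(|k| map.contains_key(*k)).count();
    (write, start.elapsed(), hits)
}

/// Returns (write time, read time, number of successful lookups) for HashMap.
fn time_hash(keys: &[Key]) -> (Duration, Duration, usize) {
    let mut map = HashMap::new();
    let start = Instant::now();
    for (i, k) in keys.iter().enumerate() {
        map.insert(*k, i);
    }
    let write = start.elapsed();
    let start = Instant::now();
    let hits = keys.iter().filter(|k| map.contains_key(*k)).count();
    (write, start.elapsed(), hits)
}
```

Calling `time_btree(&keys)` and `time_hash(&keys)` on the same `make_keys(100_000)` output in a release build and comparing the returned durations gives a first impression. A proper benchmark harness would be needed for trustworthy numbers.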

#[cfg(not(feature = "std"))]
use alloc::collections::btree_map::{BTreeMap as Map, Entry};
use hashbrown::{hash_map::Entry, HashMap as Map};
Member

We can just use it on both std and no_std. The std HashMap is actually hashbrown anyway.
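
As a minimal sketch of why the `Map`/`Entry` alias makes the swap cheap: code written against the entry API does not need to change when the alias points at a hash map instead of a `BTreeMap`. The `RefCountedStore` type below is a hypothetical, simplified store invented for illustration and is not the actual memory-db implementation. It uses the `std` hash map so the snippet compiles without extra dependencies.

```rust
// In a `no_std` build the same code could instead use, as in the diff above:
//     use hashbrown::{hash_map::Entry, HashMap as Map};
use std::collections::{hash_map::Entry, HashMap as Map};

/// Hypothetical, simplified store: key -> (value, reference count).
#[derive(Default)]
struct RefCountedStore {
    data: Map<[u8; 32], (Vec<u8>, i32)>,
}

impl RefCountedStore {
    /// Insert `value` under `key`, or bump the reference count if the key
    /// is already present. Only the entry API is used, so the body is the
    /// same whichever map type `Map` aliases.
    fn insert(&mut self, key: [u8; 32], value: &[u8]) {
        match self.data.entry(key) {
            Entry::Occupied(mut entry) => {
                entry.get_mut().1 += 1;
            }
            Entry::Vacant(entry) => {
                entry.insert((value.to_vec(), 1));
            }
        }
    }

    /// Return the stored value if the key exists with a positive count.
    fn get(&self, key: &[u8; 32]) -> Option<&[u8]> {
        self.data
            .get(key)
            .filter(|(_, rc)| *rc > 0)
            .map(|(value, _)| value.as_slice())
    }

    /// Current reference count for `key` (0 if absent).
    fn ref_count(&self, key: &[u8; 32]) -> i32 {
        self.data.get(key).map_or(0, |(_, rc)| *rc)
    }
}
```

One behavioural difference to keep in mind when swapping the alias: a hash map iterates in an unspecified order, so any code that relied on `BTreeMap`'s sorted iteration would need a separate look.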

alexggh merged commit 40dd6df into master May 23, 2025
4 checks passed
alexggh mentioned this pull request May 23, 2025
github-merge-queue bot pushed a commit to paritytech/polkadot-sdk that referenced this pull request May 23, 2025
Profiling #6131 (comment) with the benchmark #8069 showed that, when
running in validation, a big chunk of the time is spent inserting and
retrieving data from the BTreeMap/BTreeSet.

Switching to hashbrown HashMap/HashSet in the validation TrieCache and
TrieRecorder, and in memory-db (paritytech/trie#221), improves read costs
by about 40% and write costs by about 20%.

---------

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
github-merge-queue bot pushed a commit to paritytech/polkadot-sdk that referenced this pull request May 27, 2025
Bump memory-db to pick up
#8606 and
paritytech/trie#221

Additionally, polkavm needs to be bumped to get rid of
https://github.com/paritytech/polkadot-sdk/actions/runs/15180236627/job/42688141374#step:5:1869

---------

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
pgherveou pushed a commit to paritytech/polkadot-sdk that referenced this pull request Jun 11, 2025
pgherveou pushed a commit to paritytech/polkadot-sdk that referenced this pull request Jun 11, 2025
github-merge-queue bot pushed a commit to paritytech/polkadot-sdk that referenced this pull request Jul 11, 2025
…lidation context (#9127)

#8606 and paritytech/trie#221 replaced the usage of BTreeMap with
HashMaps in the validation context. The keys are already derived from
user data with a cryptographic hash function, so users should not be
able to manipulate them.

To be on the safe side, this PR also modifies the TrieCache, TrieRecorder,
and MemoryDB to use a hasher that, on top of the default generated
randomness, also adds randomness derived from the hashes of the relay
chain and parachain blocks, which users cannot control or guess ahead
of time.
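
The idea of layering extra randomness on top of the default hasher can be sketched roughly as follows. This is only an illustration under stated assumptions: `SeededBuildHasher`, `SeededMap`, `new_seeded_map`, and `hash_key` are hypothetical names, the 32-byte `extra_seed` stands in for whatever is derived from the relay chain and parachain block hashes, and the way the seed is mixed in (written into the hasher before the key) is an assumption. The actual change in #9127 may be implemented differently.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::collections::HashMap;
use std::hash::{BuildHasher, Hash, Hasher};

/// Hypothetical hasher builder: the usual per-instance random state plus an
/// extra seed supplied by the caller (e.g. derived from block hashes).
#[derive(Clone)]
struct SeededBuildHasher {
    base: RandomState,
    extra_seed: [u8; 32],
}

impl SeededBuildHasher {
    fn new(extra_seed: [u8; 32]) -> Self {
        Self { base: RandomState::new(), extra_seed }
    }
}

impl BuildHasher for SeededBuildHasher {
    type Hasher = DefaultHasher;

    /// Start from the randomly keyed default hasher, then feed in the extra
    /// seed so every key hash also depends on it.
    fn build_hasher(&self) -> DefaultHasher {
        let mut hasher = self.base.build_hasher();
        hasher.write(&self.extra_seed);
        hasher
    }
}

/// Map type that uses the seeded hasher.
type SeededMap<K, V> = HashMap<K, V, SeededBuildHasher>;

/// Create an empty map whose hashing depends on `extra_seed`.
fn new_seeded_map<K, V>(extra_seed: [u8; 32]) -> SeededMap<K, V> {
    HashMap::with_hasher(SeededBuildHasher::new(extra_seed))
}

/// Helper that hashes a single key with a given builder.
fn hash_key<K: Hash>(builder: &SeededBuildHasher, key: &K) -> u64 {
    let mut hasher = builder.build_hasher();
    key.hash(&mut hasher);
    hasher.finish()
}
```

A map created with `new_seeded_map(seed)` behaves like any other `HashMap`. The only difference is that bucket placement depends on both the process-local random state and the supplied seed, which is the property the commit message is after.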

---------

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Co-authored-by: Bastian Köcher <git@kchr.de>
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
paritytech-release-backport-bot bot pushed a commit to paritytech/polkadot-sdk that referenced this pull request Jul 11, 2025
…lidation context (#9127)

(cherry picked from commit 7058819)
paritytech-release-backport-bot bot pushed a commit to paritytech/polkadot-sdk that referenced this pull request Jul 17, 2025
…lidation context (#9127)

(cherry picked from commit 7058819)
alvicsam pushed a commit to paritytech/polkadot-sdk that referenced this pull request Oct 17, 2025
alvicsam pushed a commit to paritytech/polkadot-sdk that referenced this pull request Oct 17, 2025
alvicsam pushed a commit to paritytech/polkadot-sdk that referenced this pull request Oct 17, 2025
…lidation context (#9127)


5 participants