Add WeakBoundedBTreeMap and use it for nomination-pools SubPools::with_era#12303
Add WeakBoundedBTreeMap and use it for nomination-pools SubPools::with_era#12303cirko33 wants to merge 2 commits into
WeakBoundedBTreeMap and use it for nomination-pools SubPools::with_era#12303Conversation
…with_era` Adds a `WeakBoundedBTreeMap` to `frame-support`, the map counterpart of the existing `WeakBoundedVec`. Its bound `S` is not strictly enforced on decoding: a map encoded with more entries than `S` decodes successfully (logging a warning) instead of failing. All mutating operations still respect the bound, so the map compacts back to within `S` on the next mutation, and it encodes identically to `BoundedBTreeMap`/`BTreeMap` (drop-in, storage-compatible). Uses it for nomination-pools `SubPools::with_era`, which is bounded by the runtime-dynamic `TotalUnbondingPools` (derived from `bonding_duration()`). When the effective bonding duration shrinks at runtime, the previous strict `BoundedBTreeMap` made any oversized historical `with_era` map undecodable: `SubPoolsStorage::get` silently returned `None` and the next `unbond` overwrote the entry, destroying per-era unbonding accounting. With the weak variant the oversized map decodes intact, and `unbond` first calls `SubPools::merge_oldest_until_fits` to fold the oldest pools into `no_era` so the insert succeeds instead of hitting `NotEnoughSpaceInUnbondPool`. Loss-less, no-op under a stable bound, and no storage migration required.
|
All GitHub workflows were cancelled due to failure one of the required jobs. |
There was a problem hiding this comment.
It might be better to make the weak bounded btree map as a separate PR?
There was a problem hiding this comment.
yes I agree - I have commented here as well - but I wanted to explore an alternative to decouple the bound from the flag so the bound never shrinks - as I was mentioning in the original issue's comment but haven't had time to do it so to see if it makes sense 😅
TL;DR have bonding_duration sized to MaxUnbonding = BondingDuration + PostUnbondingPoolsWindow constant instead of the dynamic TotalUnbondingPools. So the bound doesn't shrink on the flip.
THis would be a quick win for 2.3.1 before flipping the flag. Then in parallel we can introduce the weak bounded map to live happy vs governance lowering BondingDuration.
There was a problem hiding this comment.
if we want to have 2.3.1 very soon to unlock the nominator flip, maybe it's ok to deploy the static max for bonding duration and then having the weak map in a followup runtime upgrade
There was a problem hiding this comment.
TL;DR have bonding_duration sized to MaxUnbonding = BondingDuration + PostUnbondingPoolsWindow constant instead of the dynamic TotalUnbondingPools. So the bound doesn't shrink on the flip.
THis would be a quick win for 2.3.1 before flipping the flag. Then in parallel we can introduce the weak bounded map to live happy vs governance lowering BondingDuration.
I think this is a great idea, and probably better even in the long term: decouple MaxUnbonding from BondingDuration.
Concretely: introduce a fixed MaxUnbondingPools constant (e.g. 32 = current 28 + 4) and use it in both places that currently derive from bonding_duration:
- The
with_erabound (TotalUnbondingPools). - The merge window in
maybe_merge_pools. Today it merges atcurrent_era - PostUnbondingPoolsWindow, which collapses post-flip: withbonding_durationdropping to 2, members' clean per-era pools get merged intono_erain only 6 eras compared to 32 before, and any later withdrawal dissolves against the skewedno_eraratio (plus potential slashes from other eras). A decoupledMaxUnbondingkeeps pools on their correct per-era ratio for ~32 eras regardless of bonding duration.
This should also get rid of PostUnbondingPoolsWindow.
There was a problem hiding this comment.
The runtime level ugly patch would be
PostUnbondingPoolsWindow = if AreNominatorsSlashable { 4 } else { 30 }
Will get same behaviour as above.
There was a problem hiding this comment.
Some extra notes:
- dropping weak type makes also @ggwpez happy vs Add
WeakBoundedBTreeMapand use it for nomination-poolsSubPools::with_era#12303 (comment) since we make benchmark proper again - if we dropped
PostUnbondingPoolsWindowcompletely (and we could), it would be a breaking change in the config so would need a storage version bump - should we add some
try_stateinvariant to checkMaxUnbondingPools - NominatorFastUnbondDuration > SlashDeferDurationor similar?
There was a problem hiding this comment.
True about
PostUnbondingPoolsWindowbeing a breaking change. On second thought,PostUnbondingPoolsWindowandMaxUnbondingPoolsare basically same, so we can just keep the old name. Wdyt @sigurpol @cirko33 ?
I think we can do the next:
- Now upgrade only logic and numbers with comments explaining why name didn't change
- Make a separate PR changing a name of parameter for future release
There was a problem hiding this comment.
The name can stay I think. PostUnbondingPoolsWindow kinda indicates the same, the excess post bonding period that we keep the pools unmerged.
| @@ -0,0 +1,424 @@ | |||
| // This file is part of Substrate. | |||
There was a problem hiding this comment.
we should probably define with WeakBoundedBTreeMap in bounded-collections crate probably - similarly to weak bounded vec and the other siblings do (extend crate + re-export here
for sp-runtimeTIL we have also https://github.com/paritytech/bounded-collections-next
| pub with_era: BoundedBTreeMap<EraIndex, UnbondPool<T>, TotalUnbondingPools<T>>, | ||
| /// [`WeakBoundedBTreeMap`] allows decoding if `TotalUnbondingPools` shrinks, preventing loss | ||
| /// of unbonding data. | ||
| pub with_era: WeakBoundedBTreeMap<EraIndex, UnbondPool<T>, TotalUnbondingPools<T>>, |
There was a problem hiding this comment.
This wont have a MaxEncodedLen anymore if we use a Weak type, so the PoV benchmarking cannot get the worst case anymore.
Generally we only use Weak types as intermediary solution, not as permanent ones. Not sure if there could be a permanent migration that automatically migrates these if the length is too long.
Depends on the effort to do that.
There was a problem hiding this comment.
I believe we can live w/o weak types here as per thread #12303 (comment)
What does this PR do?
Introduces a
WeakBoundedBTreeMapstorage type inframe-supportand uses itfor the nomination-pools
SubPools::with_eramap to prevent loss of unbondingdata when the bonding duration shrinks at runtime.
WeakBoundedBTreeMapThe map counterpart of the existing
WeakBoundedVec. It behaves like aBoundedBTreeMapexcept that the boundSis not strictly enforced ondecoding:
Ssucceeds and logs a warning,instead of failing.
try_insert,try_mutate, ...) still respect thebound, so the map is compacted back to within
Son the next mutation.BoundedBTreeMap/BTreeMap, so it is a drop-in,storage-compatible replacement requiring no migration.
The bug it fixes
SubPools::with_erais bounded byTotalUnbondingPools, which isruntime-dynamic: it is derived from
T::StakeAdapter::bonding_duration().When the effective bonding duration shrinks (e.g. when nominator fast unbonding
is enabled), the bound shrinks with it.
With the previous strict
BoundedBTreeMap, any pool whose historicalwith_eramap already held more entries than the new, smaller bound could nolonger be decoded:
SubPoolsStorage::getsilently returnedNone("Corrupted state").unbondwouldunwrap_or_default()and overwrite the entry,permanently destroying the pool's per-era unbonding accounting (the
underlying staking-ledger chunks survived, but the pool could no longer
attribute them).
The fix
with_eranow usesWeakBoundedBTreeMap, so the oversized historical mapdecodes intact (no funds lost). The read/remove paths (
withdraw_unbonded,on_slash,PoolMember::total_balance) operate on it correctly.unbond— now first callsSubPools::merge_oldest_until_fits, which folds the oldest pools intono_erauntil there is room. (maybe_merge_poolsonly merges by era ageand does not free a slot when stale entries are far-future-dated.) This makes
the first post-shrink
unbondsucceed instead of hitting the defensiveNotEnoughSpaceInUnbondPoolpath.The change is loss-less, a no-op under a stable bonding duration, and the
encoded representation is unchanged.
Changes
frame-support: newstorage::weak_bounded_btree_map::WeakBoundedBTreeMap,re-exported from the crate root and
pallet_prelude, plusSealedimpl.pallet-nomination-pools:SubPools::with_eraswitched toWeakBoundedBTreeMap; newmerge_oldest_until_fitshelper called inunbond; tests added.