Optimize World::get_entity_mut for large entity slices by CrazyRoka · Pull Request #23740 · bevyengine/bevy

CrazyRoka · 2026-04-09T20:41:00Z

Objective

Optimize World::get_entity_mut (and the underlying WorldEntityFetch trait) when a slice of entities is passed in.
The previous duplicate-entity check used nested loops (O(N²)) and showed up as a CPU hotspot in real workloads with thousands of entities.

This PR makes the check O(N) while preserving exact behaviour and error semantics.

Solution

Replaced the nested for i in 0..len { for j in 0..i } duplicate check with a single-pass EntityHashSet in both &[Entity] and &[Entity; N] implementations of WorldEntityFetch::fetch_mut.
Added a dedicated Criterion benchmark (get_entity_mut_slice) that exercises the hot path with slices up to 2000 entities.
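
The difference between the two checks can be sketched outside of Bevy. This is a minimal illustration only: it assumes plain `u64` IDs in place of `Entity` and the standard library's `HashSet` in place of `EntityHashSet`, and both function names are hypothetical rather than taken from the PR.

```rust
use std::collections::HashSet;

/// Quadratic check: compares each element against every earlier element,
/// mirroring the `for i in 0..len { for j in 0..i }` shape described above.
fn has_duplicates_quadratic(ids: &[u64]) -> bool {
    for i in 0..ids.len() {
        for j in 0..i {
            if ids[i] == ids[j] {
                return true;
            }
        }
    }
    false
}

/// Single-pass check: `HashSet::insert` returns `false` when the value
/// was already present, so the first repeated ID ends the scan.
fn has_duplicates_hashed(ids: &[u64]) -> bool {
    let mut seen = HashSet::with_capacity(ids.len());
    ids.iter().any(|id| !seen.insert(*id))
}
```

Both functions return the same answer for any input; only the amount of work differs as the slice grows.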

Testing

Ran the new benchmark before and after the change using cargo bench -p benches --bench ecs -- get_entity_mut_slice.
Tested on Linux (x86_64).

Showcase

Benchmark results before:

Benchmark results after:

Criterion table summary:

get_entity_mut_slice/size/10
                        time:   [56.617 ns 56.850 ns 57.099 ns]
                        change: [−4.0898% −3.4947% −2.9431%] (p = 0.00 < 0.05)
                        Performance has improved.
get_entity_mut_slice/size/20
                        time:   [142.19 ns 142.94 ns 143.75 ns]
                        change: [−9.3294% −8.3968% −7.4851%] (p = 0.00 < 0.05)
                        Performance has improved.
get_entity_mut_slice/size/40
                        time:   [450.04 ns 453.50 ns 457.16 ns]
                        change: [−3.7575% −2.6321% −1.4291%] (p = 0.00 < 0.05)
                        Performance has improved.
get_entity_mut_slice/size/60
                        time:   [855.45 ns 856.69 ns 857.96 ns]
                        change: [−5.6733% −5.1486% −4.6343%] (p = 0.00 < 0.05)
                        Performance has improved.
get_entity_mut_slice/size/80
                        time:   [1.1187 µs 1.1206 µs 1.1227 µs]
                        change: [−20.884% −20.493% −20.138%] (p = 0.00 < 0.05)
                        Performance has improved.
get_entity_mut_slice/size/100
                        time:   [1.4017 µs 1.4034 µs 1.4050 µs]
                        change: [−31.999% −31.427% −30.927%] (p = 0.00 < 0.05)
                        Performance has improved.
get_entity_mut_slice/size/200
                        time:   [2.7256 µs 2.7365 µs 2.7481 µs]
                        change: [−58.082% −57.776% −57.488%] (p = 0.00 < 0.05)
                        Performance has improved.
get_entity_mut_slice/size/400
                        time:   [5.4878 µs 5.5039 µs 5.5213 µs]
                        change: [−76.805% −76.556% −76.311%] (p = 0.00 < 0.05)
                        Performance has improved.
get_entity_mut_slice/size/600
                        time:   [8.2935 µs 8.3305 µs 8.3663 µs]
                        change: [−83.367% −83.196% −83.039%] (p = 0.00 < 0.05)
                        Performance has improved.
get_entity_mut_slice/size/800
                        time:   [10.978 µs 11.020 µs 11.068 µs]
                        change: [−87.050% −86.950% −86.857%] (p = 0.00 < 0.05)
                        Performance has improved.
get_entity_mut_slice/size/1000
                        time:   [13.534 µs 13.551 µs 13.570 µs]
                        change: [−89.591% −89.525% −89.465%] (p = 0.00 < 0.05)
                        Performance has improved.
get_entity_mut_slice/size/2000
                        time:   [26.999 µs 27.039 µs 27.083 µs]
                        change: [−94.705% −94.666% −94.631%] (p = 0.00 < 0.05)
                        Performance has improved.

Victoronz · 2026-04-09T23:04:40Z

Good observation that EntityHashSet is more efficient for greater entity quantities!
These impls were originally only intended for low N, so that use case should not see regression.
The regression can be addressed by using the previous duplicate check below a certain N.

I'll also note that the vast, vast majority of arrays are small, so the EntityHashSet branch there should practically never be hit.

As an isolated change this makes sense (and we can merge it in the meantime)!
Overall though, I am still of the opinion that implicit get_entity_mut for slices is iffy design as was previously discussed here (and its follow-up PR).

I am curious, in the use case you've seen, is the entire result always used, or only partially iterated/consumed?

If only partial use is needed, then an iteration-based duplication check would be more performant.
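
One possible reading of an iteration-based check is an iterator that detects a duplicate only when iteration reaches it, so callers that stop early never pay for the rest of the slice. The sketch below assumes `u64` IDs instead of `Entity`; `Duplicate` and `checked_iter` are hypothetical names, not Bevy APIs.

```rust
use std::collections::HashSet;

/// Hypothetical error type for this sketch: carries the repeated ID.
#[derive(Debug, PartialEq)]
struct Duplicate(u64);

/// Yields each ID in order. A repeated ID is reported as `Err` at the
/// position where it is encountered, not up front.
fn checked_iter(ids: &[u64]) -> impl Iterator<Item = Result<u64, Duplicate>> + '_ {
    let mut seen = HashSet::new();
    ids.iter().map(move |&id| {
        if seen.insert(id) {
            Ok(id)
        } else {
            Err(Duplicate(id))
        }
    })
}
```

A caller that takes only the first few items does work proportional to what it consumes, while collecting into `Result<Vec<_>, _>` still rejects the whole slice when a duplicate exists.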

Additionally, if the source slices/arrays are not mutated between each get_entity_mut call, then placing the check after each mutation (by turning it into an EntitySet) could also reduce unnecessary work.
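
The idea of checking once and reusing the result can be illustrated with a wrapper type whose constructor performs the duplicate check. This is a sketch under stated assumptions: `UniqueIds` is a hypothetical stand-in for the kind of guarantee an EntitySet provides, and `u64` stands in for `Entity`.

```rust
use std::collections::HashSet;

/// Hypothetical wrapper: a list of IDs proven unique at construction time.
struct UniqueIds(Vec<u64>);

impl UniqueIds {
    /// Runs the duplicate check once; returns `None` if any ID repeats.
    fn new(ids: Vec<u64>) -> Option<Self> {
        let mut seen = HashSet::with_capacity(ids.len());
        if ids.iter().all(|id| seen.insert(*id)) {
            Some(UniqueIds(ids))
        } else {
            None
        }
    }

    /// Read-only access, so the uniqueness guarantee cannot be broken
    /// and later consumers do not need to re-check.
    fn as_slice(&self) -> &[u64] {
        &self.0
    }
}
```

Because the wrapper exposes no mutable access, the check cost is paid per mutation of the source list rather than per lookup call.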

CrazyRoka · 2026-04-10T09:55:08Z

@Victoronz Thanks for the feedback. I updated the benchmarks to include smaller sizes (20, 40, 60, 80) and added a small threshold (40) so that small inputs fall back to the O(N^2) comparisons. There are no performance regressions anymore (see the PR description, which was updated today).
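
A threshold-based hybrid along these lines might look like the sketch below. The cutoff value 40 comes from the comment above; whether the boundary is inclusive, and the names `HASH_THRESHOLD` and `has_duplicates`, are assumptions made for illustration, with `u64` standing in for `Entity` and std's `HashSet` for `EntityHashSet`.

```rust
use std::collections::HashSet;

/// Cutoff from the discussion; treating it as inclusive is an assumption.
const HASH_THRESHOLD: usize = 40;

fn has_duplicates(ids: &[u64]) -> bool {
    if ids.len() <= HASH_THRESHOLD {
        // Small input: pairwise comparison against earlier elements,
        // which avoids allocating and hashing.
        ids.iter()
            .enumerate()
            .any(|(i, id)| ids[..i].contains(id))
    } else {
        // Large input: single pass with a hash set.
        let mut seen = HashSet::with_capacity(ids.len());
        ids.iter().any(|id| !seen.insert(*id))
    }
}
```

Both branches give the same answer for any input, so the threshold affects only performance and can be tuned from benchmark results.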

kfc35 added labels Apr 9, 2026: C-Performance (a change motivated by improving speed, memory usage or compile times); S-Needs-Review (needs reviewer attention, from anyone, to move forward); A-ECS (entities, components, systems, and events)

github-project-automation bot added this to ECS Apr 9, 2026

github-project-automation bot moved this to Needs SME Triage in ECS Apr 9, 2026

Victoronz added S-Waiting-on-Author (the author needs to make changes or address concerns before this can be merged) and removed S-Needs-Review (needs reviewer attention, from anyone, to move forward) Apr 9, 2026

CrazyRoka added 2 commits April 10, 2026 10:27

Add benchmark for WorldEntityFetch::fetch_mut

c252abd

Optimize duplicate entity checks

9f9c0a9

Use HashSet for O(N) duplicate entity detection if number of entities exceeds the threshold. This improves performance significantly for larger entity lists.

CrazyRoka force-pushed the optimize-entity-fetch-mut branch from d8831de to 9f9c0a9 April 10, 2026 09:45

CrazyRoka wants to merge 2 commits into bevyengine:main from CrazyRoka:optimize-entity-fetch-mut

3 participants

Uh oh!

Conversation

CrazyRoka commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Objective

Solution

Testing

Showcase

Benchmark results before:

Benchmark results after:

Criterion table summary:

Uh oh!

Victoronz commented Apr 9, 2026

Uh oh!

CrazyRoka commented Apr 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

CrazyRoka commented Apr 9, 2026 •

edited

Loading