fix: Use map for efficient array reconiliation#2280
fix: Use map for efficient array reconiliation#2280yossivainshtein wants to merge 1 commit intomobxjs:masterfrom
Conversation
|
Thanks @yossivainshtein - this is great. I will take a look once you mark it as ready for review. |
thegedge
left a comment
There was a problem hiding this comment.
Like @coolsoftwaretyler mentioned, we can give a full review when you're ready, @yossivainshtein.
Thanks for your contribution! I've dropped in some thoughts that may help improve your PR :)
| // Creates a map of node by identifier value. | ||
| // In theory, several nodes can have the same identifier, if the array contains different types, so every identifier is mapped to an array of nodes with the same values. | ||
| // In practice this in probably a rare case, so we can live with the performance hit. | ||
| // | ||
| // If not all nodes have identifier, we can't use the map for lookups, so we return null. |
There was a problem hiding this comment.
Love that you've thought through this!
I'd highly recommend capturing various edge cases in test (if we haven't already done so). Some cases that come to mind would be tests for arrays of:
- all scalars,
modeltypes without identifiers,modeltypes with identifiers,- union of two distinct
modeltypes with and without identifiers - union of scalar and
modeltype (I believe MST supports this, but double check).
There was a problem hiding this comment.
Thanks, of course these should be tested. I'll try to add those soon
|
|
||
| function buildObjectByIdMap(nodes: AnyNode[]): [Set<string>, Map<string, Array<AnyNode>> | null] { | ||
| // Creates a map of node by identifier value. | ||
| // In theory, several nodes can have the same identifier, if the array contains different types, so every identifier is mapped to an array of nodes with the same values. |
There was a problem hiding this comment.
This line of your comment is an interesting one. Basically, it states that we still have the same worst case complexity.
That being said, in those scenarios we're likely to fall back to the old code path. In the case where we have an array with models of the same type containing identifiers, we get a nice perf boost 🚀
(nothing to action here, I just like talking things through)
There was a problem hiding this comment.
yes, I'm trying to understand when it's OK to use the map and when it isn't...
|
Thanks for the comments @thegedge , @coolsoftwaretyler ! I obviously should add some tests for various edge cases, I hope to get to it soon. I made it draft because I wanted to make sure first that this direction which I took makes sense. Perhaps I should have opened a discussion (not sure how to do that). Another question - My test is more of a performance test, not really a UT. I saw that there's a separate file for performance tests but i couldn't get it to work... do they work? |
|
Hey @yossivainshtein - you can always open discussions at https://github.com/mobxjs/mobx-state-tree/discussions, but I like collaborating in a PR anyways. I'm not sure if the perf tests work. They've been in the codebase a long time. I wouldn't use them as a bar to clear for a PR at the moment, although for this change it makes sense to:
|
What does this PR do and why?
This PR attempts to fix this performance issue when replacing large array.
When replacing an array which contains object with an identifier attribute, it's possible to lookup the object for reconciliation by using a map lookup instead of linear scan of the entire array.
Steps to validate locally
I added a new test case named
it should reconciliate large array instances efficiently.This test takes a long time before the fix and its quick after.