
feat: speed up PersistentHashMapDeserializer #92

Open
miikka wants to merge 2 commits into metosin:master from miikka:speed-up-phm

Conversation

@miikka (Contributor) commented Mar 19, 2026

A speculative benchmark optimization for the decode code path. I found this by running pi-autoresearch with gpt-5.4-medium against the benchmark. The idea is similar in spirit to #17 (which didn't work), but here the PersistentArrayMap construction is inlined.

This is what I got for the jsonista.jmh/decode-jsonista benchmark on my laptop, master vs this branch:

| Size | Main (ops/s) | PR (ops/s) | Change |
|------|--------------|------------|--------|
| 10b | 9,285,018.601 | 11,050,553.982 | +19.01% |
| 100b | 2,026,489.218 | 2,367,886.089 | +16.85% |
| 1k | 363,486.397 | 418,933.267 | +15.25% |
| 10k | 32,911.114 | 37,307.349 | +13.36% |
| 100k | 3,342.422 | 3,644.178 | +9.03% |

I'm not sure what to make of this - is this a good idea, or is it possibly overfitting to the benchmark dataset?
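For readers who want to see the shape of the trick, here is a minimal, hypothetical sketch of the control flow described above. Plain `java.util` types stand in for Clojure's PersistentArrayMap and transient PersistentHashMap, and an iterator of `[key, value]` pairs stands in for the Jackson token stream, so it compiles without any Clojure dependency; it illustrates the idea only, not the actual patch.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Sketch of the small-map fast path: buffer the first few key/value
// pairs in a flat array (mirroring PersistentArrayMap's internal
// layout) and only fall back to a general-purpose map once the
// small-map threshold is exceeded.
public class SmallMapFastPath {
    // Clojure's PersistentArrayMap holds at most 8 entries before
    // switching to PersistentHashMap.
    static final int THRESHOLD = 8;

    static Map<Object, Object> decodeObject(Iterator<Object[]> pairs) {
        Object[] entries = new Object[THRESHOLD * 2];
        int size = 0; // number of buffered pairs
        while (size < THRESHOLD && pairs.hasNext()) {
            Object[] kv = pairs.next();
            entries[2 * size] = kv[0];
            entries[2 * size + 1] = kv[1];
            size++;
        }
        // In the real patch, a small object would be returned here as a
        // PersistentArrayMap built directly from `entries`. With plain
        // Java types we just seed a HashMap once from the buffer.
        Map<Object, Object> m = new HashMap<>();
        for (int i = 0; i < 2 * size; i += 2) {
            m.put(entries[i], entries[i + 1]);
        }
        // Large objects: add the remaining pairs directly (assoc on a
        // transient PersistentHashMap in the real code).
        while (pairs.hasNext()) {
            Object[] kv = pairs.next();
            m.put(kv[0], kv[1]);
        }
        return m;
    }
}
```

The point of the buffering is that small JSON objects (the common case in many workloads) never touch the hash-map machinery at all.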

Version info
% java --version
openjdk 25.0.2 2026-01-20
OpenJDK Runtime Environment Homebrew (build 25.0.2)
OpenJDK 64-Bit Server VM Homebrew (build 25.0.2, mixed mode, sharing)
% clj --version
Clojure CLI version 1.12.4.1618

@opqdonut (Member)

Interesting that this would work while #17 didn't. How could we figure out if it makes sense to ship this?

@opqdonut opqdonut moved this to 📬 Inbox in Metosin Open Source Backlog Mar 20, 2026
@miikka (Contributor) commented Mar 20, 2026

We could try running it against a somewhat bigger benchmark suite - I need to go spelunking on the Internet to see if there are any ready-to-use ones - and see if the results still hold (and possibly check different Java versions too). The change seems sensible to me, so if it holds up on a bigger benchmark, I'd ship it.

@opqdonut (Member)

That sounds good. The code makes sense and is a nice surgical change. Before merging, though, I'd like to simplify the diff by removing the nested whiles and the shift operators.

@opqdonut (Member)

Oh and a comment explaining that this is an optimisation would probably also be apropos.

@miikka (Contributor) commented Mar 20, 2026

I'll edit it for clarity if the benchmarking pays off.

Comment on lines +60 to +62
for (int i = 0; i < size << 1; i += 2) {
t = t.assoc(entries[i], entries[i + 1]);
}
Member:

It seems to me like this for loop is run again for each kv-pair after the size of the temporary Object array goes over 8? Though I would think the temporary array's contents need to be copied into the transient only once.

Or maybe there is something I don't understand.

Member:

Okay, now I think I got it.

There is a similar while loop after this copy step, reading until the JSON object end. So the loop above will not continue when this else branch is hit.

@opqdonut (Member) commented Mar 20, 2026

Yep, this is why I said I want to simplify the control flow in the diff. I imagine it could be something like

while (size < 8) {
   if END_OBJECT { return PersistentArrayMap(entries, ...) }
   Object key = ...
   Object val = ...
   entries[2 * size] = key
   entries[2 * size + 1] = val
   size++
}
PersistentHashMap phm = ...   // seeded once from entries
while (! END_OBJECT) {
   phm = phm.assoc(key, val)
}
return phm

@miikka (Contributor) commented Mar 31, 2026

I gathered a new benchmark suite from two sources. The benchmark setup can be seen here: miikka#2.

I ran the benchmarks on a Linux server with different JDKs (Oracle/Temurin/Corretto; 21/25/26). The results were positive but mixed: while the patch improved the average performance across the whole suite in almost all runs¹, the performance typically suffered in a few cases. The nativejson-benchmark's canada.json seems to consistently suffer from this patch.

In any case, I have now collected a bunch of data and I'm realizing that I'm out of my depth in making sense of microbenchmark results! I still think the patch makes sense, but the situation is much less clear-cut than I hoped for.


Examples of benchmark results:

Benchmark results, Temurin 25

Run 1: 2026-03-25 09:05:55 · c393cbd2 chore: now with all benchmarks · gob-jsonista · Temurin 25.0.2
Run 2: 2026-03-25 10:05:38 · ede5c6d8 feat: speed up PersistentHashMapDeserializer · gob-jsonista · Temurin 25.0.2

| Benchmark | Params | Run 1 (ops/s) | Run 2 (ops/s) | Δ (%) |
|-----------|--------|---------------|---------------|-------|
| decode | "json-size-benchmark/circleciblank" | 2.5M | 2.83M | +13.4% |
| decode | "json-size-benchmark/circlecimatrix" | 504K | 607K | +20.5% |
| decode | "json-size-benchmark/commitlint" | 590K | 642K | +8.7% |
| decode | "json-size-benchmark/commitlintbasic" | 2.81M | 2.97M | +5.6% |
| decode | "json-size-benchmark/epr" | 190K | 212K | +11.3% |
| decode | "json-size-benchmark/eslintrc" | 84.6K | 93.2K | +10.1% |
| decode | "json-size-benchmark/esmrc" | 695K | 822K | +18.3% |
| decode | "json-size-benchmark/geojson" | 137K | 137K | +0.2% |
| decode | "json-size-benchmark/githubfundingblank" | 587K | 608K | +3.6% |
| decode | "json-size-benchmark/githubworkflow" | 254K | 307K | +21.1% |
| decode | "json-size-benchmark/gruntcontribclean" | 623K | 722K | +15.9% |
| decode | "json-size-benchmark/imageoptimizerwebjob" | 728K | 755K | +3.7% |
| decode | "json-size-benchmark/jsonereversesort" | 602K | 732K | +21.5% |
| decode | "json-size-benchmark/jsonesort" | 1.32M | 1.43M | +8.4% |
| decode | "json-size-benchmark/jsonfeed" | 304K | 346K | +13.5% |
| decode | "json-size-benchmark/jsonresume" | 53.4K | 57.6K | +7.9% |
| decode | "json-size-benchmark/netcoreproject" | 136K | 129K | -5.0% |
| decode | "json-size-benchmark/nightwatch" | 72.5K | 73.6K | +1.6% |
| decode | "json-size-benchmark/openweathermap" | 149K | 163K | +9.6% |
| decode | "json-size-benchmark/openweatherroadrisk" | 167K | 184K | +10.4% |
| decode | "json-size-benchmark/packagejson" | 66.8K | 69.9K | +4.7% |
| decode | "json-size-benchmark/packagejsonlintrc" | 82.5K | 81.1K | -1.7% |
| decode | "json-size-benchmark/sapcloudsdkpipeline" | 1.77M | 2.07M | +17.2% |
| decode | "json-size-benchmark/travisnotifications" | 278K | 383K | +37.9% |
| decode | "json-size-benchmark/tslintbasic" | 1.04M | 1.4M | +34.4% |
| decode | "json-size-benchmark/tslintextend" | 1.4M | 1.75M | +24.6% |
| decode | "json-size-benchmark/tslintmulti" | 642K | 773K | +20.3% |
| decode | "json100b" | 716K | 854K | +19.2% |
| decode | "json100k" | 1.42K | 1.67K | +17.3% |
| decode | "json10b" | 3.31M | 4.02M | +21.4% |
| decode | "json10k" | 14K | 16.7K | +18.8% |
| decode | "json1k" | 145K | 172K | +18.3% |
| decode | "nativejson-benchmark/canada" | 20.29 | 20.07 | -1.1% |
| decode | "nativejson-benchmark/citm_catalog" | 135.7 | 156.9 | +15.6% |
| decode | "nativejson-benchmark/twitter" | 301.2 | 304.4 | +1.1% |
| **Average** | | | | **+12.4%** |
Benchmark results, Oracle Java 26

Run 1: 2026-03-26 06:59:54 · c393cbd2 chore: now with all benchmarks · gob-jsonista · java version "26" 2026-03-17
Run 2: 2026-03-26 07:59:35 · ede5c6d8 feat: speed up PersistentHashMapDeserializer · gob-jsonista · java version "26" 2026-03-17

| Benchmark | Params | Run 1 (ops/s) | Run 2 (ops/s) | Δ (%) |
|-----------|--------|---------------|---------------|-------|
| decode | "json-size-benchmark/circleciblank" | 2.58M | 3.04M | +17.9% |
| decode | "json-size-benchmark/circlecimatrix" | 478K | 610K | +27.6% |
| decode | "json-size-benchmark/commitlint" | 616K | 654K | +6.1% |
| decode | "json-size-benchmark/commitlintbasic" | 2.93M | 3.59M | +22.3% |
| decode | "json-size-benchmark/epr" | 195K | 221K | +13.2% |
| decode | "json-size-benchmark/eslintrc" | 92.3K | 102K | +11.0% |
| decode | "json-size-benchmark/esmrc" | 648K | 875K | +34.9% |
| decode | "json-size-benchmark/geojson" | 142K | 137K | -3.5% |
| decode | "json-size-benchmark/githubfundingblank" | 599K | 574K | -4.2% |
| decode | "json-size-benchmark/githubworkflow" | 259K | 310K | +19.6% |
| decode | "json-size-benchmark/gruntcontribclean" | 651K | 753K | +15.8% |
| decode | "json-size-benchmark/imageoptimizerwebjob" | 744K | 803K | +8.0% |
| decode | "json-size-benchmark/jsonereversesort" | 638K | 779K | +22.1% |
| decode | "json-size-benchmark/jsonesort" | 1.32M | 1.49M | +13.2% |
| decode | "json-size-benchmark/jsonfeed" | 306K | 355K | +15.8% |
| decode | "json-size-benchmark/jsonresume" | 56.3K | 57.1K | +1.5% |
| decode | "json-size-benchmark/netcoreproject" | 120K | 134K | +11.2% |
| decode | "json-size-benchmark/nightwatch" | 71K | 76.2K | +7.3% |
| decode | "json-size-benchmark/openweathermap" | 152K | 157K | +3.3% |
| decode | "json-size-benchmark/openweatherroadrisk" | 169K | 191K | +13.4% |
| decode | "json-size-benchmark/packagejson" | 69.3K | 69.8K | +0.8% |
| decode | "json-size-benchmark/packagejsonlintrc" | 86.2K | 85.2K | -1.1% |
| decode | "json-size-benchmark/sapcloudsdkpipeline" | 1.78M | 2.1M | +17.9% |
| decode | "json-size-benchmark/travisnotifications" | 289K | 362K | +25.6% |
| decode | "json-size-benchmark/tslintbasic" | 1.11M | 1.49M | +34.3% |
| decode | "json-size-benchmark/tslintextend" | 1.64M | 1.85M | +13.4% |
| decode | "json-size-benchmark/tslintmulti" | 668K | 770K | +15.3% |
| decode | "json100b" | 800K | 874K | +9.3% |
| decode | "json100k" | 1.5K | 1.6K | +7.0% |
| decode | "json10b" | 3.34M | 3.87M | +15.9% |
| decode | "json10k" | 14.5K | 16K | +9.9% |
| decode | "json1k" | 152K | 177K | +16.6% |
| decode | "nativejson-benchmark/canada" | 33.23 | 33.16 | -0.2% |
| decode | "nativejson-benchmark/citm_catalog" | 142.2 | 160.7 | +13.0% |
| decode | "nativejson-benchmark/twitter" | 317.9 | 320.8 | +0.9% |
| **Average** | | | | **+12.0%** |
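An aside on summarizing such results: the bottom-line figures read like arithmetic means of the per-case deltas, in which case a single outlier (e.g. +37.9%) pulls the average up. A geometric mean of the throughput ratios is a common, slightly more outlier-robust alternative. A minimal sketch (the helper name and sample numbers are illustrative, not taken from the tables):

```java
// Hypothetical helper: summarize per-benchmark speedups as the
// geometric mean of ops/s ratios, converted back to a percentage.
// Averaging in log space weighs large outliers less heavily than an
// arithmetic mean of percentage deltas does.
public class GeoMean {
    static double geometricMeanDelta(double[] baseline, double[] patched) {
        double logSum = 0.0;
        for (int i = 0; i < baseline.length; i++) {
            logSum += Math.log(patched[i] / baseline[i]);
        }
        // Mean ratio, expressed as a percentage change.
        return (Math.exp(logSum / baseline.length) - 1.0) * 100.0;
    }
}
```

For example, ratios of 1.21 and 1.00 give a geometric-mean speedup of 10%, where the arithmetic mean of the deltas would report 10.5%.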

Footnotes

¹ I reported to @opqdonut that in my first run with Java 26, the existing code was faster than the patch! However, this turned out to be a fluke, and in the runs afterwards the patch was faster.

@Deraen (Member) commented Apr 1, 2026

One example of a performance-critical JSON parsing case: reading large GeoJSON files. They also repeat a lot of patterns, so it would be interesting to see if that has an effect.

And well, the nativejson-benchmark already has the 2 MB canada.json file. I might also be interested in whether there is a difference with ~100-1000 MB files.

@opqdonut (Member) left a review:

I think I'm happy with this.
