Skip to content

feat: add strict JSON parsing with String/CharBuffer support#974

Open
Dee66 wants to merge 3 commits intozio:mainfrom
Dee66:feature/issue-712-json-codec
Open

feat: add strict JSON parsing with String/CharBuffer support#974
Dee66 wants to merge 3 commits intozio:mainfrom
Dee66:feature/issue-712-json-codec

Conversation

@Dee66
Copy link
Copy Markdown

@Dee66 Dee66 commented Dec 31, 2025

Reject JSON input with trailing characters after valid JSON.

Fixes #712

/claim #712

@Dee66
Copy link
Copy Markdown
Author

Dee66 commented Dec 31, 2025

demo.mp4

@Dee66
Copy link
Copy Markdown
Author

Dee66 commented Jan 11, 2026

Hi @jdegoes, I just wanted to clear up the confusion regarding the bounty.

This work fulfills the 'Strict JSON Parsing' bounty. The active Algora claim is here: https://algora.io/claims/qd2AdUcS1Scxy7df

I originally opened this in zio-blocks #571, but moved the implementation here to zio-schema following @plokhotnyuk's suggestion that it was a better architectural fit for JsonBinaryCodec.

The logic is complete and there are no merge conflicts. It should be ready to merge or update directly from your end. Let me know if everything looks good to get this settled.

Copy link
Copy Markdown
Contributor

@987Nabil 987Nabil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New decode methods have nothing to do with the issue and are not needed

@Dee66 Dee66 force-pushed the feature/issue-712-json-codec branch from faeddf9 to c10d722 Compare January 24, 2026 08:37
@Dee66
Copy link
Copy Markdown
Author

Dee66 commented Jan 24, 2026

@987Nabil

  1. removed the methods as requested
  2. reject JSON with trailing characters - Core fix that adds EOF checking to reject malformed JSON like {}} and "foo""
  3. add type annotations for scala 3 - Compatibility fix adding explicit type annotations for Scala 3 implicit resolution

@Dee66
Copy link
Copy Markdown
Author

Dee66 commented Mar 2, 2026

Hi @987Nabil , this code has been live for some time, please can this ticket be closed?

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to fix #712 by making JSON decoding reject inputs that contain non-whitespace trailing characters after an otherwise-valid JSON value.

Changes:

  • Added a new decoder test suite that asserts trailing characters cause decode failures for several JSON shapes.
  • Updated zioJsonBinaryCodec.decode to decode via a FastStringReader and explicitly check for leftover non-whitespace content.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
zio-schema-json/shared/src/main/scala/zio/schema/codec/JsonCodec.scala Implements a post-decode “no trailing content” check for zioJsonBinaryCodec.decode.
zio-schema-json/shared/src/test/scala/zio/schema/codec/JsonCodecSpec.scala Adds regression tests covering trailing-character rejection for the zio-json-based binary codec.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +171 to +189
override def decode(whole: Chunk[Byte]): Either[DecodeError, A] = {
val reader = new FastStringReader(new String(whole.toArray, JsonEncoder.CHARSET))
try {
val result = jsonCodec.decoder.unsafeDecode(Nil, reader)
// Verify no trailing content after valid JSON
try {
val next = reader.nextNonWhitespace()
if (next != -1) {
Left(DecodeError.ReadError(Cause.empty, s"Invalid JSON: found '${next.toChar}' after valid content"))
} else {
Right(result)
}
} catch {
case _: UnexpectedEnd => Right(result)
}
} catch {
case UnsafeJson(trace) => Left(DecodeError.ReadError(Cause.empty, JsonError.render(trace)))
}
}
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This strict trailing-character check is only applied in zioJsonBinaryCodec.decode. The schema-based entrypoints (schemaBasedBinaryCodec.decode and JsonCodec.JsonDecoder.decode, which still delegate to decodeJson) appear unchanged and may therefore continue accepting inputs like {}} / "foo"". To fully address #712, consider centralizing the strict parse logic and using it for all decode paths (including schemaBasedBinaryCodec and any String/CharBuffer decoders).

Copilot uses AI. Check for mistakes.
case _: UnexpectedEnd => Right(result)
}
} catch {
case UnsafeJson(trace) => Left(DecodeError.ReadError(Cause.empty, JsonError.render(trace)))
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsafeDecode is called directly here, but the catch block only handles UnsafeJson. If unsafeDecode (or nextNonWhitespace) throws another NonFatal exception, it will escape the codec and potentially fail the fiber rather than returning a DecodeError. Consider catching NonFatal as well and mapping it to DecodeError.ReadError (while still letting fatal errors propagate).

Suggested change
case UnsafeJson(trace) => Left(DecodeError.ReadError(Cause.empty, JsonError.render(trace)))
case UnsafeJson(trace) =>
Left(DecodeError.ReadError(Cause.empty, JsonError.render(trace)))
case NonFatal(e) =>
Left(DecodeError.ReadError(Cause.fail(e), Option(e.getMessage).getOrElse(e.toString)))

Copilot uses AI. Check for mistakes.
Comment on lines +178 to +180
if (next != -1) {
Left(DecodeError.ReadError(Cause.empty, s"Invalid JSON: found '${next.toChar}' after valid content"))
} else {
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trailing-content error message interpolates next.toChar directly. For whitespace/control characters this can be hard to read (or misleading) in logs. Consider rendering a safe/escaped representation (and/or the numeric code point) so the user can identify the offending character reliably.

Copilot uses AI. Check for mistakes.
Comment on lines +1607 to +1652
assertTrue(
bc.decode(Chunk.fromArray("""{"name":"Alice","age":30}}""".getBytes(java.nio.charset.StandardCharsets.UTF_8)))
.isLeft,
bc.decode(Chunk.fromArray("""{"name":"Alice","age":30}""".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(
Person("Alice", 30)
)
)
},
test("string") {
implicit val jc: zio.json.JsonCodec[String] = JsonCodec.jsonCodec(Schema[String])
val bc = JsonCodec.zioJsonBinaryCodec[String]
assertTrue(
bc.decode(Chunk.fromArray("\"foo\"\"".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
bc.decode(Chunk.fromArray("\"foo\"".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right("foo")
)
},
test("array") {
implicit val jc: zio.json.JsonCodec[List[Int]] = JsonCodec.jsonCodec(Schema[List[Int]])
val bc = JsonCodec.zioJsonBinaryCodec[List[Int]]
assertTrue(
bc.decode(Chunk.fromArray("[1,2]]".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
bc.decode(Chunk.fromArray("[1,2]".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(List(1, 2))
)
},
test("whitespace then garbage") {
implicit val jc: zio.json.JsonCodec[Int] = JsonCodec.jsonCodec(Schema[Int])
val bc = JsonCodec.zioJsonBinaryCodec[Int]
assertTrue(
bc.decode(Chunk.fromArray("42 x".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
bc.decode(Chunk.fromArray("42 ".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(42)
)
},
test("multiple trailing") {
implicit val jc: zio.json.JsonCodec[Boolean] = JsonCodec.jsonCodec(Schema[Boolean])
val bc = JsonCodec.zioJsonBinaryCodec[Boolean]
assertTrue(
bc.decode(Chunk.fromArray("true}}".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft
)
},
test("empty with trailing") {
implicit val jc: zio.json.JsonCodec[Map[String, String]] = JsonCodec.jsonCodec(Schema[Map[String, String]])
val bc = JsonCodec.zioJsonBinaryCodec[Map[String, String]]
assertTrue(
bc.decode(Chunk.fromArray("{}}}".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
bc.decode(Chunk.fromArray("{}".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(
Map.empty[String, String]
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These new trailing-character tests exercise zioJsonBinaryCodec, but the main public API in this module is often schemaBasedBinaryCodec (and JsonDecoder.decode). Adding equivalent assertions for the schema-based decoder would ensure the regression is covered for all supported decode entrypoints.

Suggested change
assertTrue(
bc.decode(Chunk.fromArray("""{"name":"Alice","age":30}}""".getBytes(java.nio.charset.StandardCharsets.UTF_8)))
.isLeft,
bc.decode(Chunk.fromArray("""{"name":"Alice","age":30}""".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(
Person("Alice", 30)
)
)
},
test("string") {
implicit val jc: zio.json.JsonCodec[String] = JsonCodec.jsonCodec(Schema[String])
val bc = JsonCodec.zioJsonBinaryCodec[String]
assertTrue(
bc.decode(Chunk.fromArray("\"foo\"\"".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
bc.decode(Chunk.fromArray("\"foo\"".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right("foo")
)
},
test("array") {
implicit val jc: zio.json.JsonCodec[List[Int]] = JsonCodec.jsonCodec(Schema[List[Int]])
val bc = JsonCodec.zioJsonBinaryCodec[List[Int]]
assertTrue(
bc.decode(Chunk.fromArray("[1,2]]".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
bc.decode(Chunk.fromArray("[1,2]".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(List(1, 2))
)
},
test("whitespace then garbage") {
implicit val jc: zio.json.JsonCodec[Int] = JsonCodec.jsonCodec(Schema[Int])
val bc = JsonCodec.zioJsonBinaryCodec[Int]
assertTrue(
bc.decode(Chunk.fromArray("42 x".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
bc.decode(Chunk.fromArray("42 ".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(42)
)
},
test("multiple trailing") {
implicit val jc: zio.json.JsonCodec[Boolean] = JsonCodec.jsonCodec(Schema[Boolean])
val bc = JsonCodec.zioJsonBinaryCodec[Boolean]
assertTrue(
bc.decode(Chunk.fromArray("true}}".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft
)
},
test("empty with trailing") {
implicit val jc: zio.json.JsonCodec[Map[String, String]] = JsonCodec.jsonCodec(Schema[Map[String, String]])
val bc = JsonCodec.zioJsonBinaryCodec[Map[String, String]]
assertTrue(
bc.decode(Chunk.fromArray("{}}}".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
bc.decode(Chunk.fromArray("{}".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(
Map.empty[String, String]
val sbc = JsonCodec.schemaBasedBinaryCodec[Person](personSchema)
assertTrue(
bc.decode(Chunk.fromArray("""{"name":"Alice","age":30}}""".getBytes(java.nio.charset.StandardCharsets.UTF_8)))
.isLeft,
sbc.decode(Chunk.fromArray("""{"name":"Alice","age":30}}""".getBytes(java.nio.charset.StandardCharsets.UTF_8)))
.isLeft,
bc.decode(Chunk.fromArray("""{"name":"Alice","age":30}""".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(
Person("Alice", 30)
),
sbc.decode(Chunk.fromArray("""{"name":"Alice","age":30}""".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(
Person("Alice", 30)
)
)
},
test("string") {
implicit val jc: zio.json.JsonCodec[String] = JsonCodec.jsonCodec(Schema[String])
val bc = JsonCodec.zioJsonBinaryCodec[String]
val sbc = JsonCodec.schemaBasedBinaryCodec[String](Schema[String])
assertTrue(
bc.decode(Chunk.fromArray("\"foo\"\"".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
sbc.decode(Chunk.fromArray("\"foo\"\"".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
bc.decode(Chunk.fromArray("\"foo\"".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right("foo"),
sbc.decode(Chunk.fromArray("\"foo\"".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right("foo")
)
},
test("array") {
implicit val jc: zio.json.JsonCodec[List[Int]] = JsonCodec.jsonCodec(Schema[List[Int]])
val bc = JsonCodec.zioJsonBinaryCodec[List[Int]]
val sbc = JsonCodec.schemaBasedBinaryCodec[List[Int]](Schema[List[Int]])
assertTrue(
bc.decode(Chunk.fromArray("[1,2]]".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
sbc.decode(Chunk.fromArray("[1,2]]".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
bc.decode(Chunk.fromArray("[1,2]".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(List(1, 2)),
sbc.decode(Chunk.fromArray("[1,2]".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(List(1, 2))
)
},
test("whitespace then garbage") {
implicit val jc: zio.json.JsonCodec[Int] = JsonCodec.jsonCodec(Schema[Int])
val bc = JsonCodec.zioJsonBinaryCodec[Int]
val sbc = JsonCodec.schemaBasedBinaryCodec[Int](Schema[Int])
assertTrue(
bc.decode(Chunk.fromArray("42 x".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
sbc.decode(Chunk.fromArray("42 x".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
bc.decode(Chunk.fromArray("42 ".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(42),
sbc.decode(Chunk.fromArray("42 ".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(42)
)
},
test("multiple trailing") {
implicit val jc: zio.json.JsonCodec[Boolean] = JsonCodec.jsonCodec(Schema[Boolean])
val bc = JsonCodec.zioJsonBinaryCodec[Boolean]
val sbc = JsonCodec.schemaBasedBinaryCodec[Boolean](Schema[Boolean])
assertTrue(
bc.decode(Chunk.fromArray("true}}".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
sbc.decode(Chunk.fromArray("true}}".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft
)
},
test("empty with trailing") {
implicit val jc: zio.json.JsonCodec[Map[String, String]] = JsonCodec.jsonCodec(Schema[Map[String, String]])
val bc = JsonCodec.zioJsonBinaryCodec[Map[String, String]]
val sbc = JsonCodec.schemaBasedBinaryCodec[Map[String, String]](Schema[Map[String, String]])
assertTrue(
bc.decode(Chunk.fromArray("{}}}".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
sbc.decode(Chunk.fromArray("{}}}".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
bc.decode(Chunk.fromArray("{}".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(
Map.empty[String, String]
),
sbc.decode(Chunk.fromArray("{}".getBytes(java.nio.charset.StandardCharsets_UTF_8))) == Right(
Map.empty[String, String]

Copilot uses AI. Check for mistakes.
Comment on lines +1607 to +1621
assertTrue(
bc.decode(Chunk.fromArray("""{"name":"Alice","age":30}}""".getBytes(java.nio.charset.StandardCharsets.UTF_8)))
.isLeft,
bc.decode(Chunk.fromArray("""{"name":"Alice","age":30}""".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right(
Person("Alice", 30)
)
)
},
test("string") {
implicit val jc: zio.json.JsonCodec[String] = JsonCodec.jsonCodec(Schema[String])
val bc = JsonCodec.zioJsonBinaryCodec[String]
assertTrue(
bc.decode(Chunk.fromArray("\"foo\"\"".getBytes(java.nio.charset.StandardCharsets.UTF_8))).isLeft,
bc.decode(Chunk.fromArray("\"foo\"".getBytes(java.nio.charset.StandardCharsets.UTF_8))) == Right("foo")
)
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new tests repeat Chunk.fromArray(...getBytes(StandardCharsets.UTF_8)) in multiple places. This spec already uses helpers like charSequenceToByteChunk(...) for readability and consistency; consider using the existing helper here to reduce noise and avoid repeating charset plumbing in each assertion.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

JsonCodec decoder accepts malformed JSON string with extra characters

3 participants