Add CBOR kotlinx-io support by cedrickcooke · Pull Request #3149 · Kotlin/kotlinx.serialization

cedrickcooke · 2026-02-08T05:11:17Z

Looking to follow in the footsteps of #2707 with CBOR support for kotlinx-io.

(Sorry for the noise re #3148, accidentally pointed it at the wrong branch and couldn't find the UI to change it.)

pdvrieze

As more formats gain kotlinx-io support, they all need some kind of text or binary input/output abstraction. It makes sense for this to be more or less "internal", but such abstractions are not format-specific at all. As such, it makes more sense to put them into the core module (flagged as internal serialization API).

At some level it might even make sense to make these interfaces public, but that is probably better left for a separate discussion.

pdvrieze · 2026-02-09T11:41:29Z

formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Streams.kt

+@CborFriendModuleApi
+public interface Input {
+    public val availableBytes: Int
+    /** Returns a -1 if no bytes are available. Otherwise returns a value between 0 and 255 (inclusive). */
+    public fun read(): Int
+    public fun read(b: ByteArray, offset: Int, length: Int): Int
+    public fun skip(length: Int)
+}
+
+@CborFriendModuleApi
+public interface Output {
+    public fun write(buffer: ByteArray, offset: Int = 0, count: Int = buffer.size)
+    public fun write(byteValue: Byte)
+}


These types appear to be format independent and relevant to all binary formats. It would be worthwhile to consider whether these interfaces should be put in a common context.

Should I handle that as part of this PR? My original intent with this API was to be as non-invasive as possible. If so, I'm happy to make the change but I'm not sure where it should live. formats/io?

I would suggest leaving these base Input/Output interfaces in the core module, and changing the parameters of BinaryFormat and StringFormat to use them.

I think this is a separate discussion. We probably want string and binary outputs in core, so we can then plug in whatever we want. Shims for ByteArrayOutput/ByteArrayInput and Sink/Source (also for string-based IO) can then also move to core and core-io.

Just my 2 cents

pdvrieze · 2026-02-09T11:43:15Z

formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Streams.kt

+    public val availableBytes: Int
+    /** Returns a -1 if no bytes are available. Otherwise returns a value between 0 and 255 (inclusive). */


It is not clear what this means. What does availableBytes == 0 mean, and how do you deal with (pending) network reads where the bytes are not available yet (and the stream size is not known)?

Sorry, I didn't do the best job on the docs for Input/Output, I'll go back and update them 😅.

This one comment was me attempting to describe the pre-existing behavior of read in ByteArrayInput, since it took me a minute to figure it out. I had originally thought that this should return a Byte (the 0-255 value returned on the happy path), but there's a failure path if there's no more data available where it returns -1 (which forces the Int typing).

Re. network streams, I don't think this is even handled by kotlinx-io yet? It seems like the APIs presented are all synchronous (until Kotlin/kotlinx-io#163 has an implementation, at least).

In most cases (including Java IO streams), returning 0 means that nothing was read on this call, but a later call may return data again (although most APIs that support blocking will block instead of returning 0). A negative return value (typically -1) tends to mean end of file/stream.
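The convention described in this thread, where read() is widened to Int so that -1 can signal end of input while data bytes map to 0..255, can be sketched as follows. This is only an illustration: `SketchByteArrayInput` is a hypothetical name and not the library's actual ByteArrayInput.

```kotlin
// Hypothetical sketch, not the actual kotlinx.serialization ByteArrayInput.
class SketchByteArrayInput(private val data: ByteArray) {
    private var position = 0

    /** Returns the next byte as an unsigned value in 0..255, or -1 at end of input. */
    fun read(): Int =
        // `and 0xFF` strips the sign extension, so a data byte can never collide with -1.
        if (position < data.size) data[position++].toInt() and 0xFF else -1
}
```

Because Kotlin's Byte is signed, returning it directly would make the byte 0xFF indistinguishable from an end-of-input marker of -1, which is why the return type has to be Int.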

sandwwraith · 2026-02-23T13:54:27Z

RE: Input/Output general interfaces

When kotlinx-io is 1.0, that will no longer be needed. Its Sink and Source can be directly added to a BinaryFormat or BinaryIoFormat. Implementing encoding to/from ByteArray should be trivial via sinks, so it would be enough for the implementation to write/read Sink/Source.

sandwwraith

In short, the current Input/ByteArrayInput API does not fit the task because it was created to mimic the Java stream API, and kotlinx-io has different ideas in mind. Try to form the API based on what the actual CborDecoder needs.

Also, it may interfere with other work in CBOR (#3036), so maybe it's worth coordinating work in advance, or waiting for the PR instead. cc @JesusMcCloud @fzhinkin

sandwwraith · 2026-02-23T14:11:19Z

formats/cbor-io/commonMain/src/kotlinx/serialization/cbor/io/internal/IoCborStreams.kt

+
+internal class IoStreamInput(private val source: Source): Input {
+    override val availableBytes: Int
+        get() = source.peek().readByteArray().size


Using peek and reading the whole input into memory is a huge potential performance problem. exhausted() should be used instead. Yes, it returns a Boolean, so the call sites of availableBytes should be adapted as well.
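One way to read this suggestion: a call site that only needs to know whether any input remains can ask a yes/no question instead of requesting a byte count. The sketch below uses plain Kotlin with hypothetical names (`SketchInput`, `SketchArrayInput`, `readAll`); it is not the PR's interface and not the kotlinx-io API, and it assumes the decoder never needs the exact remaining size.

```kotlin
// Hypothetical stand-in for the PR's Input interface; all names are illustrative only.
interface SketchInput {
    /** True when no more bytes can be read. */
    fun exhausted(): Boolean
    fun readByte(): Byte
}

class SketchArrayInput(private val data: ByteArray) : SketchInput {
    private var position = 0

    override fun exhausted(): Boolean = position >= data.size

    override fun readByte(): Byte {
        if (exhausted()) throw NoSuchElementException("End of input")
        return data[position++]
    }
}

// A call site that only asks "is there more?" never needs the remaining byte count,
// so nothing has to be buffered or copied to answer it.
fun readAll(input: SketchInput): List<Byte> {
    val result = mutableListOf<Byte>()
    while (!input.exhausted()) result += input.readByte()
    return result
}
```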

sandwwraith · 2026-02-23T14:13:50Z

formats/cbor-io/commonMain/src/kotlinx/serialization/cbor/io/internal/IoCborStreams.kt

+        get() = source.peek().readByteArray().size
+
+    override fun read(): Int =
+        try {


Using try-catch on the hot path is also a performance problem. read() seems to be called only from methods that read a byte/int/long/etc. These methods should be adapted to use the corresponding Source methods instead (readByte()/readInt()/etc.). That will also (probably) give us an IOException automatically, without the need for availableBytes (see above).

sandwwraith · 2026-02-23T14:16:50Z

formats/cbor-io/commonMain/src/kotlinx/serialization/cbor/io/internal/IoCborStreams.kt

+        }
+
+    override fun read(b: ByteArray, offset: Int, length: Int): Int =
+        source.readAtMostTo(b, startIndex = offset, endIndex = offset + length)


The semantics are different: readAtMostTo can read fewer bytes than length in order not to block, while Input.read expects all the bytes. ~~In this case, a reading loop should be implemented. @fzhinkin I do not remember why we don't have readExactTo in Source?~~

We have Source.readTo(sink: ByteArray, startIndex: Int = 0, endIndex: Int = sink.size). Use it instead. The return value of read(b, offset, len) is not needed anyway.
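The difference between "read at most" and "read exactly" can be illustrated without kotlinx-io. The sketch below models a source that may return short reads, plus the kind of loop the struck-through remark alluded to (and which Source.readTo makes unnecessary). `ChunkedSource`, `readAtMost`, and `readExactly` are hypothetical names introduced only for this illustration.

```kotlin
// Hypothetical source that hands out at most `chunk` bytes per call,
// mimicking a "read at most" contract. Not part of kotlinx-io.
class ChunkedSource(private val data: ByteArray, private val chunk: Int) {
    private var position = 0

    init {
        require(chunk > 0) { "chunk must be positive" }
    }

    /** Copies up to `length` bytes (never more than `chunk`); returns the count, or -1 at end. */
    fun readAtMost(dest: ByteArray, offset: Int, length: Int): Int {
        if (position >= data.size) return -1
        val n = minOf(length, chunk, data.size - position)
        data.copyInto(dest, offset, position, position + n)
        position += n
        return n
    }
}

/** Loops until exactly `length` bytes have been copied; fails if the input ends early. */
fun readExactly(source: ChunkedSource, dest: ByteArray, offset: Int, length: Int) {
    var done = 0
    while (done < length) {
        val n = source.readAtMost(dest, offset + done, length - done)
        if (n < 0) throw IllegalStateException("Input ended after $done of $length bytes")
        done += n
    }
}
```

A caller that treats a single readAtMost call as if it had filled the whole range would silently decode garbage after a short read, which is the mismatch this review comment points out.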

sandwwraith · 2026-02-23T14:17:49Z

formats/cbor-io/src/commonTest/kotlin/kotlinx/serialization/cbor/io/IoTests.kt

+import kotlinx.serialization.cbor.*
+import kotlin.test.*
+
+class IoTests {


Preferably, the whole test suite should be adapted to work with IO streams; see how it is done in JsonTestBase with JsonTestingMode.

Also, a pair of basic benchmarks (see CborBaseline.kt) would be nice.

sandwwraith · 2026-02-23T14:21:24Z

formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Encoder.kt


-
-//value classes are only inlined on the JVM, so we use a typealias and extensions instead
-private typealias Stack = MutableList<CborWriter.Data>


This change is unrelated, and I would prefer to keep the typealias for easier reading.

JesusMcCloud · 2026-02-25T14:47:56Z

formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Streams.kt

+@CborFriendModuleApi
+public interface Input {
+    public val availableBytes: Int
+    /** Returns a -1 if no bytes are available. Otherwise returns a value between 0 and 255 (inclusive). */
+    public fun read(): Int
+    public fun read(b: ByteArray, offset: Int, length: Int): Int
+    public fun skip(length: Int)
+}
+
+@CborFriendModuleApi
+public interface Output {
+    public fun write(buffer: ByteArray, offset: Int = 0, count: Int = buffer.size)
+    public fun write(byteValue: Byte)
+}


I think this is a separate discussion. We probably want string and binary outputs in core, so we can then plug in whatever we want. Shims for ByteArrayOutput/ByteArrayInput and Sink/Source (also for string-based IO) can then also move to core and core-io.

Just my 2 cents

JesusMcCloud · 2026-02-25T14:48:00Z

formats/cbor/commonMain/src/kotlinx/serialization/cbor/internal/Encoder.kt

-        structureStack.peek().bytes
+    private val structureStack = mutableListOf<Data>()
+    override fun getDestination(): Output =
+        structureStack.lastOrNull()?.bytes ?: output


This will interfere with #3036, but only minimally. I'd of course still prefer my humongous PR to be finalized at some point without adding even more layers to what is already challenging to review ;-)

Add CBOR kotlinx-io support

5f108b6

pdvrieze reviewed Feb 9, 2026

sandwwraith requested changes Feb 23, 2026

JesusMcCloud reviewed Feb 25, 2026

5 participants