Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add streaming deserialization support for kotlinx.serialization.json #4166

Open
wants to merge 7 commits into
base: trunk
Choose a base branch
from

Conversation

0neel
Copy link

@0neel 0neel commented Jun 8, 2024

Follows up JakeWharton/retrofit2-kotlinx-serialization-converter#43 by introducing a new ConverterFactory specific for the Json format.

It uses Okio-based streaming API to deserialize responses.

@ExperimentalSerializationApi
class SerializerFromJson(override val format: Json) : Serializer() {
override fun <T> fromResponseBody(loader: DeserializationStrategy<T>, body: ResponseBody): T {
val stream = body.byteStream()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This incurs a lot of copies. It's probably better to depend on the Okio version of the json artifact and use the Okio-based steaming APIs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Migrated to the Okio-base APIs. Could you please re-check?

@@ -12,7 +12,7 @@ import okhttp3.MediaType
import okhttp3.RequestBody
import okhttp3.ResponseBody

internal sealed class Serializer {
abstract class Serializer {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep this and Factory internal to this module. We don't need to share anything—the implementation isn't very big and we don't need as much indirection.

Copy link
Author

@0neel 0neel Jun 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, created module-specific Factory and converters implementations.

@0neel 0neel force-pushed the kotlinx_json_streaming branch from 80bf187 to 0941c81 Compare June 10, 2024 19:47
@0neel 0neel requested a review from JakeWharton June 10, 2024 21:10
@0neel
Copy link
Author

0neel commented Jun 24, 2024

@JakeWharton is there anything else I can do to proceed with the PR?

@JakeWharton
Copy link
Collaborator

JakeWharton commented Jun 25, 2024

Sorry I haven't forgotten about this. I was waffling on what I wanted to do based on this comment: Kotlin/kotlinx.serialization#253 (comment). Specifically,

So the only option I see here is to eventually make kotlinx-io Source/Sink the default choice, and to make byte array adapters that would simply consume/write to it fully. It is the same non-ideal situation as before, but I think for real-world applications (writing to file, sending to network), IO-first implementation is preferable.

@JakeWharton
Copy link
Collaborator

I guess in that case we can simply deprecate this converter and add the streaming to the normal one, assuming the APIs make themselves available on the high-level "format"-suffixed types...

@0neel
Copy link
Author

0neel commented Jun 25, 2024

I guess in that case we can simply deprecate this converter and add the streaming to the normal one

Sounds reasonable. Until then we could use this Json-specific implementation.
Let me know if there is anything I should do before this PR can be merged.

@JakeWharton
Copy link
Collaborator

Yep I'll get it in here soon. Probably Monday

@janseeger
Copy link

Any news? This hasn't been merged, nor is the streaming API in the regular KotlinX Serialization Converter.

@p-dmitriy
Copy link

@JakeWharton is there plans to merge this?

@cbeyls
Copy link

cbeyls commented Nov 27, 2024

The Okio-based API of Kotlinx Serialization has terrible performance (more than 2 times slower than String last time I checked) because it basically decodes one codepoint at a time; I would advise not to use it and use the java InputStream API instead for real gains.

@JakeWharton
Copy link
Collaborator

Streaming is always slower than buffered, yes.

The point of using the streaming APIs is that you get to amortize the cost of serialization and deserialization across the cost of the network transfer, which is orders of magnitude slower.

So instead of paying 100ms to serialize and 1000ms to transfer you effectively only pay 1000ms with streaming. Whether serialization took 100ms, 200ms, or 500ms becomes irrelevant because the limiting factor is always the network.

@cbeyls
Copy link

cbeyls commented Nov 27, 2024

Why not pay even less, especially when reading the response back from the disk cache?
Since Retrofit is a Java-only project, there is no requirement to use that API.

@JakeWharton
Copy link
Collaborator

Okio costs less than InputStream when streaming as there are no array copies across layers.

@cbeyls
Copy link

cbeyls commented Nov 27, 2024

Okio costs less than InputStream when streaming as there are no array copies across layers.

It costs less when performing comparable operations. But last time I checked, it takes much more CPU time to decode chars one by one from a BufferedSource than bulk copy a byte array then bulk decode it to a char array. Also, the Kotlinx serialization InputStream implementation recycles its byte arrays in a pool similarly to Okio, so it's quite well optimized.

See related issue in the Kotlin Serialization project: Kotlin/kotlinx.serialization#2743

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants