@nossrannug

@nossrannug nossrannug commented Aug 26, 2025

Add configurable compression thresholds for Tonic

This PR adds runtime-configurable compression thresholds to improve performance for large message compression scenarios.
Changes:

  • Compression threshold: Messages smaller than TONIC_COMPRESSION_THRESHOLD (default 1024 bytes) are sent uncompressed
  • Spawn-blocking threshold: Messages larger than TONIC_SPAWN_BLOCKING_THRESHOLD can be compressed on a blocking thread pool to avoid blocking the async runtime
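
A rough sketch of how the two thresholds could interact. The enum and function names are hypothetical and are not taken from the PR diff. It also assumes the spawn-blocking boundary is inclusive (`>=`).

```rust
/// How an outgoing message is handled (hypothetical names, for illustration only).
#[derive(Debug, PartialEq, Eq)]
enum CompressionPlan {
    SendUncompressed,
    CompressInline,
    CompressOnBlockingPool,
}

/// Picks a plan from the encoded message length and the two thresholds.
fn plan_for(
    len: usize,
    compression_threshold: usize,
    spawn_blocking_threshold: usize,
) -> CompressionPlan {
    if len < compression_threshold {
        // Too small to be worth compressing.
        CompressionPlan::SendUncompressed
    } else if len >= spawn_blocking_threshold {
        // Large enough that compression could stall an async worker (assumed inclusive bound).
        CompressionPlan::CompressOnBlockingPool
    } else {
        CompressionPlan::CompressInline
    }
}

fn main() {
    for len in [100usize, 2_000, 100_000] {
        println!("{} bytes -> {:?}", len, plan_for(len, 1024, 65_536));
    }
}
```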

Configuration:
Set via environment variables:

  • TONIC_COMPRESSION_THRESHOLD=4096 # Skip compression for messages < 4KB
  • TONIC_SPAWN_BLOCKING_THRESHOLD=65536 # Use blocking threads for messages >= 64KB
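
For illustration, reading these variables might look roughly like the sketch below. The helper names are hypothetical, and falling back to the default on an unparsable value is an assumption. The description does not state a default for the spawn-blocking threshold, so `usize::MAX` (never offload) is used here purely as a placeholder.

```rust
use std::env;

// Default stated in the PR description.
const DEFAULT_COMPRESSION_THRESHOLD: usize = 1024;

/// Parses a raw value, falling back to `default` when it is unset or not a valid usize.
fn parse_threshold(raw: Option<&str>, default: usize) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .unwrap_or(default)
}

/// Reads a threshold from the environment variable `name`.
fn threshold_from_env(name: &str, default: usize) -> usize {
    parse_threshold(env::var(name).ok().as_deref(), default)
}

fn main() {
    let compression =
        threshold_from_env("TONIC_COMPRESSION_THRESHOLD", DEFAULT_COMPRESSION_THRESHOLD);
    // Placeholder default (assumption): never move compression off the async worker.
    let spawn_blocking = threshold_from_env("TONIC_SPAWN_BLOCKING_THRESHOLD", usize::MAX);
    println!("compression threshold: {} bytes", compression);
    println!("spawn-blocking threshold: {} bytes", spawn_blocking);
}
```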

Benefits:

  • Avoids unnecessary compression overhead for small messages
  • Prevents large message compression from blocking async tasks
  • Maintains backward compatibility (defaults preserve existing behavior)

Implementation notes:

  • Added conditional compilation to support builds with/without tokio and compression features
  • Compression task polling happens immediately after spawning for optimal waker registration
  • Tests updated to verify threshold behavior

Fixes performance issues with large gRPC message compression blocking the tokio runtime.

@nossrannug nossrannug force-pushed the conditional-stream-compression branch from 76a268b to 449c7cc Compare August 26, 2025 12:35
Messages smaller than 1024 bytes are no longer compressed even when
compression is enabled. This optimization avoids the overhead of
compression for small messages, where compression would save little
or could even increase the message size.

The threshold is set to 1024 bytes (matching UNCOMPRESSED_MIN_BODY_SIZE
used in tests). Messages below this threshold are sent uncompressed
with the compression flag unset in the gRPC header.
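
For context, the "compression flag" is the first byte of the gRPC length-prefixed message frame, followed by a 4-byte big-endian payload length. A minimal sketch of that framing, with a hypothetical function name that is not Tonic's internal API:

```rust
/// Builds a gRPC length-prefixed message:
/// 1-byte compressed flag, 4-byte big-endian length, then the payload.
fn frame_message(payload: &[u8], compressed: bool) -> Vec<u8> {
    assert!(payload.len() <= u32::MAX as usize, "payload too large for a gRPC frame");
    let len = payload.len() as u32;

    let mut frame = Vec::with_capacity(5 + payload.len());
    frame.push(u8::from(compressed)); // 0 = uncompressed, 1 = compressed
    frame.extend_from_slice(&len.to_be_bytes());
    frame.extend_from_slice(payload);
    frame
}

fn main() {
    // A small message sent below the threshold: flag byte stays 0.
    let frame = frame_message(b"hi", false);
    println!("{:?}", frame);
}
```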
@nossrannug nossrannug force-pushed the conditional-stream-compression branch from 449c7cc to 7d51ce6 Compare October 2, 2025 12:44
When compressing large payloads, we use spawn_blocking to move the job to a separate thread so that it does not block the async worker thread.
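
The idea can be sketched without any dependencies. Note the substitution: this uses `std::thread::spawn` in place of tokio's `spawn_blocking` so that the example compiles on its own, and the "compressor" is a lossy placeholder closure, not a real codec. All names are hypothetical.

```rust
use std::thread::{self, JoinHandle};

/// Runs `compress` on a separate OS thread and hands back a handle to the result,
/// so the calling thread is free to do other work in the meantime.
fn offload_compression<F>(payload: Vec<u8>, compress: F) -> JoinHandle<Vec<u8>>
where
    F: FnOnce(&[u8]) -> Vec<u8> + Send + 'static,
{
    thread::spawn(move || compress(&payload))
}

fn main() {
    // Placeholder "compressor": collapses consecutive duplicate bytes (lossy, illustration only).
    let placeholder = |bytes: &[u8]| {
        let mut out = bytes.to_vec();
        out.dedup();
        out
    };

    let handle = offload_compression(vec![1, 1, 2, 2, 2, 3], placeholder);
    println!("caller thread keeps running while the job executes");

    let result = handle.join().expect("compression thread panicked");
    println!("result: {:?}", result);
}
```

In an async context the returned handle would be awaited rather than joined, which is what keeps the worker thread free.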
@nossrannug nossrannug force-pushed the conditional-stream-compression branch 2 times, most recently from 5aa700e to 14a3a64 Compare October 27, 2025 12:07
@nossrannug nossrannug force-pushed the conditional-stream-compression branch from 14a3a64 to 472812a Compare October 27, 2025 13:04
@LucioFranco
Member

@nossrannug hi, sorry for the delay on this. Do you have any background on whether this is something other implementations of gRPC do? I'd ideally not like to diverge too much from them.

@dfawley
Collaborator

dfawley commented Nov 6, 2025

This is not a feature any other gRPC library offers. It has been discussed, but never implemented. We do provide the ability to selectively compress messages on the stream -- except for Go, which is missing it. We also provide the ability to control compression on a per-stream basis in all languages.

I wouldn't be opposed to Tonic having this feature, but the new grpc-rust channel will not provide it unless it's a feature that goes through the gRFC process (see https://github.com/grpc/proposal).

@nossrannug
Author

@LucioFranco, thanks for the comment. I didn't check how others do this before making this change. I just thought of the way Starlette does its gzip middleware and went with that, assuming that was the way 😄

The main driver behind this PR is that I have a server-streaming RPC that first sends a large state to clients and afterwards sends individual, very small messages with incremental updates as the state changes. It also occasionally sends empty messages.
After enabling compression in tonic, the load on the service increased and the throughput went down, because we were applying compression to very small payloads, which just made them bigger. So we're spending CPU time compressing a payload that gets larger and then sending the larger payload.

I just checked, and the C++ core library always compresses a message (when the client accepts compression) and discards the output if it's larger than the input, sending the input uncompressed instead. The server implementation controls per-message compression using WriteOptions:

class MyServiceImpl final : public ExampleService::Service {
 public:
  Status StreamThings(ServerContext* context,
                      const StreamRequest* request,
                      ServerWriter<StreamReply>* writer) override {
    StreamReply reply;

    // Send first message compressed (default)
    reply.set_message("large_payload_data");
    writer->Write(reply);  // normal compression applies

    // Send second message uncompressed
    reply.set_message("small_ping");
    WriteOptions opts;
    opts.set_no_compression();  // disable compression for this message only
    writer->Write(reply, opts);

    return Status::OK;
  }
};

If we want to follow the cpp implementation I can update the PR to:

  • remove env vars and the compression threshold check
  • dismiss the compressed payload if it's larger than the uncompressed payload and send uncompressed
  • find a way so the server can control compression per message before sending it (I don't think tonic has that today for server streaming)
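
A rough sketch of the second bullet (compress first, keep the result only if it is smaller). The run-length encoder is a toy stand-in for a real codec such as gzip, the function names are hypothetical, and treating an equal-sized result as "not worth it" is an assumption.

```rust
/// Toy run-length encoder: emits (run_length, byte) pairs. Stand-in for a real compressor.
fn toy_compress(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let byte = input[i];
        let mut run = 1usize;
        while i + run < input.len() && input[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

/// Returns (compressed_flag, bytes_to_send).
/// The compressed output is kept only when it is strictly smaller than the input.
fn choose_payload(payload: &[u8]) -> (bool, Vec<u8>) {
    let compressed = toy_compress(payload);
    if compressed.len() < payload.len() {
        (true, compressed)
    } else {
        // Compression did not help: discard it and send the original bytes.
        (false, payload.to_vec())
    }
}

fn main() {
    let (flag, bytes) = choose_payload(b"ping");
    println!("small message: compressed={}, {} bytes", flag, bytes.len());

    let (flag, bytes) = choose_payload(&[0u8; 1000]);
    println!("large repetitive message: compressed={}, {} bytes", flag, bytes.len());
}
```

This still pays the CPU cost of compressing every message, so it addresses the "larger payload on the wire" problem but not the wasted work on tiny updates.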

Does that sound like a reasonable plan?

@nossrannug
Author

@dfawley, I just now noticed your comment. Thanks for joining the conversation. I'd be more than happy to take this through all the right channels. I'll take a look at https://github.com/grpc/proposal and see if I can get the ball rolling.

@dfawley
Collaborator

dfawley commented Nov 6, 2025

@nossrannug if you are going to broach this issue, then be aware that there has been prior discussion on the topic. There is probably some appetite for a feature like this, but there might be folks that want it to be more complicated -- or be extensible so it could become more complicated later -- than simply using the size of the message to be sent. There are probably several other things to debate, like whether it's a client/server setting or whether it should be part of the service config (which tonic doesn't support), whether it should be on by default, etc.

In the meantime, I think if you can find a way to add per-message configuration of compression in tonic, that would be ideal as it would match the other grpc implementations.
