What changes were proposed in this pull request?
Demo zero-copy in GrpcService, including GrpcClientProtocolService and GrpcServerProtocolService (appendEntries).
This PR is for an early review to get suggestions for correctly shaping the code.
Zero-copy is done with a simple trick: any protobuf ByteString object that is parsed refers to the original netty buffers instead of holding a separate copy on the heap. This avoids copying data to heap memory and thus saves the cost of the buffer copy and of GC (for the intermediate heap buffers).
Yet it comes with a challenge: the application needs to explicitly close the original netty buffers once it knows that the original protobuf objects (and their descendants) are no longer needed. In Ratis, this means deciding when a LogEntryProto is no longer used.
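To illustrate why the original buffer must outlive every object parsed from it, here is a minimal sketch using only java.nio. It is not the code in this PR: ZeroCopySketch, view and copy are hypothetical names, and a plain direct ByteBuffer stands in for the netty buffer that a zero-copy ByteString would reference.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustration only: a direct ByteBuffer stands in for the netty buffer. */
public class ZeroCopySketch {

  /** Zero-copy: the returned buffer shares memory with the source. */
  public static ByteBuffer view(ByteBuffer source, int offset, int length) {
    ByteBuffer dup = source.duplicate();  // independent position/limit, same memory
    dup.limit(offset + length);
    dup.position(offset);
    return dup.slice();
  }

  /** Copying: the returned array lives on the heap, independent of the source. */
  public static byte[] copy(ByteBuffer source, int offset, int length) {
    byte[] out = new byte[length];
    ByteBuffer dup = source.duplicate();
    dup.position(offset);
    dup.get(out);
    return out;
  }

  public static void main(String[] args) {
    ByteBuffer network = ByteBuffer.allocateDirect(16);
    network.put("hello-ratis".getBytes(StandardCharsets.US_ASCII));
    network.flip();

    ByteBuffer zeroCopy = view(network, 0, 5);
    byte[] copied = copy(network, 0, 5);

    // Simulate the buffer being recycled while the parsed object is still in use.
    network.put(0, (byte) 'J');

    System.out.println((char) zeroCopy.get(0)); // prints J: the view sees the change
    System.out.println((char) copied[0]);       // prints h: the copy is unaffected
  }
}
```

The view costs no allocation and no copy, but it is only valid while the source buffer has not been released or reused, which is exactly the lifetime question described above.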
Today, Ratis caches LogEntryProto in SegmentedRaftLogCache. However, for data-intensive applications like Apache Ozone, the cached log entries get their StateMachine data truncated and Ratis relies on the StateMachine to cache the StateMachine data. This behavior is defined by the config raft.server.log.statemachine.data.caching.enabled.
This demo solves the cleanup problem with a DirectBufferCleaner that keeps track of all open original buffers (handled through an InputStream interface). The cleaner is invoked when:
raft.server.log.statemachine.data.caching.enabled, because the log size with StateMachine data truncated doesn't reflect the real size of the original buffer, and this defers cache eviction. We need another strategy for data-intensive StateMachines.
A quick thought: as a data-intensive StateMachine may cache data referring to the original buffers, we may need a new StateMachine API to tell the StateMachine when it should evict data (up to a particular index).
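As a rough sketch of the "evict up to a particular index" idea, the class below tracks one closeable resource per log index and releases everything at or below a given index. It is an assumption-laden illustration, not the DirectBufferCleaner in this PR: IndexedBufferCleanerSketch, track, releaseUpTo and openCount are hypothetical names, and AutoCloseable stands in for whatever handle owns the original buffer.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: tracks open buffers by log index and releases them in order. */
public class IndexedBufferCleanerSketch {
  private final ConcurrentSkipListMap<Long, AutoCloseable> open = new ConcurrentSkipListMap<>();

  /** Registers the buffer backing the entry at the given log index. */
  public void track(long index, AutoCloseable buffer) {
    AutoCloseable previous = open.put(index, buffer);
    if (previous != null) {
      closeQuietly(previous); // the entry at this index was replaced
    }
  }

  /** Closes every tracked buffer with index <= the given index; returns how many were closed. */
  public int releaseUpTo(long index) {
    NavigableMap<Long, AutoCloseable> head = open.headMap(index, true);
    int released = 0;
    Map.Entry<Long, AutoCloseable> e;
    while ((e = head.pollFirstEntry()) != null) {
      closeQuietly(e.getValue());
      released++;
    }
    return released;
  }

  public int openCount() {
    return open.size();
  }

  private static void closeQuietly(AutoCloseable c) {
    try {
      c.close();
    } catch (Exception ignored) {
      // A real implementation would log this.
    }
  }

  public static void main(String[] args) {
    AtomicInteger closed = new AtomicInteger();
    IndexedBufferCleanerSketch cleaner = new IndexedBufferCleanerSketch();
    for (long i = 1; i <= 3; i++) {
      cleaner.track(i, () -> closed.incrementAndGet());
    }
    System.out.println(cleaner.releaseUpTo(2)); // prints 2
    System.out.println(cleaner.openCount());    // prints 1
  }
}
```

Keying by log index rather than by entry size sidesteps the problem that a truncated entry under-reports the size of its original buffer, at the price of requiring some component (for example the StateMachine) to say which index is safe to release.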
This demo also has a fix that prevents RaftId-like objects (RaftPeerId, RaftGroupId) from referring to the original data source, because that is not zero-copy friendly. This fix will go in as a separate PR.
The code in this demo is not well structured. For ease of making the demo, I put DirectBufferCleaner in ratis-server so that it can be invoked directly from ratis-server code. The component should live in ratis-grpc and be invoked by subscribing to events from RaftServer.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/RATIS-1925
https://issues.apache.org/jira/browse/RATIS-1934
How was this patch tested?
To be tested.