Use setRange when copying output chunks to the final buffer in `CodedBufferWriter`

Similar to google#885, this lets more of the buffer copying be done with `memcpy`.

Because the chunks can be very small, we keep the simple loop for small chunks
and use `setRange` for large chunks.

The chunk size at or below which the loop is used is chosen somewhat
arbitrarily as 20. In many benchmarks the chunks are either very small (less
than 10 bytes) or more than a hundred bytes, so in the benchmarks I've tried it
doesn't matter whether the threshold is 20, 30, or 40.
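
As a rough illustration of the loop-or-`setRange` choice described above, here is a minimal standalone sketch. The names `copyChunk` and `smallCopyThreshold` are hypothetical and do not appear in the package. The sketch only mirrors the idea of the change; it is not the code in `CodedBufferWriter`.

```dart
import 'dart:typed_data';

/// Hypothetical constant mirroring the threshold of 20 mentioned above.
const int smallCopyThreshold = 20;

/// Copies [count] bytes from [src], starting at [srcPos], into [dst],
/// starting at [dstPos]. Returns the next write position in [dst].
int copyChunk(
    Uint8List dst, int dstPos, Uint8List src, int srcPos, int count) {
  if (count <= smallCopyThreshold) {
    // Small copy: a plain loop, which avoids allocating a view object.
    for (var i = 0; i < count; i++) {
      dst[dstPos + i] = src[srcPos + i];
    }
  } else {
    // Large copy: pass a typed-data view of the source range to setRange,
    // which a runtime may be able to perform as a bulk copy.
    final slice = Uint8List.sublistView(src, srcPos, srcPos + count);
    dst.setRange(dstPos, dstPos + count, slice);
  }
  return dstPos + count;
}

void main() {
  final src = Uint8List.fromList(List<int>.generate(64, (i) => i));
  final dst = Uint8List(64);
  var pos = copyChunk(dst, 0, src, 0, 8); // takes the loop path
  pos = copyChunk(dst, pos, src, 8, 56); // takes the setRange path
  print(pos);
}
```

Both paths leave `dst` with the same contents. The threshold only decides which copying mechanism is used.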

The slowdown in dart2js is probably caused by allocating the `Uint8List` view
that is passed to `setRange` as the source.

Results are from the same benchmark as in google#885. The "Before" numbers
include the changes in google#886.

|                              | Before     | After      | Diff                |
|------------------------------|------------|------------|---------------------|
| AOT                          | 122,741 us | 114,582 us | - 8,159 us, -6.6%   |
| JIT                          |  94,880 us |  92,317 us | - 2,483 us, -2.6%   |
| dart2js -O4                  | 258,250 us | 266,000 us | + 7,750 us, +3.0%   |
| dart2wasm --omit-type-checks | 195,300 us | 169,166 us | -26,134 us, -13.3%  |

AOT and JIT tested on x64.
osa1 committed Oct 24, 2023
1 parent 050c162 commit fb82ad6
Showing 1 changed file with 16 additions and 6 deletions.
22 changes: 16 additions & 6 deletions protobuf/lib/src/protobuf/coded_buffer_writer.dart
@@ -141,22 +141,32 @@ class CodedBufferWriter {
           }
           buffer[outPos++] = v;
         } else {
-          // action is an amount of bytes to copy from _outputChunks into the
-          // buffer.
+          // `action` is an amount of bytes to copy from `_outputChunks` into
+          // the buffer.
           var bytesToCopy = action;
           while (bytesToCopy > 0) {
             final Uint8List chunk = _outputChunks[chunkIndex];
             final int bytesInChunk = _outputChunks[chunkIndex + 1];

-            // Copy at most bytesToCopy bytes from the current chunk.
+            // Copy at most `bytesToCopy` bytes from the current chunk.
             final leftInChunk = bytesInChunk - chunkPos;
             final bytesToCopyFromChunk =
                 leftInChunk > bytesToCopy ? bytesToCopy : leftInChunk;
             final endPos = chunkPos + bytesToCopyFromChunk;
-            while (chunkPos < endPos) {
-              buffer[outPos++] = chunk[chunkPos++];
+
+            if (bytesToCopyFromChunk <= 20) {
+              while (chunkPos < endPos) {
+                buffer[outPos++] = chunk[chunkPos++];
+              }
+              bytesToCopy -= bytesToCopyFromChunk;
+            } else {
+              final chunkSlice = Uint8List.sublistView(chunk, chunkPos, endPos);
+              buffer.setRange(
+                  outPos, outPos + bytesToCopyFromChunk, chunkSlice);
+              chunkPos += bytesToCopyFromChunk;
+              outPos += bytesToCopyFromChunk;
+              bytesToCopy -= bytesToCopyFromChunk;
             }
-            bytesToCopy -= bytesToCopyFromChunk;

             // Move to the next chunk if the current one is exhausted.
             if (chunkPos == bytesInChunk) {