Hi, first of all thank you for sharing your code!
I was quite impressed by the reading speed, and I think that I found a way to make this part a bit faster:
// spawn a goroutine to read file in chunks and send it to the chunk channel for further processing
go func() {
	buf := make([]byte, chunkSize)
	leftover := make([]byte, 0, chunkSize)
	for {
		readTotal, err := file.Read(buf)
		if err != nil {
			if errors.Is(err, io.EOF) {
				break
			}
			panic(err)
		}
		buf = buf[:readTotal]

		toSend := make([]byte, readTotal)
		copy(toSend, buf)

		lastNewLineIndex := bytes.LastIndex(buf, []byte{'\n'})

		toSend = append(leftover, buf[:lastNewLineIndex+1]...)
		leftover = make([]byte, len(buf[lastNewLineIndex+1:]))
		copy(leftover, buf[lastNewLineIndex+1:])

		chunkStream <- toSend

Currently each loop iteration calls make([]byte) and copy twice. This could be reduced to once each by storing the leftover as a length instead of a byte slice:
buf := make([]byte, chunkSize)
leftover := 0
for {
	n, err := file.Read(buf[leftover:]) // append to the leftover
	if err != nil {
		if errors.Is(err, io.EOF) {
			break
		}
		panic(err)
	}
	toSend := buf[:leftover+n]
	lastNewLineIndex := bytes.LastIndexByte(toSend, '\n')
	buf = make([]byte, chunkSize) // prepare a new buffer for next read
	leftover = copy(buf, toSend[lastNewLineIndex+1:])
	chunkStream <- toSend[:lastNewLineIndex+1]
}
On a sample file, this code is about 10% faster than the current version.
For reference, the snippet quoted at the top is 1brc/main.go, lines 131 to 155 at commit 8513d5e.