-
Notifications
You must be signed in to change notification settings - Fork 179
Compute WAL size and use it during retention size checks #651
base: master
Are you sure you want to change the base?
Conversation
wal/wal.go
Outdated
if first == -1 || last == -1 { | ||
return 0, fmt.Errorf("no segments found in WAL") | ||
} | ||
return int64((last - first + 1) * w.segmentSize), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about if the last segment is not complete?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand, the WAL.Segments()
call returns the indices of the first and last segments on disk. Considering the fact that segment files are created on disk when they're initialised, means that the function will return the index of the current, incomplete segment as last
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is right and the last segment will not be the full w.segmentSize
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I do not need to modify my WAL size calculations right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I mean here you assume that all segments are of size w.segmentSize
, but in reality the last segment is not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I've modified the code to now read the contents of the data/wal/
directory and add the sizes of all the files, including the checkpoints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I preffered the other way , just need to figure out how to calculate it for the active segment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When calculating the WAL size using the segments, it too scans the WAL dir and returns the names of the segment files, the size of which we assume to be w.segmentSize
, but as @brian-brazil said, its not necessary that each of the segments is fully occupied. This way we get the true occupied size on disk, which would be more accurate, and we wouldn't have to account for the active segments' pages sizes either.
Of course, I'm open to suggestions if there's a better way of accurately determining the size that you can think of!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@krasi-georgiev Were you able to take a look at an alternative approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not yet, probably next week. I haven't forgotten , just need to finish some other things I have started.
just had a quick look: the wal.Reader tracks the total bytes processed https://github.com/prometheus/tsdb/blob/b40cc43958a563aa06a73b2c8d9090e66109ad1b/wal/reader.go#L33. After loading the wal with head.Init you will have the total wal size so far on disk. after that in wal.Log can add to that total the additional added bytes so total wal size would be the size when loading the wal + any additional added samples with wal.Log @codesome @gouthamve what do you think? |
Will that still work with compression? |
I just double checked, and YES it will work as Reader.total tracks total bytes read from the disk and the decompression happens after that. |
and for the wal.Log can track the additional bytes that go in after this: |
Do you think it makes sense to update the WAL size after the pages have been flushed to disk as part of |
Why than and not when doing the wal.Log? |
There could be a possibility that a page cannot be flushed to disk, in which case the byte count that we add (I assume) immediately after optionally compressing the write buffer would differ from the actual written count. Similarly, the actual number of bytes written could differ from the size of the buffer actually passed to the |
When a flush fails Prometheus will exit with an error so the next time it start the wal size will be populated correctly. This means that we shouldn't worry about failed flushed.
yes you are right here. Best to use https://github.com/prometheus/tsdb/blob/b40cc43958a563aa06a73b2c8d9090e66109ad1b/wal/wal.go#L471 |
Makes sense. And then we could subtract the size of newly created block from the WAL size, at this line: https://github.com/prometheus/tsdb/blob/7dd5e177aa89828b443e7a331610958d25dc8355/db.go#L459 |
aaah , wait checkpointing deletes wal files so need to look at the code to see how to handle that as well. |
I was unable to find a direct reference to the size of the chunks written to the block on disk. I was thinking maybe we could modify this function to return the number of chunk bytes written, the total number of bytes written (including meta, tombstone, index), and error? https://github.com/prometheus/tsdb/blob/7dd5e177aa89828b443e7a331610958d25dc8355/compact.go#L526 This way, we could pass it up the call chain. |
compaction hasn't got much relation to WAL size. |
The way that I understand it, the data is read from the |
I think there is always some duplicate data in the wal and last block. IIRC when loading the wal all samples after the maxt of the last block are ignored. |
Okay. Based on this fact, maybe we could adopt the current approach of just reading the |
There would be a way to handle all cases, but it would def be more complicated. |
@krasi-georgiev Gentle nudge :) |
Still waiting for some input from @bwplotka @codesome @gouthamve |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks fine in general, but I am wondering if it would error out if we are calculating size while files are being added or removed. I haven't given much thought yet it, will have a look.
Looks like the Windows and Linux Travis builds failed due to some permission issues. |
restarted the tests. Why not remove the duplicate DirSize and use this one for the other tests as well? |
@krasi-georgiev I can do that once it passes review I suppose. |
Signed-off-by: Dipack P Panjabi <[email protected]>
Considering there was no more activity on this PR, I've followed @krasi-georgiev's suggestion and converted both the WAL and the test code to use a singular |
Signed-off-by: Dipack P Panjabi [email protected]
Compute the size of the WAL and use it while calculating if exceeding the max retention size limit.
(PR #650 broke somehow, and so this is a new one with the same changes)