Compute WAL size and use it during retention size checks #651

dipack95 · 2019-07-03T15:33:26Z

Signed-off-by: Dipack P Panjabi [email protected]

Compute the size of the WAL and include it when checking whether the max retention size limit has been exceeded.

(PR #650 broke somehow, and so this is a new one with the same changes)
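
To make the intent concrete, here is a minimal sketch of a size-based retention check that counts the WAL towards the limit. The `block` type, the `beyondSizeRetention` name and its parameters are assumptions made for illustration, not the actual tsdb code.

```go
package main

import "fmt"

// block is a hypothetical stand-in for a persisted block on disk.
type block struct {
	name string
	size int64 // bytes on disk
}

// beyondSizeRetention returns the blocks that would have to be deleted so
// that walSize plus the kept blocks fits within maxBytes.
// blocks must be ordered newest first; maxBytes <= 0 disables the check.
func beyondSizeRetention(blocks []block, walSize, maxBytes int64) []block {
	if maxBytes <= 0 {
		return nil
	}
	total := walSize // the WAL counts towards the limit before any block does
	for i, b := range blocks {
		total += b.size
		if total > maxBytes {
			// This block and every older one push the total over the limit.
			return blocks[i:]
		}
	}
	return nil
}

func main() {
	blocks := []block{{"newest", 40}, {"middle", 40}, {"oldest", 40}}
	fmt.Println(len(beyondSizeRetention(blocks, 0, 100)))  // WAL ignored: prints 1
	fmt.Println(len(beyondSizeRetention(blocks, 30, 100))) // WAL included: prints 2
}
```

With the WAL ignored, only the oldest block is over the limit; once 30 bytes of WAL are counted, the middle block has to go as well.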

krasi-georgiev · 2019-07-03T18:41:08Z

wal/wal.go

+	if first == -1 || last == -1 {
+		return 0, fmt.Errorf("no segments found in WAL")
+	}
+	return int64((last - first + 1) * w.segmentSize), nil


What if the last segment is not complete?

dipack95 · 2019-07-05

As far as I understand, the WAL.Segments() call returns the indices of the first and last segments on disk. Since segment files are created on disk when they're initialised, the function will return the index of the current, incomplete segment as last.

krasi-georgiev · 2019-07-05

That is right, and the last segment will not be the full w.segmentSize.

dipack95 · 2019-07-05

So I do not need to modify my WAL size calculations, right?

krasi-georgiev · 2019-07-05

No, I mean that here you assume all segments are of size w.segmentSize, but in reality the last segment is not.

dipack95 · 2019-07-05

Okay, I've modified the code to now read the contents of the data/wal/ directory and add the sizes of all the files, including the checkpoints.

krasi-georgiev · 2019-07-05

I preferred the other way; we just need to figure out how to calculate it for the active segment.

dipack95 · 2019-07-05 (edited)

Calculating the WAL size from the segments also scans the WAL dir and returns the names of the segment files, the size of each of which we assume to be w.segmentSize. But as @brian-brazil said, it's not guaranteed that each of the segments is fully occupied. Summing the file sizes gives us the true occupied size on disk, which is more accurate, and we don't have to account for the active segment's page sizes either.

Of course, I'm open to suggestions if there's a better way of accurately determining the size that you can think of!

dipack95 · 2019-07-09

@krasi-georgiev Were you able to take a look at an alternative approach?

krasi-georgiev · 2019-07-09

Not yet, probably next week. I haven't forgotten, I just need to finish some other things I have started.

krasi-georgiev · 2019-07-16T14:44:10Z

Just had a quick look:

The wal.Reader tracks the total bytes processed: https://github.com/prometheus/tsdb/blob/b40cc43958a563aa06a73b2c8d9090e66109ad1b/wal/reader.go#L33.

After loading the WAL with head.Init, you will have the total WAL size on disk so far.

After that, wal.Log can add the newly written bytes to that total.

So the total WAL size would be the size when loading the WAL plus any samples added afterwards with wal.Log.

@codesome @gouthamve what do you think?
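
A rough sketch of the bookkeeping proposed above, assuming a hypothetical `walSizeTracker` type that is not part of the wal package: the counter is seeded with the bytes read during replay and then incremented on every append. It deliberately ignores segments that are later removed from disk.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// walSizeTracker is a hypothetical helper that keeps a running WAL size.
type walSizeTracker struct {
	bytes int64 // accessed atomically
}

// initFromReplay seeds the counter with the bytes read while replaying the WAL.
func (t *walSizeTracker) initFromReplay(readBytes int64) {
	atomic.StoreInt64(&t.bytes, readBytes)
}

// add records n more bytes appended to the WAL.
func (t *walSizeTracker) add(n int) {
	atomic.AddInt64(&t.bytes, int64(n))
}

// size returns the current running total.
func (t *walSizeTracker) size() int64 {
	return atomic.LoadInt64(&t.bytes)
}

func main() {
	var tracker walSizeTracker
	tracker.initFromReplay(4096) // e.g. total bytes reported after replay
	tracker.add(128)             // bytes written by a later append
	tracker.add(64)
	fmt.Println(tracker.size()) // prints 4288
}
```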

brian-brazil · 2019-07-16T15:00:33Z

Will that still work with compression?

krasi-georgiev · 2019-07-16T15:05:26Z

I just double checked, and YES it will work as Reader.total tracks total bytes read from the disk and the decompression happens after that.

krasi-georgiev · 2019-07-16T15:08:34Z

And wal.Log can track the additional bytes that go in after this:
https://github.com/prometheus/tsdb/blob/b40cc43958a563aa06a73b2c8d9090e66109ad1b/wal/wal.go#L571-L581

dipack95 · 2019-07-16T15:28:27Z

Do you think it makes sense to update the WAL size after the pages have been flushed to disk as part of wal.flushPage(..)?

krasi-georgiev · 2019-07-16T16:49:32Z

Why then, and not when doing the wal.Log?

dipack95 · 2019-07-16T17:00:18Z

There could be a possibility that a page cannot be flushed to disk, in which case the byte count that we add (I assume) immediately after optionally compressing the write buffer would differ from the actual written count. Similarly, the actual number of bytes written could differ from the size of the buffer actually passed to the wal.segment.Write(..) call.

krasi-georgiev · 2019-07-17T09:30:58Z

> There could be a possibility that a page cannot be flushed to disk, in which case the byte count that we add (I assume) immediately after optionally compressing the write buffer would differ from the actual written count.

When a flush fails, Prometheus will exit with an error, so the next time it starts the WAL size will be populated correctly. This means that we shouldn't worry about failed flushes.

> Similarly, the actual number of bytes written could differ from the size of the buffer actually passed to the wal.segment.Write(..) call.

Yes, you are right here. Best to use n to populate the total written to disk.

https://github.com/prometheus/tsdb/blob/b40cc43958a563aa06a73b2c8d9090e66109ad1b/wal/wal.go#L471

dipack95 · 2019-07-17T13:30:44Z

> When a flush fails, Prometheus will exit with an error, so the next time it starts the WAL size will be populated correctly. This means that we shouldn't worry about failed flushes.

Makes sense.

And then we could subtract the size of the newly created block from the WAL size, at this line:

https://github.com/prometheus/tsdb/blob/7dd5e177aa89828b443e7a331610958d25dc8355/db.go#L459

krasi-georgiev · 2019-07-17T13:41:18Z

Aaah, wait: checkpointing deletes WAL files, so I need to look at the code to see how to handle that as well.

dipack95 · 2019-07-17T13:59:12Z

I was unable to find a direct reference to the size of the chunks written to the block on disk. I was thinking maybe we could modify this function to return the number of chunk bytes written, the total number of bytes written (including meta, tombstone, index), and error?

https://github.com/prometheus/tsdb/blob/7dd5e177aa89828b443e7a331610958d25dc8355/compact.go#L526

This way, we could pass it up the call chain.

krasi-georgiev · 2019-07-17T14:23:33Z

Compaction doesn't have much relation to WAL size.
I think the WAL checkpointing is the only blocker here.

dipack95 · 2019-07-17T14:33:13Z

The way that I understand it, the data is read from the head (and in turn the WAL) using the same compact(..) method, which we must take into account as it is removing data from the WAL and writing it to persistent storage. Or have I misunderstood the structure?

krasi-georgiev · 2019-07-17T14:40:06Z

I think there is always some duplicate data in the wal and last block. IIRC when loading the wal all samples after the maxt of the last block are ignored.

dipack95 · 2019-07-17T15:10:38Z

Okay. Based on this, maybe we could adopt the current approach of just reading the data/wal/ directory and using its size? It currently seems like there are a lot of forces in play when it comes to interacting with the WAL.

krasi-georgiev · 2019-07-19T14:43:13Z

There would be a way to handle all cases, but it would definitely be more complicated.
I am not strongly against the DirSize(..) approach and would like to hear what @bwplotka @codesome @gouthamve think.

dipack95 · 2019-08-01T19:52:22Z

@krasi-georgiev Gentle nudge :)

krasi-georgiev · 2019-08-02T03:10:08Z

Still waiting for some input from @bwplotka @codesome @gouthamve

codesome

It looks fine in general, but I am wondering if it would error out if we are calculating the size while files are being added or removed. I haven't given it much thought yet, will have a look.

wal/wal.go

dipack95 · 2019-08-02T15:33:07Z

Looks like the Windows and Linux Travis builds failed due to some permission issues.

krasi-georgiev · 2019-08-05T21:51:46Z

restarted the tests.

Why not remove the duplicate DirSize and use this one for the other tests as well?

dipack95 · 2019-08-06T14:00:59Z

@krasi-georgiev I can do that once it passes review I suppose.

Signed-off-by: Dipack P Panjabi <[email protected]>

dipack95 · 2019-08-09T14:35:47Z

Since there was no more activity on this PR, I've followed @krasi-georgiev's suggestion and converted both the WAL and the test code to use a single DirSize(..) implementation. :)

krasi-georgiev reviewed Jul 3, 2019

codesome mentioned this pull request Jul 17, 2019

storage.tsdb.retention.size exceeds storage prometheus/prometheus#5771

Closed

codesome reviewed Aug 2, 2019

Compute WAL size and use it during retention size checks

b2b41f3

Signed-off-by: Dipack P Panjabi <[email protected]>

dipack95 mentioned this pull request Aug 13, 2019

Compute WAL size and use it during retention size checks prometheus/prometheus#5886

Merged
