hashdb cap and commit batch size config options #452
Conversation
…ns; add pebble batch size safety checks
// The max batch size is limited by the uint32 offsets stored in
// internal/batchskl.node, DeferredBatchOp, and flushableBatchEntry.
//
// Pebble limits the size to MaxUint32 (just short of 4GB) so that the exclusive
// end of an allocation fits in uint32.
//
// On 32-bit systems, slices are naturally limited to MaxInt (just short of
// 2GB).
// see: cockroachdb/pebble.maxBatchSize
This comment was copied and pasted from the cockroachdb/pebble repo.
It is a bit confusing here though; e.g., DeferredBatchOp is defined in cockroachdb/pebble, and geth doesn't reference it directly.
I had a hard time finding information about this.
I didn't even find internal/batchskl.node in cockroachdb/pebble.
// On 32-bit systems, slices are naturally limited to MaxInt (just short of
// 2GB).
// see: cockroachdb/pebble.maxBatchSize
maxBatchSize = (1<<31)<<(^uint(0)>>63) - 1
This was inspired by cockroachdb/pebble, but it is quite difficult to understand why it is defined the way it is here.
How about adding the comment that cockroachdb/pebble has here:
// oneIf64Bit is 1 on 64-bit platforms and 0 on 32-bit platforms.
oneIf64Bit = ^uint(0) >> 63

// MaxUint32OrInt returns min(MaxUint32, MaxInt), i.e.
// - MaxUint32 on 64-bit platforms;
// - MaxInt on 32-bit platforms.
// It is used when slices are limited to Uint32 on 64-bit platforms (the
// length limit for slices is naturally MaxInt on 32-bit platforms).
MaxUint32OrInt = (1<<31)<<oneIf64Bit - 1
And also explicitly comment that it was inspired by cockroachdb/pebble
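As a side note, the expression is easier to see with concrete values. This tiny standalone program (not part of the PR, just an illustration) shows what the constant works out to:

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	// 1 on 64-bit platforms, 0 on 32-bit platforms.
	oneIf64Bit := ^uint(0) >> 63

	// Same shape as the maxBatchSize expression under review.
	maxBatchSize := uint64(1<<31)<<oneIf64Bit - 1

	// On 64-bit this prints 4294967295 (math.MaxUint32);
	// on 32-bit it would be 2147483647 (math.MaxInt32).
	fmt.Println(maxBatchSize, maxBatchSize == math.MaxUint32)
}
```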
func (b *batch) Put(key, value []byte) error {
	// The size increase is argument in call to cockroachdb/pebble.Batch.grow in cockroachdb/pebble.Batch.prepareDeferredKeyValueRecord. pebble.Batch.grow may panic if the batch data size plus the increase reaches cockroachdb/pebble.maxBatchSize
Suggested change:
- // The size increase is argument in call to cockroachdb/pebble.Batch.grow in cockroachdb/pebble.Batch.prepareDeferredKeyValueRecord. pebble.Batch.grow may panic if the batch data size plus the increase reaches cockroachdb/pebble.maxBatchSize
+ // The size increase is an argument to the cockroachdb/pebble.Batch.grow call in cockroachdb/pebble.Batch.prepareDeferredKeyValueRecord. pebble.Batch.grow may panic if the batch data size plus the increase reaches cockroachdb/pebble.maxBatchSize
func (b *batch) Put(key, value []byte) error {
	// The size increase is argument in call to cockroachdb/pebble.Batch.grow in cockroachdb/pebble.Batch.prepareDeferredKeyValueRecord. pebble.Batch.grow may panic if the batch data size plus the increase reaches cockroachdb/pebble.maxBatchSize
	sizeIncrease := 1 + uint64(2*binary.MaxVarintLen32) + uint64(len(key)) + uint64(len(value))
I understand the comment before this line, but I don't understand why we need to add 1 + uint64(2*binary.MaxVarintLen32) to sizeIncrease.
I don't understand how a batch is encoded before it is written to the DB, so I don't understand why sizeIncrease is computed the way it is.
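For context (this is how I read pebble's batch encoding, not something introduced by this PR): each record in a pebble batch is a 1-byte kind tag, then a uvarint-encoded key length, the key bytes, and, for records that carry a value, a uvarint-encoded value length and the value bytes. binary.MaxVarintLen32 (5 bytes) is the worst case for each uvarint length prefix, so the expression is an upper bound on the encoded size of one "set" record. A sketch of that bound:

```go
package pebble

import "encoding/binary"

// worstCaseSetRecordSize upper-bounds the encoded size of one batch
// "set" record: a 1-byte kind tag, a uvarint key-length prefix, the
// key, a uvarint value-length prefix, and the value. MaxVarintLen32
// (5 bytes) is the largest a 32-bit uvarint prefix can get.
func worstCaseSetRecordSize(key, value []byte) uint64 {
	return 1 +
		uint64(binary.MaxVarintLen32) + uint64(len(key)) +
		uint64(binary.MaxVarintLen32) + uint64(len(value))
}
```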
	// The size increase is argument in call to cockroachdb/pebble.Batch.grow in cockroachdb/pebble.Batch.prepareDeferredKeyRecord. pebble.Batch.grow may panic if the batch data size plus the increase reaches cockroachdb/pebble.maxBatchSize
	sizeIncrease := 1 + uint64(binary.MaxVarintLen32) + uint64(len(key))
	// check if we fit within maxBatchSize
	if uint64(b.b.Len())+sizeIncrease >= maxBatchSize {
		// return an error instead of letting b.b.Delete to panic
		return ethdb.ErrBatchTooLarge
	}
The same code block is defined in batch.Put; how about creating a function to abstract it?
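Something like this could work (just a sketch; the helper name is made up):

```go
// sanityCheckSize is a hypothetical shared helper for Put and Delete:
// it rejects writes that would grow the underlying pebble batch past
// maxBatchSize instead of letting pebble panic internally.
func (b *batch) sanityCheckSize(sizeIncrease uint64) error {
	if uint64(b.b.Len())+sizeIncrease >= maxBatchSize {
		return ethdb.ErrBatchTooLarge
	}
	return nil
}
```

Put and Delete would then just compute their respective sizeIncrease and call this before appending to the batch.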
	CleanCacheSize int // Maximum memory allowance (in bytes) for caching clean nodes
}

// Defaults is the default setting for database if it's not specified.
// Notably, clean cache is disabled explicitly,
var Defaults = &Config{
	// Arbitrum:
	// default zeroes used to prevent need for correct initialization in all places used upstream
Not sure if I correctly understand this comment, but by default Config will be initialized with a zero IdealCapBatchSize even without explicitly setting it to zero.
if errors.Is(err, ethdb.ErrBatchTooLarge) {
	log.Warn("Pebble batch limit reached in hashdb Cap operation, flushing batch. Consider setting ideal cap batch size to a lower value.", "pebbleError", err)
	// flush batch & retry the write
	if err = batch.Write(); err != nil {
Should we add a log.Error("Failed to write flush list to disk", "err", err) here before returning?
The error returned by batch.Write() in the next code block already gets a log.Error before it is returned.
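That is, something like this (sketch only, assuming the surrounding code is the snippet quoted above):

```go
if err = batch.Write(); err != nil {
	// Mirror the handling of the later batch.Write call.
	log.Error("Failed to write flush list to disk", "err", err)
	return err
}
```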
idealBatchSize := uint(db.config.IdealCapBatchSize)
if idealBatchSize == 0 {
	idealBatchSize = uint(ethdb.IdealBatchSize)
}
Instead of checking this in every Cap and Commit call, how about setting db.config.IdealCapBatchSize to ethdb.IdealBatchSize when creating db.config, in case TrieCapBatchSize is zero?
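Roughly like this (a sketch; where exactly db.config gets built may differ from what I'm assuming here):

```go
// When constructing the hashdb config, fall back to the ethdb default
// once, so Cap and Commit can use config.IdealCapBatchSize directly.
if config.IdealCapBatchSize == 0 {
	config.IdealCapBatchSize = ethdb.IdealBatchSize
}
```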
This PR:
- adds config options to core.CacheConfig:
  - TrieCapBatchSize - write batch size threshold used during capping triedb size
  - TrieCommitBatchSize - write batch size threshold used during committing triedb to disk
- changes ethdb/pebble.batch.Put and ethdb/pebble.batch.Delete to return an error instead of getting a panic from the internals of pebble

part of NIT-3204
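For reference, using the new options would presumably look something like this (field names taken from the PR description; the values and surrounding fields are illustrative only, not defaults from this PR):

```go
cacheConfig := &core.CacheConfig{
	// ... existing cache settings ...

	// New in this PR (example values only):
	TrieCapBatchSize:    512 * 1024, // write batch size threshold while capping the triedb
	TrieCommitBatchSize: 512 * 1024, // write batch size threshold while committing the triedb to disk
}
_ = cacheConfig
```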