
hashdb cap and commit batch size config options #452


Open

wants to merge 6 commits into master

Conversation

magicxyyz (Contributor) commented May 7, 2025

This PR:

  • adds two new config options to core.CacheConfig:
    • TrieCapBatchSize - write batch size threshold used when capping the triedb size
    • TrieCommitBatchSize - write batch size threshold used when committing the triedb to disk
  • adds internal batch size checks in ethdb/pebble.batch.Put and ethdb/pebble.batch.Delete so they return an error instead of triggering a panic inside pebble

part of NIT-3204
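
A minimal sketch of how the two new options could be set; only TrieCapBatchSize and TrieCommitBatchSize come from this PR, the other fields exist upstream in core.CacheConfig, and all values are illustrative:

	cacheConfig := &core.CacheConfig{
		TrieCleanLimit:      256,             // existing option: clean-node cache size (MB)
		TrieDirtyLimit:      256,             // existing option: dirty-node cache size (MB)
		TrieCapBatchSize:    4 * 1024 * 1024, // new: write batch threshold (bytes) used while capping the triedb
		TrieCommitBatchSize: 4 * 1024 * 1024, // new: write batch threshold (bytes) used while committing the triedb to disk
	}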

@magicxyyz magicxyyz self-assigned this May 23, 2025
@magicxyyz magicxyyz marked this pull request as ready for review May 27, 2025 21:01
@magicxyyz magicxyyz removed their assignment May 27, 2025
@magicxyyz magicxyyz marked this pull request as draft May 27, 2025 21:30
@magicxyyz magicxyyz marked this pull request as ready for review May 28, 2025 12:58
Comment on lines +56 to +64
// The max batch size is limited by the uint32 offsets stored in
// internal/batchskl.node, DeferredBatchOp, and flushableBatchEntry.
//
// Pebble limits the size to MaxUint32 (just short of 4GB) so that the exclusive
// end of an allocation fits in uint32.
//
// On 32-bit systems, slices are naturally limited to MaxInt (just short of
// 2GB).
// see: cockroachdb/pebble.maxBatchSize

This comment was copied and pasted from the cockroachdb/pebble repo.
It is a bit confusing here though: for example, DeferredBatchOp is defined in cockroachdb/pebble, and geth doesn't reference it directly.
I had a hard time tracking this down, and I couldn't find internal/batchskl.node in cockroachdb/pebble at all.

// On 32-bit systems, slices are naturally limited to MaxInt (just short of
// 2GB).
// see: cockroachdb/pebble.maxBatchSize
maxBatchSize = (1<<31)<<(^uint(0)>>63) - 1

This was inspired by cockroachdb/pebble, but as written here it is quite difficult to understand why it is defined the way it is.

How about adding the comment that cockroachdb/pebble has here:

	// oneIf64Bit is 1 on 64-bit platforms and 0 on 32-bit platforms.
	oneIf64Bit = ^uint(0) >> 63

	// MaxUint32OrInt returns min(MaxUint32, MaxInt), i.e
	// - MaxUint32 on 64-bit platforms;
	// - MaxInt on 32-bit platforms.
	// It is used when slices are limited to Uint32 on 64-bit platforms (the
	// length limit for slices is naturally MaxInt on 32-bit platforms).
	MaxUint32OrInt = (1<<31)<<oneIf64Bit - 1

And also explicitly comment that it was inspired by cockroachdb/pebble.

func (b *batch) Put(key, value []byte) error {
// The size increase is argument in call to cockroachdb/pebble.Batch.grow in cockroachdb/pebble.Batch.prepareDeferredKeyValueRecord. pebble.Batch.grow may panic if the batch data size plus the increase reaches cockroachdb/pebble.maxBatchSize

Suggested change
// The size increase is argument in call to cockroachdb/pebble.Batch.grow in cockroachdb/pebble.Batch.prepareDeferredKeyValueRecord. pebble.Batch.grow may panic if the batch data size plus the increase reaches cockroachdb/pebble.maxBatchSize
// The size increase is an argument to the cockroachdb/pebble.Batch.grow call in cockroachdb/pebble.Batch.prepareDeferredKeyValueRecord. pebble.Batch.grow may panic if the batch data size plus the increase reaches cockroachdb/pebble.maxBatchSize

func (b *batch) Put(key, value []byte) error {
// The size increase is argument in call to cockroachdb/pebble.Batch.grow in cockroachdb/pebble.Batch.prepareDeferredKeyValueRecord. pebble.Batch.grow may panic if the batch data size plus the increase reaches cockroachdb/pebble.maxBatchSize
sizeIncrease := 1 + uint64(2*binary.MaxVarintLen32) + uint64(len(key)) + uint64(len(value))

I understand the comment before this line, but I don't understand why we need to add 1 + uint64(2*binary.MaxVarintLen32) to sizeIncrease.
I don't know how a batch record is encoded before being written to the DB, so I don't understand why sizeIncrease is computed the way it is.
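
For context, a rough sketch of where that bound appears to come from, assuming the computation mirrors pebble's batch record encoding (a 1-byte kind tag, a varint-encoded key length, the key bytes, a varint-encoded value length, and the value bytes; a varint of a 32-bit length takes at most binary.MaxVarintLen32 bytes). The helper name below is just a placeholder, not part of the PR:

	// worstCasePutRecordSize returns an upper bound on the encoded size of a
	// single set-record in a pebble batch: 1 kind byte, plus worst-case varint
	// encodings of the key and value lengths, plus the key and value bytes.
	func worstCasePutRecordSize(key, value []byte) uint64 {
		return 1 + uint64(2*binary.MaxVarintLen32) + uint64(len(key)) + uint64(len(value))
	}

For a delete record there is no value, which would explain the 1 + uint64(binary.MaxVarintLen32) + uint64(len(key)) bound used in batch.Delete below.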

Comment on lines +682 to +688
// The size increase is argument in call to cockroachdb/pebble.Batch.grow in cockroachdb/pebble.Batch.prepareDeferredKeyRecord. pebble.Batch.grow may panic if the batch data size plus the increase reaches cockroachdb/pebble.maxBatchSize
sizeIncrease := 1 + uint64(binary.MaxVarintLen32) + uint64(len(key))
// check if we fit within maxBatchSize
if uint64(b.b.Len())+sizeIncrease >= maxBatchSize {
// return an error instead of letting b.b.Delete to panic
return ethdb.ErrBatchTooLarge
}

This same code block is also defined in batch.Put; how about creating a function to abstract it? A possible shape is sketched below.
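
For example, a small helper on batch (the name checkBatchSize is just a placeholder, not part of the PR):

	// checkBatchSize returns ethdb.ErrBatchTooLarge if adding sizeIncrease
	// bytes would push the underlying pebble batch past maxBatchSize.
	func (b *batch) checkBatchSize(sizeIncrease uint64) error {
		if uint64(b.b.Len())+sizeIncrease >= maxBatchSize {
			return ethdb.ErrBatchTooLarge
		}
		return nil
	}

Both Put and Delete could then call it before touching the underlying batch.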

CleanCacheSize int // Maximum memory allowance (in bytes) for caching clean nodes
}

// Defaults is the default setting for database if it's not specified.
// Notably, clean cache is disabled explicitly,
var Defaults = &Config{
// Arbitrum:
// default zeroes used to prevent need for correct initialization in all places used upstream

I'm not sure I correctly understand this comment.
By default, Config will be initialized with a zero IdealCapBatchSize anyway, without explicitly setting it to zero.

if errors.Is(err, ethdb.ErrBatchTooLarge) {
log.Warn("Pebble batch limit reached in hashdb Cap operation, flushing batch. Consider setting ideal cap batch size to a lower value.", "pebbleError", err)
// flush batch & retry the write
if err = batch.Write(); err != nil {

Should we add a log.Error("Failed to write flush list to disk", "err", err) here before returning? The error returned by the batch.Write() call in the next code block has a log.Error before the error is returned; something like the sketch below.
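
Roughly (the return err is assumed from the surrounding control flow, which is cut off in the quoted snippet):

	if err = batch.Write(); err != nil {
		log.Error("Failed to write flush list to disk", "err", err) // suggested addition
		return err
	}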

Comment on lines +360 to +363
idealBatchSize := uint(db.config.IdealCapBatchSize)
if idealBatchSize == 0 {
idealBatchSize = uint(ethdb.IdealBatchSize)
}

Instead of checking this in every Cap and Commit call, how about setting db.config.IdealCapBatchSize to ethdb.IdealBatchSize when db.config is created, in case TrieCapBatchSize is zero? See the sketch below.
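
A sketch of the suggestion; exactly where the hashdb config is derived from CacheConfig is assumed here:

	// Resolve the default once when the config is built instead of in every
	// Cap/Commit call.
	if config.IdealCapBatchSize == 0 {
		config.IdealCapBatchSize = ethdb.IdealBatchSize
	}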

Labels: CLA signed
Projects: None yet
2 participants