Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

db: consider adding IterContext #4115

Open
jbowens opened this issue Oct 26, 2024 · 1 comment
Open

db: consider adding IterContext #4115

jbowens opened this issue Oct 26, 2024 · 1 comment

Comments

@jbowens
Copy link
Collaborator

jbowens commented Oct 26, 2024

The iterator stack has a host of configuration and global state. Today a lot of this configuration exists outside the base.InternalIterator interface. This forces iterator construction to propagate this configuration, every internal iterator in the stack to save a copy, and every internal iterator to zero this state on Close. Iterator.Close has an unfortunately high cost on workloads that use an iterator for a one-time seek (#4049) in part due to zeroing these structures.

We might reduce CPU and reduce the memory used by an iterator by propagating some of this information down the iterator stack during an operation by pointer. For example, we could define a base.IterContext struct:

type IterContext struct {
	Compare    Compare
	Equal      Equal
	Split      Split
	LowerBound []byte
	UpperBound []byte
	Prefix     []byte
	Stats      InternalIteratorStats
	Context    context.Context
        Logger     Logger
        Comparer   Comparer
}

Every iteration method of the InternalIterator interface could accept a *IterContext as the first parameter, which would point into a field of the pebble.Iterator struct when iterating using a top-level pebble.Iterator. With a quick survey, it looks like we would be able to remove:

  • mergingIter:
    • logger (16 bytes)
    • split (8 bytes)
    • prefix (24 bytes)
    • lower (24 bytes)
    • upper (24 bytes)
    • stats (8 bytes)
  • levelIter:
    • ctx (16 bytes)
    • logger (16 bytes)
    • comparer (8 bytes)
    • cmp (8 bytes)
    • split (8 bytes)
    • lower (24 bytes)
    • upper (24 bytes)
    • prefix (24 bytes)
  • singleLevelIterator:
    • ctx (16 bytes)
    • cmp (8 bytes)
    • lower (24 bytes)
    • upper (24 bytes)
  • colblk.DataBlockIter
    • cmp (8 bytes)
    • split (8 bytes)
  • colblk.IndexIter
    • cmp (8 bytes)
    • split (8 bytes)
  • arenaskl.Iterator
    • lower (24 bytes)
    • upper (24 bytes)

There are some tricky bits where global state diverges from an individual iterator's state. For example the levelIter avoids propagating bounds if a sstable falls wholly within the bounds. I think we'd still see benefit because, for example, some iterators maintain both global-level bounds and bounds to use within a more constrained iteration context.

Additionally, we might see some benefit to keeping frequently accessed fields (cmp, split) shared by improving cache locality.

Also propagating the iteration prefix through Next will facilitate #3794, because today leaf iterators don't retain the iteration prefix.

Jira issue: PEBBLE-289

@RaduBerinde
Copy link
Member

Maybe we can somehow tie the valueBlockReader allocation to the IterContext too - you would only be able to read a lazy value if the IterContext is still valid.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Status: Backlog
Development

No branches or pull requests

2 participants