@Sacul0457
Contributor

Sacul0457 commented Nov 17, 2025

Summary

This PR attempts to optimise utils.find and utils.as_chunks.

You can read more here, but just to summarise:

  • Most workloads are small when it comes to utils.find, but it is used a lot in discord.py and in user code, so speeding it up still helps overall. Across all tested Python versions (3.8 to 3.15), the benchmarks come out around 2% to 20% faster.

Also, for anyone seeing this, feel free to benchmark these changes and see if your results match. Benchmarks can be found here. It is greatly appreciated, thanks!
Note: the utils.as_chunks implementation changed, so use the new one in this PR when benchmarking.

Checklist

  • If code changes were made then they have been tested.
    • I have updated the documentation to reflect the changes.
  • This PR fixes an issue.
  • This PR adds something new (e.g. new method or parameters).
  • This PR is a breaking change (e.g. methods or parameters removed/renamed)
  • This PR is not a code change (e.g. documentation, README, ...)

@Rapptz
Owner

Rapptz commented Nov 17, 2025

Instead of doing hasattr, you can try getattr(x, attr, None) is not None. Usually this is insufficient as a check since None can be a valid value, but for both of the attributes we're checking, a value of None breaks the invariant, so it might be worth doing. Chances are ultimately the attribute checking would be slower though.
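A minimal sketch of the two checks being compared here. The thread does not name the attributes, so `__len__` and `__getitem__` below are assumptions picked for illustration, and `has_attrs_hasattr`, `has_attrs_getattr`, and `NoLen` are hypothetical names that do not come from the PR.

```python
from typing import Any

# Assumed attribute names; the thread does not say which two were checked.
ATTRS = ("__len__", "__getitem__")


def has_attrs_hasattr(obj: Any) -> bool:
    # True when every attribute exists, even if its value is None.
    return all(hasattr(obj, name) for name in ATTRS)


def has_attrs_getattr(obj: Any) -> bool:
    # True only when every attribute exists and is not None.
    return all(getattr(obj, name, None) is not None for name in ATTRS)


class NoLen:
    # Setting a special method to None marks the operation as unsupported.
    __len__ = None

    def __getitem__(self, index: int) -> int:
        return index


print(has_attrs_hasattr([1, 2, 3]), has_attrs_getattr([1, 2, 3]))  # True True
print(has_attrs_hasattr(NoLen()), has_attrs_getattr(NoLen()))      # True False
print(has_attrs_getattr({"a": 1}))                                 # True
```

Note that a dict passes both checks under these assumed attributes even though it cannot be sliced, so a pure attribute check may accept more types than intended.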

@Sacul0457
Contributor Author

Chances are ultimately the attribute checking would be slower though.

Yeah, unfortunately it is. I think sticking with isinstance(iterator, collections.abc.Sequence) is fine. On smaller benchmarks it's still around 60% faster, and it's more readable too.
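For reference, a small sketch of which common built-in types satisfy the `collections.abc.Sequence` check. The helper name `takes_fast_path` is hypothetical and is only used here to label the result.

```python
from collections.abc import Sequence


def takes_fast_path(obj: object) -> bool:
    # Hypothetical helper: mirrors the isinstance check discussed above.
    return isinstance(obj, Sequence)


print(takes_fast_path([1, 2, 3]))                # True  (list)
print(takes_fast_path((1, 2, 3)))                # True  (tuple)
print(takes_fast_path(range(5)))                 # True  (range)
print(takes_fast_path("abc"))                    # True  (str)
print(takes_fast_path({1, 2}))                   # False (set)
print(takes_fast_path({"a": 1}))                 # False (dict)
print(takes_fast_path(x for x in range(3)))      # False (generator)
```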

@mikeshardmind
Contributor

don't have a quiet local machine to benchmark on right now, but I suspect using itertools.islice will be better for as_chunks:

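from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def _chunk(iterable: Iterable[T], max_size: int) -> Iterator[List[T]]:
    iterator = iter(iterable)
    # islice pulls at most max_size items; an empty batch means the iterator is exhausted
    while batch := list(islice(iterator, max_size)):
        yield batch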

@Sacul0457
Contributor Author

Yes, using itertools.islice does seem to be faster than the current slow path. However, normal slicing still seems to be 30% - 50% faster than itertools.islice. So I think we can combine the two ideas and replace the current slow path with itertools.islice, like so:

from collections.abc import Sequence
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def _chunk(iterator: Iterable[T], max_size: int) -> Iterator[List[T]]:
    # Fast path: sequences can be sliced directly, which is much faster
    if isinstance(iterator, Sequence):
        for i in range(0, len(iterator), max_size):
            yield list(iterator[i : i + max_size])
    else:
        # Slow path: pull up to max_size items at a time from any other iterable
        iterator = iter(iterator)
        while batch := list(islice(iterator, max_size)):
            yield batch

This newer version is around 84% - 88% faster on the fast path, and 66% - 78% faster on the slow path!
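For anyone who wants to check these numbers locally, here is a rough timing sketch using `timeit`. It is not the benchmark suite linked in the PR; the function names `chunk_by_slicing` and `chunk_by_islice`, the data size, and the chunk size are all assumptions chosen for illustration, and results will vary by machine and Python version.

```python
import timeit
from itertools import islice
from typing import Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunk_by_slicing(items: Sequence[T], max_size: int) -> Iterator[List[T]]:
    # Fast-path candidate: slice the sequence directly.
    for i in range(0, len(items), max_size):
        yield list(items[i : i + max_size])


def chunk_by_islice(items: Iterable[T], max_size: int) -> Iterator[List[T]]:
    # Slow-path candidate: works on any iterable.
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, max_size))
        if not batch:
            break
        yield batch


data = list(range(10_000))

# Both strategies must agree before their timings are worth comparing.
assert list(chunk_by_slicing(data, 100)) == list(chunk_by_islice(data, 100))

for func in (chunk_by_slicing, chunk_by_islice):
    seconds = timeit.timeit(lambda: list(func(data, 100)), number=200)
    print(f"{func.__name__}: {seconds:.4f}s for 200 runs")
```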

@Rapptz
Owner

Rapptz commented Nov 18, 2025

FYI, the walrus operator is banned in this codebase.

@Sacul0457
Contributor Author

Oh, alright. This should be the equivalent, I believe? I don't use the walrus operator often.

iterator = iter(iterator)
while True:
    batch = list(islice(iterator, max_size))
    if not batch:
        break
    yield batch
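Putting the pieces of this thread together, a self-contained sketch of the chunking helper without the walrus operator might look like the following. The name `chunk` is a stand-in, and this is an illustration of the approach discussed here rather than the exact code that was merged.

```python
from collections.abc import Sequence
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunk(iterable: Iterable[T], max_size: int) -> Iterator[List[T]]:
    if isinstance(iterable, Sequence):
        # Fast path: slice the sequence in steps of max_size.
        for i in range(0, len(iterable), max_size):
            yield list(iterable[i : i + max_size])
    else:
        # Slow path: take up to max_size items until the iterator is exhausted.
        iterator = iter(iterable)
        while True:
            batch = list(islice(iterator, max_size))
            if not batch:
                break
            yield batch


print(list(chunk([1, 2, 3, 4, 5], 2)))            # [[1, 2], [3, 4], [5]]
print(list(chunk((n for n in range(5)), 2)))      # [[0, 1], [2, 3], [4]]
print(list(chunk([], 2)))                         # []
```

The last chunk may be shorter than `max_size`, and an empty input yields no chunks on either path.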

@Rapptz
Owner

Rapptz commented Nov 18, 2025

Yeah that works too.

@Rapptz Rapptz merged commit 9be91cb into Rapptz:master Nov 19, 2025
8 checks passed