⚡️ Speed up function _apply_transforms by 42%
#628
📄 42% (0.42x) speedup for `_apply_transforms` in `marimo/_plugins/ui/_impl/dataframes/transforms/apply.py`
⏱️ Runtime: 15.2 microseconds → 10.8 microseconds (best of 48 runs)
📝 Explanation and details
The optimized code achieves a 41% speedup by replacing a linear chain of 12 `if` statements with a single dictionary lookup, eliminating expensive repeated identity comparisons.

Key Optimizations:
- Dictionary Dispatch: The original code used a chain of `if transform.type is TransformType.X` statements that required up to 12 identity comparisons per call. The optimized version uses a precomputed dictionary, `_transform_type_to_handler_method`, that provides O(1) lookup time regardless of transform type.
- Reduced Branching: Instead of 12 conditional branches, there is now one dictionary lookup followed by a single `getattr()` call. This avoids evaluating the branches one after another on every call.
- Attribute Caching: The `transforms.transforms` list is cached as `transforms_list` to avoid repeated attribute lookups in the loop.

Performance Impact:
- The `_handle` function's total time dropped from 211µs to 107µs (49% faster).
- The dictionary lookup (`method_name = _transform_type_to_handler_method.get(transform.type)`) takes only 25µs, versus 140µs for the original chain of comparisons.

Hot Path Benefits:
Based on the function references, `_apply_transforms` is called from the `apply()` method in dataframe transformation pipelines, potentially processing multiple transforms per operation. This optimization will have compounding benefits when processing batches of transforms, as each `_handle` call is now significantly faster.

The optimization is particularly effective for transforms later in the enum sequence (like `UNIQUE`, `EXPAND_DICT`) that previously required checking all preceding conditions.

✅ Correctness verification report:
⚙️ Existing Unit Tests and Runtime
🌀 Generated Regression Tests and Runtime
To edit these changes, `git checkout codeflash/optimize-_apply_transforms-mhwuuqp1` and push.
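
For readers who want to see the shape of the change described under Key Optimizations, here is a minimal sketch of an `if` chain next to dictionary dispatch. It is not the marimo source: the three-member `TransformType`, the `Handler`, `Transform`, and `Transformations` classes, the `handle_*` method names, and the string "dataframes" are all hypothetical stand-ins. Only the names `_transform_type_to_handler_method` and `transforms_list` are taken from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class TransformType(Enum):
    # Hypothetical subset; the real enum is described as having 12 members.
    FILTER_ROWS = "filter_rows"
    UNIQUE = "unique"
    EXPAND_DICT = "expand_dict"


@dataclass
class Transform:
    type: TransformType


@dataclass
class Transformations:
    transforms: list


class Handler:
    # Hypothetical handler: each method wraps the input in a label so the
    # order of application is visible in the result.
    def handle_filter_rows(self, df, transform):
        return f"filter_rows({df})"

    def handle_unique(self, df, transform):
        return f"unique({df})"

    def handle_expand_dict(self, df, transform):
        return f"expand_dict({df})"


def handle_chain(df, handler, transform):
    # Before: identity comparisons are evaluated in order until one matches.
    if transform.type is TransformType.FILTER_ROWS:
        return handler.handle_filter_rows(df, transform)
    if transform.type is TransformType.UNIQUE:
        return handler.handle_unique(df, transform)
    if transform.type is TransformType.EXPAND_DICT:
        return handler.handle_expand_dict(df, transform)
    raise ValueError(f"Unknown transform type: {transform.type}")


# After: a module-level mapping built once, from enum member to method name.
_transform_type_to_handler_method = {
    TransformType.FILTER_ROWS: "handle_filter_rows",
    TransformType.UNIQUE: "handle_unique",
    TransformType.EXPAND_DICT: "handle_expand_dict",
}


def handle_dispatch(df, handler, transform):
    # One dictionary lookup plus one getattr(), whatever the transform type.
    method_name = _transform_type_to_handler_method.get(transform.type)
    if method_name is None:
        raise ValueError(f"Unknown transform type: {transform.type}")
    return getattr(handler, method_name)(df, transform)


def apply_transforms(df, handler, transforms):
    # Read the attribute once instead of on every loop iteration.
    transforms_list = transforms.transforms
    for transform in transforms_list:
        df = handle_dispatch(df, handler, transform)
    return df
```

In this sketch both `handle_chain` and `handle_dispatch` return the same result for every member of the hypothetical enum; the difference is only in how many comparisons run before the handler is found. Storing method names rather than bound methods keeps the mapping independent of any particular handler instance, at the cost of one `getattr()` per call.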