⚡️ Speed up method KwargPackComponent._run_component by 7%
#142
📄 7% (0.07x) speedup for `KwargPackComponent._run_component` in `llama-index-core/llama_index/core/query_pipeline/components/argpacks.py`
⏱️ Runtime: 653 microseconds → 612 microseconds (best of 250 runs)
📝 Explanation and details
The optimized code achieves a roughly 7% speedup (653 microseconds → 612 microseconds) through two micro-optimizations that reduce overhead in the hot loop:
Key optimizations:
- Local variable caching: `convert_fn = self.convert_fn` stores the attribute lookup in a local variable, eliminating repeated `self.convert_fn` attribute access during the loop iteration (4,135 times in the profiler results).
- Simplified iteration pattern: Changed from `for k, v in kwargs.items():` to `for k in kwargs:` with `kwargs[k]` access. This avoids the overhead of creating key-value tuples via `.items()` and unpacking them in each iteration.

Performance impact:
- Calls to `convert_fn` remain the dominant cost at ~71% of total time.
- Cases with `convert_fn` applied to multiple arguments see the largest gains (8-13% faster in many cases).

Test case analysis:
The optimization particularly benefits scenarios with:
- Multiple kwargs and a `convert_fn` (`test_multiple_kwargs_with_convert`: 13.1% faster)

The changes preserve all behavior and maintain the same in-place mutation of the kwargs dictionary, making this a safe performance enhancement for existing code.
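The two changes described above can be sketched side by side. This is a minimal illustration, not the library source: `PackSketch`, `run_original`, and `run_optimized` are hypothetical names, and the body of `run_original` is an assumption about what the method looked like before the change, reconstructed from the description.

```python
from typing import Any, Callable, Dict, Optional


class PackSketch:
    """Hypothetical stand-in for KwargPackComponent (not the library class)."""

    def __init__(self, convert_fn: Optional[Callable[[Any], Any]] = None) -> None:
        self.convert_fn = convert_fn

    def run_original(self, **kwargs: Any) -> Dict[str, Any]:
        # Assumed original shape: self.convert_fn is looked up on every
        # iteration and .items() yields (key, value) pairs.
        if self.convert_fn is not None:
            for k, v in kwargs.items():
                kwargs[k] = self.convert_fn(v)
        return {"output": kwargs}

    def run_optimized(self, **kwargs: Any) -> Dict[str, Any]:
        # Look the attribute up once, then iterate over keys only.
        convert_fn = self.convert_fn
        if convert_fn is not None:
            for k in kwargs:
                # Reassigning an existing key does not resize the dict,
                # so this is safe while iterating.
                kwargs[k] = convert_fn(kwargs[k])
        return {"output": kwargs}


comp = PackSketch(convert_fn=str)
original = comp.run_original(a=1, b=2)
optimized = comp.run_optimized(a=1, b=2)
print(original == optimized)  # True
```

Both variants rewrite values in the `kwargs` dict they received and return that same dict, which is the in-place behavior the description says is preserved.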
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
import copy
# function to test
from typing import Any, Callable, Optional

# imports
import pytest
from llama_index.core.query_pipeline.components.argpacks import KwargPackComponent

# ---------------------------
# Unit Tests for _run_component
# ---------------------------

# 1. Basic Test Cases
def test_no_kwargs_returns_empty_dict():
    # Test with no kwargs, should return {"output": {}}.
    component = KwargPackComponent()
    codeflash_output = component._run_component(); result = codeflash_output # 572ns -> 588ns (2.72% slower)

def test_single_kwarg_no_convert():
    # Test with a single kwarg, no convert_fn.
    component = KwargPackComponent()
    codeflash_output = component._run_component(foo=1); result = codeflash_output # 710ns -> 752ns (5.59% slower)

def test_multiple_kwargs_no_convert():
    # Test with multiple kwargs, no convert_fn.
    component = KwargPackComponent()
    codeflash_output = component._run_component(a=1, b="x", c=[1,2,3]); result = codeflash_output # 940ns -> 900ns (4.44% faster)

def test_single_kwarg_with_convert():
    # Test with a single kwarg and a convert_fn.
    component = KwargPackComponent(convert_fn=lambda x: x*2)
    codeflash_output = component._run_component(foo=3); result = codeflash_output # 1.48μs -> 1.39μs (6.56% faster)

def test_multiple_kwargs_with_convert():
    # Test with multiple kwargs and a convert_fn.
    component = KwargPackComponent(convert_fn=str)
    codeflash_output = component._run_component(a=1, b=2); result = codeflash_output # 1.59μs -> 1.41μs (13.1% faster)

def test_convert_fn_returns_different_type():
    # Test with convert_fn returning a different type.
    component = KwargPackComponent(convert_fn=lambda x: [x])
    codeflash_output = component._run_component(a=7); result = codeflash_output # 1.26μs -> 1.26μs (0.158% slower)

# 2. Edge Test Cases
def test_convert_fn_is_none():
    # Test explicitly passing convert_fn=None.
    component = KwargPackComponent(convert_fn=None)
    codeflash_output = component._run_component(x=1); result = codeflash_output # 724ns -> 749ns (3.34% slower)

def test_kwargs_with_none_values():
    # Test kwargs where values are None.
    component = KwargPackComponent()
    codeflash_output = component._run_component(a=None, b=2); result = codeflash_output # 804ns -> 776ns (3.61% faster)

def test_convert_fn_returns_none():
    # Test convert_fn that returns None.
    component = KwargPackComponent(convert_fn=lambda x: None)
    codeflash_output = component._run_component(a=1, b=2); result = codeflash_output # 1.43μs -> 1.31μs (9.72% faster)

def test_kwargs_with_mutable_objects():
    # Test with mutable objects (lists/dicts) as kwargs.
    mylist = [1,2]
    mydict = {"x": 1}
    component = KwargPackComponent()
    codeflash_output = component._run_component(l=mylist, d=mydict); result = codeflash_output # 684ns -> 794ns (13.9% slower)

def test_convert_fn_modifies_mutable_object():
    # Test convert_fn that mutates the input object.
    def append_42(x):
        if isinstance(x, list):
            x.append(42)
        return x
    mylist = [1,2]
    component = KwargPackComponent(convert_fn=append_42)
    codeflash_output = component._run_component(l=mylist); result = codeflash_output # 1.57μs -> 1.53μs (2.82% faster)

def test_kwargs_with_special_keys():
    # Test with keys that resemble Python keywords (trailing underscore) or special names.
    component = KwargPackComponent()
    codeflash_output = component._run_component(class_=1, def_=2, lambda_=3); result = codeflash_output # 929ns -> 950ns (2.21% slower)

def test_convert_fn_raises_exception():
    # Test convert_fn that raises an exception.
    def bad_fn(x):
        raise ValueError("bad!")
    component = KwargPackComponent(convert_fn=bad_fn)
    with pytest.raises(ValueError):
        component._run_component(a=1) # 1.79μs -> 1.59μs (12.3% faster)

def test_kwargs_are_not_shared_between_calls():
    # Ensure that kwargs from one call do not leak into another.
    component = KwargPackComponent()
    codeflash_output = component._run_component(a=1); result1 = codeflash_output # 744ns -> 738ns (0.813% faster)
    codeflash_output = component._run_component(b=2); result2 = codeflash_output # 349ns -> 334ns (4.49% faster)

def test_convert_fn_is_identity():
    # Test with convert_fn as identity function.
    component = KwargPackComponent(convert_fn=lambda x: x)
    codeflash_output = component._run_component(a=5, b=6); result = codeflash_output # 1.45μs -> 1.29μs (12.4% faster)

def test_kwargs_with_nonstring_keys_not_possible():
    # kwargs only allows string keys, so this is not possible.
    # This test is to ensure that the function does not accept non-string keys.
    component = KwargPackComponent()
    with pytest.raises(TypeError):
        # This will raise before even entering the function
        component._run_component({1: "a"})

def test_kwargs_are_copied_not_referenced():
    # Ensure that changing the returned dict does not affect future calls.
    component = KwargPackComponent()
    out1 = component._run_component(a=1)["output"] # 712ns -> 760ns (6.32% slower)
    out1["a"] = 999
    out2 = component._run_component(a=1)["output"] # 414ns -> 395ns (4.81% faster)

def test_convert_fn_with_side_effects():
    # Test convert_fn that counts how many times it's called.
    calls = []
    def counting_fn(x):
        calls.append(x)
        return x
    component = KwargPackComponent(convert_fn=counting_fn)
    codeflash_output = component._run_component(a=1, b=2); result = codeflash_output # 1.55μs -> 1.52μs (2.30% faster)

# 3. Large Scale Test Cases
def test_large_number_of_kwargs_no_convert():
    # Test with a large number of kwargs, no convert_fn.
    component = KwargPackComponent()
    kwargs = {f"k{i}": i for i in range(1000)}
    codeflash_output = component._run_component(**kwargs); result = codeflash_output # 42.3μs -> 41.9μs (0.997% faster)

def test_large_number_of_kwargs_with_convert():
    # Test with a large number of kwargs and a convert_fn.
    component = KwargPackComponent(convert_fn=lambda x: x+1)
    kwargs = {f"k{i}": i for i in range(1000)}
    expected = {k: v+1 for k,v in kwargs.items()}
    codeflash_output = component._run_component(**kwargs); result = codeflash_output # 128μs -> 117μs (8.88% faster)

def test_large_nested_structures_with_convert():
    # Test with large nested structures and a convert_fn.
    def flatten(x):
        if isinstance(x, list):
            return sum(x)
        return x
    component = KwargPackComponent(convert_fn=flatten)
    kwargs = {f"l{i}": [i, i+1, i+2] for i in range(500)}
    expected = {k: sum(v) for k,v in kwargs.items()}
    codeflash_output = component._run_component(**kwargs); result = codeflash_output # 80.9μs -> 75.7μs (6.94% faster)

def test_performance_large_scale():
    # Test that function does not degrade with large inputs (sanity check).
    import time
    component = KwargPackComponent(convert_fn=lambda x: x)
    kwargs = {f"k{i}": i for i in range(1000)}
    start = time.time()
    codeflash_output = component._run_component(**kwargs); result = codeflash_output # 106μs -> 98.1μs (8.77% faster)
    end = time.time()

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
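The per-call annotations in these tests (for example `106μs -> 98.1μs`) are runtime comparisons between the original and optimized code. A rough way to reproduce that kind of best-of-N comparison locally is sketched below with the standard-library `timeit` module. The functions `pack_items`, `pack_keys`, and `best_of` are hypothetical stand-ins written for this sketch, not part of llama-index or Codeflash, and absolute timings will differ by machine and Python version.

```python
import timeit
from typing import Any, Callable, Dict


def pack_items(convert_fn: Callable[[Any], Any], kwargs: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical stand-in for the original loop over .items().
    for k, v in kwargs.items():
        kwargs[k] = convert_fn(v)
    return kwargs


def pack_keys(convert_fn: Callable[[Any], Any], kwargs: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical stand-in for the optimized loop over keys.
    for k in kwargs:
        kwargs[k] = convert_fn(kwargs[k])
    return kwargs


def best_of(fn: Callable[..., Dict[str, Any]], runs: int = 250) -> float:
    """Return the fastest single-call time in seconds over `runs` repetitions."""
    data = {f"k{i}": i for i in range(1000)}
    # Copy the dict on each call so every run starts from the same input.
    timings = timeit.repeat(lambda: fn(lambda x: x, dict(data)), number=1, repeat=runs)
    return min(timings)


items_time = best_of(pack_items)
keys_time = best_of(pack_keys)
print(f"items(): {items_time * 1e6:.1f} us, keys: {keys_time * 1e6:.1f} us")
```

Taking the minimum over many runs reduces noise from other processes, but differences of a few percent on sub-microsecond calls can still fall within measurement error.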
from typing import Any, Callable, Optional

# imports
import pytest # used for our unit tests
from llama_index.core.query_pipeline.components.argpacks import KwargPackComponent

# unit tests

# --------- Basic Test Cases ---------
def test_basic_no_kwargs():
    # Test with no kwargs, should return empty dict in output
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(); result = codeflash_output # 588ns -> 535ns (9.91% faster)

def test_basic_single_kwarg():
    # Test with a single kwarg
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(foo=42); result = codeflash_output # 761ns -> 753ns (1.06% faster)

def test_basic_multiple_kwargs():
    # Test with multiple kwargs
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(a=1, b=2, c=3); result = codeflash_output # 871ns -> 907ns (3.97% slower)

def test_basic_convert_fn_applied():
    # Test with convert_fn that adds 1 to each value
    comp = KwargPackComponent(convert_fn=lambda x: x + 1)
    codeflash_output = comp._run_component(a=1, b=2); result = codeflash_output # 1.50μs -> 1.36μs (10.4% faster)

def test_basic_convert_fn_stringify():
    # Test with convert_fn that stringifies values
    comp = KwargPackComponent(convert_fn=str)
    codeflash_output = comp._run_component(foo=10, bar=True); result = codeflash_output # 1.66μs -> 1.54μs (7.32% faster)

# --------- Edge Test Cases ---------
def test_edge_empty_string_key():
    # Keys can't be empty in kwargs but can be tested by passing a dict directly
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(**{"": "empty"}); result = codeflash_output # 788ns -> 748ns (5.35% faster)

def test_edge_special_char_keys():
    # Test with keys containing special characters
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(**{"@key!": 123, "sp ace": "val"}); result = codeflash_output # 762ns -> 741ns (2.83% faster)

def test_edge_none_value():
    # Test with None value
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(foo=None); result = codeflash_output # 758ns -> 760ns (0.263% slower)

def test_edge_convert_fn_returns_none():
    # convert_fn returns None for all values
    comp = KwargPackComponent(convert_fn=lambda x: None)
    codeflash_output = comp._run_component(a=1, b=2); result = codeflash_output # 1.47μs -> 1.35μs (8.43% faster)

def test_edge_convert_fn_raises():
    # convert_fn raises an exception for one value
    def fn(x):
        if x == 'bad':
            raise ValueError("bad value")
        return x
    comp = KwargPackComponent(convert_fn=fn)
    with pytest.raises(ValueError):
        comp._run_component(a='bad', b='good') # 1.81μs -> 1.67μs (8.55% faster)

def test_edge_mutable_values():
    # Test with mutable values (lists, dicts)
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(lst=[1,2], dct={'x': 1}); result = codeflash_output # 806ns -> 790ns (2.03% faster)

def test_edge_convert_fn_mutates_object():
    # convert_fn mutates object in place
    def fn(x):
        if isinstance(x, list):
            x.append('mutated')
        return x
    comp = KwargPackComponent(convert_fn=fn)
    val = [1]
    codeflash_output = comp._run_component(foo=val); result = codeflash_output # 1.69μs -> 1.56μs (7.74% faster)

def test_edge_convert_fn_on_empty_kwargs():
    # convert_fn should not be called if no kwargs
    called = []
    def fn(x):
        called.append(x)
        return x
    comp = KwargPackComponent(convert_fn=fn)
    codeflash_output = comp._run_component(); result = codeflash_output # 700ns -> 648ns (8.02% faster)

def test_edge_convert_fn_returns_key_type():
    # convert_fn returns a type as value
    comp = KwargPackComponent(convert_fn=lambda x: type(x))
    codeflash_output = comp._run_component(a=1); result = codeflash_output # 1.40μs -> 1.37μs (2.33% faster)

def test_edge_convert_fn_is_identity():
    # convert_fn is identity function
    comp = KwargPackComponent(convert_fn=lambda x: x)
    codeflash_output = comp._run_component(foo="bar"); result = codeflash_output # 1.28μs -> 1.22μs (4.66% faster)

# --------- Large Scale Test Cases ---------
def test_large_scale_1000_kwargs():
    # Test with 1000 kwargs
    comp = KwargPackComponent()
    kwargs = {f"key{i}": i for i in range(1000)}
    codeflash_output = comp._run_component(**kwargs); result = codeflash_output # 40.6μs -> 41.0μs (0.886% slower)

def test_large_scale_convert_fn_on_1000_kwargs():
    # Test with 1000 kwargs and convert_fn doubles values
    comp = KwargPackComponent(convert_fn=lambda x: x * 2)
    kwargs = {f"key{i}": i for i in range(1000)}
    expected = {k: v * 2 for k, v in kwargs.items()}
    codeflash_output = comp._run_component(**kwargs); result = codeflash_output # 127μs -> 117μs (7.75% faster)

def test_large_scale_kwargs_with_large_values():
    # Test with kwargs where each value is a large list
    comp = KwargPackComponent()
    kwargs = {f"key{i}": [i] * 100 for i in range(10)}
    codeflash_output = comp._run_component(**kwargs); result = codeflash_output # 1.49μs -> 1.54μs (2.86% slower)

def test_large_scale_convert_fn_expensive():
    # convert_fn is expensive (simulate with sum of list)
    comp = KwargPackComponent(convert_fn=lambda x: sum(x) if isinstance(x, list) else x)
    kwargs = {f"key{i}": [i] * 10 for i in range(100)}
    expected = {k: sum(v) for k, v in kwargs.items()}
    codeflash_output = comp._run_component(**kwargs); result = codeflash_output # 19.7μs -> 18.4μs (7.13% faster)

def test_large_scale_convert_fn_stringify_keys_and_values():
    # convert_fn stringifies values, keys are already strings
    comp = KwargPackComponent(convert_fn=str)
    kwargs = {str(i): i for i in range(500)}
    expected = {str(i): str(i) for i in range(500)}
    codeflash_output = comp._run_component(**kwargs); result = codeflash_output # 60.7μs -> 55.8μs (8.64% faster)

# --------- Miscellaneous Robustness ---------
def test_robustness_kwargs_types():
    # Test with various types as values
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(
        int_val=1,
        float_val=2.5,
        str_val="abc",
        bool_val=True,
        none_val=None,
        tuple_val=(1,2),
        list_val=[1,2],
        dict_val={'a': 1}
    ); result = codeflash_output # 1.29μs -> 1.31μs (1.98% slower)
    expected = {
        "int_val": 1,
        "float_val": 2.5,
        "str_val": "abc",
        "bool_val": True,
        "none_val": None,
        "tuple_val": (1,2),
        "list_val": [1,2],
        "dict_val": {'a': 1}
    }

def test_robustness_convert_fn_on_various_types():
    # convert_fn returns type name
    comp = KwargPackComponent(convert_fn=lambda x: type(x).__name__)
    codeflash_output = comp._run_component(
        int_val=1,
        float_val=2.5,
        str_val="abc",
        bool_val=True,
        none_val=None,
        tuple_val=(1,2),
        list_val=[1,2],
        dict_val={'a': 1}
    ); result = codeflash_output # 3.64μs -> 3.48μs (4.48% faster)
    expected = {
        "int_val": "int",
        "float_val": "float",
        "str_val": "str",
        "bool_val": "bool",
        "none_val": "NoneType",
        "tuple_val": "tuple",
        "list_val": "list",
        "dict_val": "dict"
    }

def test_robustness_convert_fn_is_lambda_with_side_effect():
    # convert_fn increments a counter
    counter = {"count": 0}
    def fn(x):
        counter["count"] += 1
        return x
    comp = KwargPackComponent(convert_fn=fn)
    codeflash_output = comp._run_component(a=1, b=2, c=3); result = codeflash_output # 1.91μs -> 1.92μs (0.936% slower)

def test_robustness_convert_fn_is_none():
    # convert_fn is explicitly None
    comp = KwargPackComponent(convert_fn=None)
    codeflash_output = comp._run_component(a=1); result = codeflash_output # 719ns -> 743ns (3.23% slower)

def test_robustness_kwargs_are_not_mutated_outside():
    # Ensure original kwargs dict is not mutated outside
    comp = KwargPackComponent(convert_fn=lambda x: x + 1)
    kwargs = {"a": 1, "b": 2}
    codeflash_output = comp._run_component(**kwargs); result = codeflash_output # 1.42μs -> 1.35μs (5.42% faster)

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
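The in-place mutation mentioned in the explanation and the expectation in `test_robustness_kwargs_are_not_mutated_outside` are compatible because Python builds a fresh dict for `**kwargs` on every call. The sketch below illustrates this with a hypothetical function, `mutate_in_place`, that stands in for the method and is not library code.

```python
from typing import Any, Dict


def mutate_in_place(**kwargs: Any) -> Dict[str, Any]:
    # Hypothetical stand-in: rewrites values in the dict created for this call.
    for k in kwargs:
        kwargs[k] = kwargs[k] + 1
    return {"output": kwargs}


caller_kwargs = {"a": 1, "b": 2}
result = mutate_in_place(**caller_kwargs)

print(caller_kwargs)     # {'a': 1, 'b': 2}
print(result["output"])  # {'a': 2, 'b': 3}
```

Only the top-level dict is copied by the unpacking. Mutable values such as lists are still shared with the caller, which is why a `convert_fn` that appends to a list is visible outside the call.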
To edit these changes, `git checkout codeflash/optimize-KwargPackComponent._run_component-mhvj6ot1` and push.