
@codeflash-ai codeflash-ai bot commented Nov 12, 2025

📄 7% (0.07x) speedup for KwargPackComponent._run_component in llama-index-core/llama_index/core/query_pipeline/components/argpacks.py

⏱️ Runtime: 653 microseconds → 612 microseconds (best of 250 runs)

📝 Explanation and details

The optimized code achieves a roughly 7% speedup (653 μs to 612 μs) through two key micro-optimizations that reduce overhead in the hot loop:

Key optimizations:

  1. Local variable caching: convert_fn = self.convert_fn stores the result of the attribute lookup in a local variable, so self.convert_fn is resolved once per call instead of once per loop iteration (4,135 iterations in the profiler results).

  2. Simplified iteration pattern: Changed from for k, v in kwargs.items(): to for k in kwargs: with kwargs[k] access. This avoids the overhead of creating key-value tuples via .items() and unpacking them in each iteration.
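The two changes can be sketched on a stand-in class. `OriginalPack` and `OptimizedPack` below are hypothetical simplifications written for illustration, not the actual `KwargPackComponent` source; the sketch assumes the method does nothing more than apply `convert_fn` to each value and wrap the result under `"output"`.

```python
from typing import Any, Callable, Dict, Optional


class OriginalPack:
    """Hypothetical stand-in for the loop before the optimization."""

    def __init__(self, convert_fn: Optional[Callable[[Any], Any]] = None) -> None:
        self.convert_fn = convert_fn

    def _run_component(self, **kwargs: Any) -> Dict[str, Any]:
        if self.convert_fn is not None:
            for k, v in kwargs.items():
                # self.convert_fn is looked up again on every iteration
                kwargs[k] = self.convert_fn(v)
        return {"output": kwargs}


class OptimizedPack(OriginalPack):
    """Hypothetical stand-in for the loop after the optimization."""

    def _run_component(self, **kwargs: Any) -> Dict[str, Any]:
        convert_fn = self.convert_fn  # single attribute lookup, cached locally
        if convert_fn is not None:
            for k in kwargs:
                # Rebinding an existing key does not resize the dict,
                # so assigning while iterating over the keys is safe.
                kwargs[k] = convert_fn(kwargs[k])
        return {"output": kwargs}


before = OriginalPack(convert_fn=str)._run_component(a=1, b=2)
after = OptimizedPack(convert_fn=str)._run_component(a=1, b=2)
print(before == after)
```

Both variants only reassign keys that already exist, which is why neither raises the "dictionary changed size during iteration" error.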

Performance impact:

  • The line profiler shows the time spent on the loop line reduced from 850,439 ns to 765,917 ns (about 10% less)
  • Function calls to convert_fn remain the dominant cost at ~71% of total time
  • Most significant gains occur in test cases with convert_fn applied to multiple arguments (8-13% faster in many cases)
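The percentages above can be reproduced from the raw numbers. The sketch below assumes the headline speedup is the time saved relative to the optimized runtime, and the loop figure is the time saved relative to the original; both helper names are made up for this example and that reading of the report is an assumption.

```python
def relative_speedup(original: float, optimized: float) -> float:
    """Time saved, expressed relative to the optimized runtime."""
    return (original - optimized) / optimized


def relative_reduction(original: float, optimized: float) -> float:
    """Fraction of the original time that was removed."""
    return (original - optimized) / original


total = relative_speedup(653, 612)            # microseconds, whole benchmark
loop = relative_reduction(850_439, 765_917)   # nanoseconds, loop line only

print(f"overall speedup: {total:.1%}")
print(f"loop reduction:  {loop:.1%}")
```

Under these assumptions the overall figure comes out just under 7% and the loop figure just under 10%, consistent with the rounded values in the report.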

Test case analysis:
The optimization particularly benefits scenarios with:

  • Multiple kwargs with conversion functions (e.g., test_multiple_kwargs_with_convert: 13.1% faster)
  • Large-scale operations with 500 to 1,000 kwargs and conversion (7-9% faster)
  • Any workload where the loop executes frequently with convert_fn enabled
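To check whether the iteration pattern matters on a given machine, a small `timeit` harness can compare the two loop shapes in isolation. This is a minimal sketch, not the benchmark Codeflash ran: `loop_items`, `loop_keys`, and `best_of` are hypothetical helpers, and the absolute numbers will vary by interpreter version and hardware.

```python
import timeit
from typing import Any, Callable, Dict


def loop_items(convert_fn: Callable[[Any], Any], **kwargs: Any) -> Dict[str, Any]:
    # Original pattern: iterate over (key, value) pairs.
    for k, v in kwargs.items():
        kwargs[k] = convert_fn(v)
    return kwargs


def loop_keys(convert_fn: Callable[[Any], Any], **kwargs: Any) -> Dict[str, Any]:
    # Optimized pattern: iterate over keys and index into the dict.
    for k in kwargs:
        kwargs[k] = convert_fn(kwargs[k])
    return kwargs


payload = {f"k{i}": i for i in range(1000)}


def best_of(fn: Callable[..., Any], repeat: int = 5, number: int = 100) -> float:
    """Best per-call time in seconds across `repeat` batches of `number` calls."""
    batches = timeit.repeat(lambda: fn(str, **payload), repeat=repeat, number=number)
    return min(batches) / number


t_items = best_of(loop_items)
t_keys = best_of(loop_keys)
print(f".items(): {t_items * 1e6:.1f} us   keys: {t_keys * 1e6:.1f} us")
```

Taking the minimum over several batches mirrors the "best of N runs" convention used in the report and reduces noise from other processes.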

The changes preserve all behavior and maintain the same in-place mutation of the kwargs dictionary, making this a safe performance enhancement for existing code.
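The in-place mutation is only visible inside the call, because `**` unpacking builds a fresh dict for the callee. The following sketch illustrates that with a hypothetical `pack` function standing in for the method; it is not the library code.

```python
from typing import Any, Dict


def pack(**kwargs: Any) -> Dict[str, Any]:
    # Hypothetical stand-in: convert values in place, then wrap them.
    for k in kwargs:
        kwargs[k] = kwargs[k] + 1
    return {"output": kwargs}


caller_kwargs = {"a": 1, "b": 2}
result = pack(**caller_kwargs)

print(result["output"])  # the converted copy
print(caller_kwargs)     # the caller's dict is unchanged
```

This is the property that tests such as `test_robustness_kwargs_are_not_mutated_outside` exercise.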

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 95 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime

import copy

# function to test
from typing import Any, Callable, Optional

# imports
import pytest
from llama_index.core.query_pipeline.components.argpacks import KwargPackComponent

# ---------------------------
# Unit Tests for _run_component
# ---------------------------

# 1. Basic Test Cases

def test_no_kwargs_returns_empty_dict():
    # Test with no kwargs, should return {"output": {}}.
    component = KwargPackComponent()
    codeflash_output = component._run_component(); result = codeflash_output  # 572ns -> 588ns (2.72% slower)

def test_single_kwarg_no_convert():
    # Test with a single kwarg, no convert_fn.
    component = KwargPackComponent()
    codeflash_output = component._run_component(foo=1); result = codeflash_output  # 710ns -> 752ns (5.59% slower)

def test_multiple_kwargs_no_convert():
    # Test with multiple kwargs, no convert_fn.
    component = KwargPackComponent()
    codeflash_output = component._run_component(a=1, b="x", c=[1,2,3]); result = codeflash_output  # 940ns -> 900ns (4.44% faster)

def test_single_kwarg_with_convert():
    # Test with a single kwarg and a convert_fn.
    component = KwargPackComponent(convert_fn=lambda x: x*2)
    codeflash_output = component._run_component(foo=3); result = codeflash_output  # 1.48μs -> 1.39μs (6.56% faster)

def test_multiple_kwargs_with_convert():
    # Test with multiple kwargs and a convert_fn.
    component = KwargPackComponent(convert_fn=str)
    codeflash_output = component._run_component(a=1, b=2); result = codeflash_output  # 1.59μs -> 1.41μs (13.1% faster)

def test_convert_fn_returns_different_type():
    # Test with convert_fn returning a different type.
    component = KwargPackComponent(convert_fn=lambda x: [x])
    codeflash_output = component._run_component(a=7); result = codeflash_output  # 1.26μs -> 1.26μs (0.158% slower)

# 2. Edge Test Cases

def test_convert_fn_is_none():
    # Test explicitly passing convert_fn=None.
    component = KwargPackComponent(convert_fn=None)
    codeflash_output = component._run_component(x=1); result = codeflash_output  # 724ns -> 749ns (3.34% slower)

def test_kwargs_with_none_values():
    # Test kwargs where values are None.
    component = KwargPackComponent()
    codeflash_output = component._run_component(a=None, b=2); result = codeflash_output  # 804ns -> 776ns (3.61% faster)

def test_convert_fn_returns_none():
    # Test convert_fn that returns None.
    component = KwargPackComponent(convert_fn=lambda x: None)
    codeflash_output = component._run_component(a=1, b=2); result = codeflash_output  # 1.43μs -> 1.31μs (9.72% faster)

def test_kwargs_with_mutable_objects():
    # Test with mutable objects (lists/dicts) as kwargs.
    mylist = [1,2]
    mydict = {"x": 1}
    component = KwargPackComponent()
    codeflash_output = component._run_component(l=mylist, d=mydict); result = codeflash_output  # 684ns -> 794ns (13.9% slower)

def test_convert_fn_modifies_mutable_object():
    # Test convert_fn that mutates the input object.
    def append_42(x):
        if isinstance(x, list):
            x.append(42)
        return x
    mylist = [1,2]
    component = KwargPackComponent(convert_fn=append_42)
    codeflash_output = component._run_component(l=mylist); result = codeflash_output  # 1.57μs -> 1.53μs (2.82% faster)

def test_kwargs_with_special_keys():
    # Test with keys that are Python keywords or special names.
    component = KwargPackComponent()
    codeflash_output = component._run_component(class_=1, def_=2, lambda_=3); result = codeflash_output  # 929ns -> 950ns (2.21% slower)

def test_convert_fn_raises_exception():
    # Test convert_fn that raises an exception.
    def bad_fn(x):
        raise ValueError("bad!")
    component = KwargPackComponent(convert_fn=bad_fn)
    with pytest.raises(ValueError):
        component._run_component(a=1)  # 1.79μs -> 1.59μs (12.3% faster)

def test_kwargs_are_not_shared_between_calls():
    # Ensure that kwargs from one call do not leak into another.
    component = KwargPackComponent()
    codeflash_output = component._run_component(a=1); result1 = codeflash_output  # 744ns -> 738ns (0.813% faster)
    codeflash_output = component._run_component(b=2); result2 = codeflash_output  # 349ns -> 334ns (4.49% faster)

def test_convert_fn_is_identity():
    # Test with convert_fn as identity function.
    component = KwargPackComponent(convert_fn=lambda x: x)
    codeflash_output = component._run_component(a=5, b=6); result = codeflash_output  # 1.45μs -> 1.29μs (12.4% faster)

def test_kwargs_with_nonstring_keys_not_possible():
    # kwargs only allows string keys, so this is not possible.
    # This test is to ensure that the function does not accept non-string keys.
    component = KwargPackComponent()
    with pytest.raises(TypeError):
        # This will raise before even entering the function
        component._run_component(**{1: "a"})

def test_kwargs_are_copied_not_referenced():
    # Ensure that changing the returned dict does not affect future calls.
    component = KwargPackComponent()
    out1 = component._run_component(a=1)["output"]  # 712ns -> 760ns (6.32% slower)
    out1["a"] = 999
    out2 = component._run_component(a=1)["output"]  # 414ns -> 395ns (4.81% faster)

def test_convert_fn_with_side_effects():
    # Test convert_fn that counts how many times it's called.
    calls = []
    def counting_fn(x):
        calls.append(x)
        return x
    component = KwargPackComponent(convert_fn=counting_fn)
    codeflash_output = component._run_component(a=1, b=2); result = codeflash_output  # 1.55μs -> 1.52μs (2.30% faster)

# 3. Large Scale Test Cases

def test_large_number_of_kwargs_no_convert():
    # Test with a large number of kwargs, no convert_fn.
    component = KwargPackComponent()
    kwargs = {f"k{i}": i for i in range(1000)}
    codeflash_output = component._run_component(**kwargs); result = codeflash_output  # 42.3μs -> 41.9μs (0.997% faster)

def test_large_number_of_kwargs_with_convert():
    # Test with a large number of kwargs and a convert_fn.
    component = KwargPackComponent(convert_fn=lambda x: x+1)
    kwargs = {f"k{i}": i for i in range(1000)}
    expected = {k: v+1 for k,v in kwargs.items()}
    codeflash_output = component._run_component(**kwargs); result = codeflash_output  # 128μs -> 117μs (8.88% faster)

def test_large_nested_structures_with_convert():
    # Test with large nested structures and a convert_fn.
    def flatten(x):
        if isinstance(x, list):
            return sum(x)
        return x
    component = KwargPackComponent(convert_fn=flatten)
    kwargs = {f"l{i}": [i, i+1, i+2] for i in range(500)}
    expected = {k: sum(v) for k,v in kwargs.items()}
    codeflash_output = component._run_component(**kwargs); result = codeflash_output  # 80.9μs -> 75.7μs (6.94% faster)

def test_performance_large_scale():
    # Test that function does not degrade with large inputs (sanity check).
    import time
    component = KwargPackComponent(convert_fn=lambda x: x)
    kwargs = {f"k{i}": i for i in range(1000)}
    start = time.time()
    codeflash_output = component._run_component(**kwargs); result = codeflash_output  # 106μs -> 98.1μs (8.77% faster)
    end = time.time()

codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
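The idea behind that comparison can be sketched as a small equivalence check that runs both versions on the same inputs and compares either the return value or the exception type. This is an illustration only, not Codeflash's actual mechanism; `outcome`, `same_behavior`, `original`, and `optimized` are hypothetical names.

```python
from typing import Any, Callable, Dict, Iterable, Tuple


def outcome(fn: Callable[..., Any], kwargs: Dict[str, Any]) -> Tuple[str, Any]:
    """Return ("ok", value) or ("raised", exception type) for one call."""
    try:
        return ("ok", fn(**kwargs))
    except Exception as exc:
        return ("raised", type(exc))


def same_behavior(
    original: Callable[..., Any],
    optimized: Callable[..., Any],
    cases: Iterable[Dict[str, Any]],
) -> bool:
    """True if both callables agree on every case, including failures."""
    return all(outcome(original, dict(c)) == outcome(optimized, dict(c)) for c in cases)


def original(**kwargs: Any) -> Dict[str, Any]:
    for k, v in kwargs.items():
        kwargs[k] = int(v)
    return {"output": kwargs}


def optimized(**kwargs: Any) -> Dict[str, Any]:
    to_int = int  # local alias, mirroring the cached convert_fn
    for k in kwargs:
        kwargs[k] = to_int(kwargs[k])
    return {"output": kwargs}


cases = [{}, {"a": "1"}, {"a": "1", "b": "2"}, {"a": "not a number"}]
print(same_behavior(original, optimized, cases))
```

Comparing exception types as well as return values matters here because several of the generated tests expect `convert_fn` to raise.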

#------------------------------------------------
from typing import Any, Callable, Optional

# imports
import pytest  # used for our unit tests
from llama_index.core.query_pipeline.components.argpacks import KwargPackComponent

# unit tests

# --------- Basic Test Cases ---------

def test_basic_no_kwargs():
    # Test with no kwargs, should return empty dict in output
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(); result = codeflash_output  # 588ns -> 535ns (9.91% faster)

def test_basic_single_kwarg():
    # Test with a single kwarg
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(foo=42); result = codeflash_output  # 761ns -> 753ns (1.06% faster)

def test_basic_multiple_kwargs():
    # Test with multiple kwargs
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(a=1, b=2, c=3); result = codeflash_output  # 871ns -> 907ns (3.97% slower)

def test_basic_convert_fn_applied():
    # Test with convert_fn that adds 1 to each value
    comp = KwargPackComponent(convert_fn=lambda x: x + 1)
    codeflash_output = comp._run_component(a=1, b=2); result = codeflash_output  # 1.50μs -> 1.36μs (10.4% faster)

def test_basic_convert_fn_stringify():
    # Test with convert_fn that stringifies values
    comp = KwargPackComponent(convert_fn=str)
    codeflash_output = comp._run_component(foo=10, bar=True); result = codeflash_output  # 1.66μs -> 1.54μs (7.32% faster)

# --------- Edge Test Cases ---------

def test_edge_empty_string_key():
    # Keys can't be empty in kwargs but can be tested by passing a dict directly
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(**{"": "empty"}); result = codeflash_output  # 788ns -> 748ns (5.35% faster)

def test_edge_special_char_keys():
    # Test with keys containing special characters
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(**{"@key!": 123, "sp ace": "val"}); result = codeflash_output  # 762ns -> 741ns (2.83% faster)

def test_edge_none_value():
    # Test with None value
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(foo=None); result = codeflash_output  # 758ns -> 760ns (0.263% slower)

def test_edge_convert_fn_returns_none():
    # convert_fn returns None for all values
    comp = KwargPackComponent(convert_fn=lambda x: None)
    codeflash_output = comp._run_component(a=1, b=2); result = codeflash_output  # 1.47μs -> 1.35μs (8.43% faster)

def test_edge_convert_fn_raises():
    # convert_fn raises an exception for one value
    def fn(x):
        if x == 'bad':
            raise ValueError("bad value")
        return x
    comp = KwargPackComponent(convert_fn=fn)
    with pytest.raises(ValueError):
        comp._run_component(a='bad', b='good')  # 1.81μs -> 1.67μs (8.55% faster)

def test_edge_mutable_values():
    # Test with mutable values (lists, dicts)
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(lst=[1,2], dct={'x': 1}); result = codeflash_output  # 806ns -> 790ns (2.03% faster)

def test_edge_convert_fn_mutates_object():
    # convert_fn mutates object in place
    def fn(x):
        if isinstance(x, list):
            x.append('mutated')
        return x
    comp = KwargPackComponent(convert_fn=fn)
    val = [1]
    codeflash_output = comp._run_component(foo=val); result = codeflash_output  # 1.69μs -> 1.56μs (7.74% faster)

def test_edge_convert_fn_on_empty_kwargs():
    # convert_fn should not be called if no kwargs
    called = []
    def fn(x):
        called.append(x)
        return x
    comp = KwargPackComponent(convert_fn=fn)
    codeflash_output = comp._run_component(); result = codeflash_output  # 700ns -> 648ns (8.02% faster)

def test_edge_convert_fn_returns_key_type():
    # convert_fn returns a type as value
    comp = KwargPackComponent(convert_fn=lambda x: type(x))
    codeflash_output = comp._run_component(a=1); result = codeflash_output  # 1.40μs -> 1.37μs (2.33% faster)

def test_edge_convert_fn_is_identity():
    # convert_fn is identity function
    comp = KwargPackComponent(convert_fn=lambda x: x)
    codeflash_output = comp._run_component(foo="bar"); result = codeflash_output  # 1.28μs -> 1.22μs (4.66% faster)

# --------- Large Scale Test Cases ---------

def test_large_scale_1000_kwargs():
    # Test with 1000 kwargs
    comp = KwargPackComponent()
    kwargs = {f"key{i}": i for i in range(1000)}
    codeflash_output = comp._run_component(**kwargs); result = codeflash_output  # 40.6μs -> 41.0μs (0.886% slower)

def test_large_scale_convert_fn_on_1000_kwargs():
    # Test with 1000 kwargs and convert_fn doubles values
    comp = KwargPackComponent(convert_fn=lambda x: x * 2)
    kwargs = {f"key{i}": i for i in range(1000)}
    expected = {k: v * 2 for k, v in kwargs.items()}
    codeflash_output = comp._run_component(**kwargs); result = codeflash_output  # 127μs -> 117μs (7.75% faster)

def test_large_scale_kwargs_with_large_values():
    # Test with kwargs where each value is a large list
    comp = KwargPackComponent()
    kwargs = {f"key{i}": [i] * 100 for i in range(10)}
    codeflash_output = comp._run_component(**kwargs); result = codeflash_output  # 1.49μs -> 1.54μs (2.86% slower)

def test_large_scale_convert_fn_expensive():
    # convert_fn is expensive (simulate with sum of list)
    comp = KwargPackComponent(convert_fn=lambda x: sum(x) if isinstance(x, list) else x)
    kwargs = {f"key{i}": [i] * 10 for i in range(100)}
    expected = {k: sum(v) for k, v in kwargs.items()}
    codeflash_output = comp._run_component(**kwargs); result = codeflash_output  # 19.7μs -> 18.4μs (7.13% faster)

def test_large_scale_convert_fn_stringify_keys_and_values():
    # convert_fn stringifies values, keys are already strings
    comp = KwargPackComponent(convert_fn=str)
    kwargs = {str(i): i for i in range(500)}
    expected = {str(i): str(i) for i in range(500)}
    codeflash_output = comp._run_component(**kwargs); result = codeflash_output  # 60.7μs -> 55.8μs (8.64% faster)

# --------- Miscellaneous Robustness ---------

def test_robustness_kwargs_types():
    # Test with various types as values
    comp = KwargPackComponent()
    codeflash_output = comp._run_component(
        int_val=1,
        float_val=2.5,
        str_val="abc",
        bool_val=True,
        none_val=None,
        tuple_val=(1,2),
        list_val=[1,2],
        dict_val={'a': 1}
    ); result = codeflash_output  # 1.29μs -> 1.31μs (1.98% slower)
    expected = {
        "int_val": 1,
        "float_val": 2.5,
        "str_val": "abc",
        "bool_val": True,
        "none_val": None,
        "tuple_val": (1,2),
        "list_val": [1,2],
        "dict_val": {'a': 1}
    }

def test_robustness_convert_fn_on_various_types():
    # convert_fn returns type name
    comp = KwargPackComponent(convert_fn=lambda x: type(x).__name__)
    codeflash_output = comp._run_component(
        int_val=1,
        float_val=2.5,
        str_val="abc",
        bool_val=True,
        none_val=None,
        tuple_val=(1,2),
        list_val=[1,2],
        dict_val={'a': 1}
    ); result = codeflash_output  # 3.64μs -> 3.48μs (4.48% faster)
    expected = {
        "int_val": "int",
        "float_val": "float",
        "str_val": "str",
        "bool_val": "bool",
        "none_val": "NoneType",
        "tuple_val": "tuple",
        "list_val": "list",
        "dict_val": "dict"
    }

def test_robustness_convert_fn_is_lambda_with_side_effect():
    # convert_fn increments a counter
    counter = {"count": 0}
    def fn(x):
        counter["count"] += 1
        return x
    comp = KwargPackComponent(convert_fn=fn)
    codeflash_output = comp._run_component(a=1, b=2, c=3); result = codeflash_output  # 1.91μs -> 1.92μs (0.936% slower)

def test_robustness_convert_fn_is_none():
    # convert_fn is explicitly None
    comp = KwargPackComponent(convert_fn=None)
    codeflash_output = comp._run_component(a=1); result = codeflash_output  # 719ns -> 743ns (3.23% slower)

def test_robustness_kwargs_are_not_mutated_outside():
    # Ensure original kwargs dict is not mutated outside
    comp = KwargPackComponent(convert_fn=lambda x: x + 1)
    kwargs = {"a": 1, "b": 2}
    codeflash_output = comp._run_component(**kwargs); result = codeflash_output  # 1.42μs -> 1.35μs (5.42% faster)

codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-KwargPackComponent._run_component-mhvj6ot1` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 12, 2025 04:59
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Nov 12, 2025