⚡️ Speed up method ReActAgent._get_response by 6%
#131
📄 6% (0.06x) speedup for `ReActAgent._get_response` in `llama-index-core/llama_index/core/agent/legacy/react/base.py`
⏱️ Runtime: 18.6 microseconds → 17.7 microseconds (best of 51 runs)
📝 Explanation and details
The optimized code achieves a roughly 5-6% speedup through several micro-optimizations that reduce redundant operations.
Key Optimizations:
Single length calculation: The most significant improvement comes from calculating `len(current_reasoning)` once and storing it in `curr_len`, rather than calling `len()` twice in the conditional checks. This eliminates one function call per invocation.
Conditional formatter/parser initialization: Instead of `or` expressions, the code uses explicit if-else blocks to instantiate `ReActChatFormatter()` and `ReActOutputParser()` only when they are actually needed. Because Python's `or` already short-circuits, any gain from this change is likely marginal.
Direct method reference: For the tool retriever case, the code assigns `tool_retriever_c.retrieve` directly instead of wrapping it in a lambda, eliminating the lambda call overhead.
Pythonic truthiness check: Using `if tools:` instead of `if len(tools) > 0` leverages Python's efficient truthiness evaluation for sequences.
Performance Impact:
The line profiler shows that the length calculation optimization is the most effective: the conditional checks now run ~25% faster (from 1158.9 ns to 655.2 ns per hit for the first check). While individual gains are small, they add up if `_get_response` is called frequently, and the test results show 3-7% improvements across most test cases.
Test Case Performance:
The optimization is most effective on normal execution paths (the single-step and multiple-step tests improve by 7.81% and 3.70%), while error cases show minimal or slightly negative impact, which is attributed to the additional variable assignment made before an exception is raised immediately.
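The patterns described above can be illustrated with a minimal stand-alone sketch. This is not the code from `base.py`: the function names, the `Step` and `Retriever` classes, and the error messages are assumptions made for illustration, and the behaviour is inferred only from the explanation and the generated tests (empty input raises, reaching `max_iterations` raises, otherwise the last step's `response` is used).

```python
from typing import Any, List, Sequence


class Step:
    """Hypothetical stand-in for a reasoning step carrying a response."""

    def __init__(self, response: str) -> None:
        self.response = response


def get_response_before(steps: Sequence[Any], max_iterations: int) -> Any:
    """Assumed 'before' shape: len() is evaluated once per check."""
    if len(steps) == 0:
        raise ValueError("No reasoning steps were taken.")
    elif len(steps) == max_iterations:
        raise ValueError("Reached max iterations.")
    return steps[-1].response


def get_response_after(steps: Sequence[Any], max_iterations: int) -> Any:
    """Assumed 'after' shape: len() is computed once and reused."""
    curr_len = len(steps)
    if curr_len == 0:
        raise ValueError("No reasoning steps were taken.")
    elif curr_len == max_iterations:
        raise ValueError("Reached max iterations.")
    return steps[-1].response


def has_tools_len(tools: List[Any]) -> bool:
    # Explicit length comparison.
    return len(tools) > 0


def has_tools_truthy(tools: List[Any]) -> bool:
    # Truthiness check: equivalent for lists and tuples, but not for None
    # or for objects that define their own __bool__.
    return bool(tools)


class Retriever:
    """Hypothetical retriever used to compare a lambda with a bound method."""

    def retrieve(self, message: str) -> List[str]:
        return [message.upper()]


retriever = Retriever()
via_lambda = lambda message: retriever.retrieve(message)  # extra call frame
via_method = retriever.retrieve  # bound method, called directly
```

Both `get_response_*` variants behave identically; the only difference is how often `len()` is called. The truthiness check is a drop-in replacement only when `tools` is guaranteed to be a real sequence.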
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
import pytest
from llama_index.core.agent.legacy.react.base import ReActAgent


# Minimal stubs for dependencies, since we can't import llama_index
class BaseReasoningStep:
    pass


class ResponseReasoningStep(BaseReasoningStep):
    def __init__(self, response):
        self.response = response


class AgentChatResponse:
    def __init__(self, response, sources):
        self.response = response
        self.sources = sources


class DummyLLM:
    def __init__(self):
        self.callback_manager = None


class DummyMemory:
    pass


# ----------------------
# UNIT TESTS START HERE
# ----------------------

# 1. BASIC TEST CASES
def test_basic_single_reasoning_step():
    """Test with a single valid reasoning step."""
    agent = ReActAgent(
        tools=[],
        llm=DummyLLM(),
        memory=DummyMemory(),
        max_iterations=5
    )
    steps = [ResponseReasoningStep("Hello world!")]
    codeflash_output = agent._get_response(steps); response = codeflash_output # 3.13μs -> 2.91μs (7.81% faster)

def test_basic_multiple_reasoning_steps():
    """Test with several reasoning steps, last one is used."""
    agent = ReActAgent(
        tools=[],
        llm=DummyLLM(),
        memory=DummyMemory(),
        max_iterations=5
    )
    steps = [
        ResponseReasoningStep("Step 1"),
        ResponseReasoningStep("Step 2"),
        ResponseReasoningStep("Final answer"),
    ]
    codeflash_output = agent._get_response(steps); response = codeflash_output # 2.46μs -> 2.38μs (3.70% faster)

def test_empty_reasoning_raises():
    """Test that empty reasoning steps raises ValueError."""
    agent = ReActAgent(
        tools=[],
        llm=DummyLLM(),
        memory=DummyMemory(),
        max_iterations=3
    )
    with pytest.raises(ValueError) as excinfo:
        agent._get_response([]) # 1.18μs -> 1.22μs (3.28% slower)

def test_max_iterations_raises():
    """Test that hitting max_iterations raises ValueError."""
    agent = ReActAgent(
        tools=[],
        llm=DummyLLM(),
        memory=DummyMemory(),
        max_iterations=3
    )
    steps = [
        ResponseReasoningStep("1"),
        ResponseReasoningStep("2"),
        ResponseReasoningStep("3"),
    ]  # len(steps) == max_iterations
    with pytest.raises(ValueError) as excinfo:
        agent._get_response(steps) # 1.20μs -> 1.22μs (1.73% slower)

def test_non_response_reasoning_step_last():
    """Test that a non-ResponseReasoningStep as last step raises AttributeError."""
    agent = ReActAgent(
        tools=[],
        llm=DummyLLM(),
        memory=DummyMemory(),
        max_iterations=5
    )

    class DummyStep(BaseReasoningStep):
        pass

    steps = [
        ResponseReasoningStep("First"),
        DummyStep(),
    ]
    # Should raise AttributeError because DummyStep has no 'response' attribute
    with pytest.raises(AttributeError):
        agent._get_response(steps) # 2.14μs -> 2.01μs (6.42% faster)

def test_none_in_steps():
    """Test that a None in the steps list (not last) does not affect output."""
    agent = ReActAgent(
        tools=[],
        llm=DummyLLM(),
        memory=DummyMemory(),
        max_iterations=5
    )
    steps = [
        None,
        ResponseReasoningStep("Final"),
    ]
    # Should work, since only the last step matters
    codeflash_output = agent._get_response(steps); response = codeflash_output # 3.17μs -> 2.97μs (6.78% faster)

def test_none_as_last_step():
    """Test that None as last step raises AttributeError."""
    agent = ReActAgent(
        tools=[],
        llm=DummyLLM(),
        memory=DummyMemory(),
        max_iterations=5
    )
    steps = [
        ResponseReasoningStep("First"),
        None,
    ]
    with pytest.raises(AttributeError):
        agent._get_response(steps) # 1.72μs -> 1.78μs (3.42% slower)
# ------------------------------------------------
from typing import List

# imports
import pytest  # used for our unit tests
from llama_index.core.agent.legacy.react.base import ReActAgent


# Minimal stubs for dependencies for testing purposes
class BaseReasoningStep:
    pass


class ResponseReasoningStep(BaseReasoningStep):
    def __init__(self, response):
        self.response = response


class AgentChatResponse:
    def __init__(self, response, sources):
        self.response = response
        self.sources = sources


# unit tests

# Basic Test Cases
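The runtimes quoted in this report are on the order of microseconds, where measurement noise can easily be comparable to a 3-7% difference. A minimal sketch of a best-of-N measurement with the standard-library `timeit` module follows. It assumes that "best of N runs" means the minimum over N repeated measurements; how Codeflash actually measures is not stated in this report, and `best_of`, `check_twice`, and `check_once` are hypothetical names.

```python
import timeit
from typing import Callable, List

# Hypothetical inputs standing in for the reasoning steps and the limit.
DATA: List[int] = [1, 2, 3]
MAX_ITERATIONS = 5


def check_twice() -> bool:
    """Evaluate len() in each comparison."""
    return len(DATA) == 0 or len(DATA) == MAX_ITERATIONS


def check_once() -> bool:
    """Evaluate len() once and reuse the result."""
    n = len(DATA)
    return n == 0 or n == MAX_ITERATIONS


def best_of(fn: Callable[[], object], repeats: int = 51, number: int = 10_000) -> float:
    """Return the lowest per-call time in seconds over `repeats` measurements."""
    totals = timeit.repeat(fn, repeat=repeats, number=number)
    return min(totals) / number


if __name__ == "__main__":
    print(f"len() twice: {best_of(check_twice) * 1e9:.1f} ns per call")
    print(f"len() once:  {best_of(check_once) * 1e9:.1f} ns per call")
```

Taking the minimum rather than the mean reduces the influence of scheduler and cache noise, although the absolute numbers still vary between machines and Python versions.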
To edit these changes, `git checkout codeflash/optimize-ReActAgent._get_response-mhvc4ff0` and push.