⚡️ Speed up function get_child_nodes by 52%
#127
+6
−9
📄 52% (0.52x) speedup for `get_child_nodes` in `llama-index-core/llama_index/core/node_parser/relational/hierarchical.py`

⏱️ Runtime: 436 microseconds → 287 microseconds (best of 225 runs)

📝 Explanation and details
The optimization achieves a 52% speedup by replacing the inefficient list-based membership testing with a set-based approach. Here are the key changes:

What was optimized:
- `children_ids` changed from a list to a set. This is the critical optimization: it eliminates the O(n) list membership test (`candidate_node.node_id not in children_ids`) that dominated runtime in the original code.
- `set.update()` used instead of `list.extend()`. More efficient for adding multiple items to a set.

Why this creates a speedup:
The original code's bottleneck was line 33 (`if candidate_node.node_id not in children_ids`), which consumed 21.3% of total runtime. With a list, each membership test requires scanning through all collected child IDs (O(n) complexity). The optimized version uses a set, where membership testing is O(1) on average, dramatically reducing this overhead.

Performance characteristics by test case:
The line profiler confirms this: the optimized version eliminates the expensive loop over `all_nodes` in most cases, with the filtering now happening in a single efficient list comprehension that benefits from the set's O(1) lookups.

✅ Correctness verification report:
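To make the list-versus-set point from the explanation concrete, here is a minimal, self-contained sketch. It is not the actual diff from this PR and does not use the llama_index classes: `Node`, `child_ids`, `get_children_list`, and `get_children_set` are hypothetical names invented for illustration, and the filtering logic is simplified. It only shows that the two variants return the same nodes while differing in how membership is tested.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a node; NOT the llama_index BaseNode."""
    node_id: str
    child_ids: List[str] = field(default_factory=list)


def get_children_list(nodes: List[Node], all_nodes: List[Node]) -> List[Node]:
    """List-based variant: every `in` test scans the list (O(n) per lookup)."""
    children_ids: List[str] = []
    for node in nodes:
        children_ids.extend(node.child_ids)
    return [n for n in all_nodes if n.node_id in children_ids]


def get_children_set(nodes: List[Node], all_nodes: List[Node]) -> List[Node]:
    """Set-based variant: hash lookup (O(1) on average per membership test)."""
    children_ids = set()
    for node in nodes:
        children_ids.update(node.child_ids)
    return [n for n in all_nodes if n.node_id in children_ids]


# Assumed workload: one parent with many children (a large fan-out).
parent = Node("parent", [f"child_{i}" for i in range(999)])
children = [Node(f"child_{i}") for i in range(999)]
all_nodes = [parent] + children

# Both variants select the same nodes, in the order they appear in all_nodes.
list_result = get_children_list([parent], all_nodes)
set_result = get_children_set([parent], all_nodes)
```

With the list variant, the total work grows roughly with the number of candidates times the number of collected child IDs; with the set variant, it grows roughly with their sum. That is why a large fan-out is the kind of input where the difference would be expected to show most clearly.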
⚙️ Existing Unit Tests and Runtime
🌀 Generated Regression Tests and Runtime
from typing import Dict, List

# imports
import pytest
from llama_index.core.node_parser.relational.hierarchical import get_child_nodes


# --- Mock Classes to simulate llama_index.core.schema.BaseNode and NodeRelationship ---
class NodeRelationship:
    # Simulate Enum values
    CHILD = "child"
    PARENT = "parent"
    SIBLING = "sibling"


class Relationship:
    # Simulate relationship object with node_id
    def __init__(self, node_id: str):
        self.node_id = node_id


class BaseNode:
    # Simulate a node with node_id and relationships dict
    def __init__(self, node_id: str, relationships: Dict[str, List[Relationship]] = None):
        self.node_id = node_id
        self.relationships = relationships if relationships is not None else {}


# --- Unit Tests ---
# 1. Basic Test Cases


def test_single_node_with_one_child():
    # Node A has child B
    node_b = BaseNode("B")
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B")]})
    all_nodes = [node_a, node_b]
    # Expect B as child of A
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.39μs -> 1.44μs (3.69% slower)


def test_single_node_with_multiple_children():
    # Node A has children B and C
    node_b = BaseNode("B")
    node_c = BaseNode("C")
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B"), Relationship("C")]})
    all_nodes = [node_a, node_b, node_c]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.15μs -> 1.16μs (0.947% slower)


def test_multiple_nodes_with_children():
    # Node A has child B, Node D has child E
    node_b = BaseNode("B")
    node_e = BaseNode("E")
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B")]})
    node_d = BaseNode("D", relationships={NodeRelationship.CHILD: [Relationship("E")]})
    all_nodes = [node_a, node_b, node_d, node_e]
    codeflash_output = get_child_nodes([node_a, node_d], all_nodes); result = codeflash_output  # 1.23μs -> 1.23μs (0.244% slower)


def test_no_children_relationship():
    # Node A has no CHILD relationship
    node_a = BaseNode("A")
    node_b = BaseNode("B")
    all_nodes = [node_a, node_b]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.06μs -> 1.04μs (1.82% faster)


def test_empty_nodes_list():
    # Empty input nodes
    node_a = BaseNode("A")
    all_nodes = [node_a]
    codeflash_output = get_child_nodes([], all_nodes); result = codeflash_output  # 643ns -> 715ns (10.1% slower)


def test_empty_all_nodes_list():
    # Empty all_nodes
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B")]})
    codeflash_output = get_child_nodes([node_a], []); result = codeflash_output  # 948ns -> 1.01μs (6.05% slower)


def test_child_not_in_all_nodes():
    # Node A has child B, but B is not in all_nodes
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B")]})
    node_c = BaseNode("C")
    all_nodes = [node_a, node_c]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.08μs -> 1.11μs (1.99% slower)


def test_duplicate_child_ids():
    # Node A and D both have child B
    node_b = BaseNode("B")
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B")]})
    node_d = BaseNode("D", relationships={NodeRelationship.CHILD: [Relationship("B")]})
    all_nodes = [node_a, node_b, node_d]
    codeflash_output = get_child_nodes([node_a, node_d], all_nodes); result = codeflash_output  # 1.35μs -> 1.13μs (19.3% faster)


# 2. Edge Test Cases


def test_node_with_multiple_relationship_types():
    # Node A has CHILD and PARENT relationships
    node_b = BaseNode("B")
    node_c = BaseNode("C")
    node_a = BaseNode("A", relationships={
        NodeRelationship.CHILD: [Relationship("B")],
        NodeRelationship.PARENT: [Relationship("C")]
    })
    all_nodes = [node_a, node_b, node_c]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.12μs -> 992ns (12.8% faster)


def test_node_with_empty_child_list():
    # Node A has CHILD relationship but empty list
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: []})
    node_b = BaseNode("B")
    all_nodes = [node_a, node_b]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.06μs -> 999ns (6.01% faster)


def test_all_nodes_with_no_children():
    # All nodes lack CHILD relationships
    node_a = BaseNode("A")
    node_b = BaseNode("B")
    all_nodes = [node_a, node_b]
    codeflash_output = get_child_nodes([node_a, node_b], all_nodes); result = codeflash_output  # 1.13μs -> 1.14μs (1.14% slower)


def test_nodes_with_non_string_node_ids():
    # Node IDs are integers
    node_b = BaseNode(2)
    node_a = BaseNode(1, relationships={NodeRelationship.CHILD: [Relationship(2)]})
    all_nodes = [node_a, node_b]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.08μs -> 974ns (11.4% faster)


def test_nodes_with_mixed_id_types():
    # Node IDs are mixed types
    node_b = BaseNode("B")
    node_c = BaseNode(3)
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B"), Relationship(3)]})
    all_nodes = [node_a, node_b, node_c]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.08μs -> 965ns (11.5% faster)


def test_child_id_not_unique_in_all_nodes():
    # Two nodes in all_nodes have same node_id
    node_b1 = BaseNode("B")
    node_b2 = BaseNode("B")
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B")]})
    all_nodes = [node_a, node_b1, node_b2]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.02μs -> 963ns (5.92% faster)


def test_node_with_none_relationships():
    # Node relationships is None
    node_a = BaseNode("A", relationships=None)
    node_b = BaseNode("B")
    all_nodes = [node_a, node_b]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.01μs -> 1.03μs (2.61% slower)


def test_node_with_unexpected_relationship_key():
    # Node has a relationship key that's not CHILD
    node_a = BaseNode("A", relationships={"random": [Relationship("B")]})
    node_b = BaseNode("B")
    all_nodes = [node_a, node_b]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.01μs -> 924ns (9.42% faster)


# 3. Large Scale Test Cases


def test_large_number_of_nodes_and_children():
    # Create 500 parent nodes, each with one child
    num_nodes = 500
    child_nodes = [BaseNode(f"C{i}") for i in range(num_nodes)]
    parent_nodes = [BaseNode(f"P{i}", relationships={NodeRelationship.CHILD: [Relationship(f"C{i}")]}) for i in range(num_nodes)]
    all_nodes = parent_nodes + child_nodes
    codeflash_output = get_child_nodes(parent_nodes, all_nodes); result = codeflash_output  # 53.6μs -> 30.2μs (77.2% faster)


def test_large_nodes_some_missing_children():
    # 1000 parent nodes, only even parents have children present in all_nodes
    num_nodes = 1000
    child_nodes = [BaseNode(f"C{i}") for i in range(0, num_nodes, 2)]
    parent_nodes = [BaseNode(f"P{i}", relationships={NodeRelationship.CHILD: [Relationship(f"C{i}")]}) for i in range(num_nodes)]
    all_nodes = parent_nodes + child_nodes
    codeflash_output = get_child_nodes(parent_nodes, all_nodes); result = codeflash_output  # 99.3μs -> 63.0μs (57.5% faster)


def test_large_nodes_with_duplicate_children():
    # 100 nodes, all have child "X"
    num_nodes = 100
    child_x = BaseNode("X")
    parent_nodes = [BaseNode(f"P{i}", relationships={NodeRelationship.CHILD: [Relationship("X")]}) for i in range(num_nodes)]
    all_nodes = parent_nodes + [child_x]
    codeflash_output = get_child_nodes(parent_nodes, all_nodes); result = codeflash_output  # 10.8μs -> 8.12μs (33.0% faster)


def test_large_nodes_no_children():
    # 1000 nodes, none have CHILD relationships
    nodes = [BaseNode(f"N{i}") for i in range(1000)]
    all_nodes = nodes[:]
    codeflash_output = get_child_nodes(nodes, all_nodes); result = codeflash_output  # 86.8μs -> 62.5μs (38.9% faster)


def test_large_nodes_with_mixed_relationships():
    # 500 nodes, half have CHILD, half have PARENT relationships
    num_nodes = 500
    child_nodes = [BaseNode(f"C{i}") for i in range(num_nodes // 2)]
    parent_nodes = [BaseNode(f"P{i}", relationships={NodeRelationship.CHILD: [Relationship(f"C{i}")]}) for i in range(num_nodes // 2)]
    parent_nodes += [BaseNode(f"P{i}", relationships={NodeRelationship.PARENT: [Relationship(f"C{i}")]}) for i in range(num_nodes // 2, num_nodes)]
    all_nodes = parent_nodes + child_nodes
    codeflash_output = get_child_nodes(parent_nodes, all_nodes); result = codeflash_output  # 49.7μs -> 31.7μs (56.6% faster)

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import Any, Dict, List

# imports
import pytest
from llama_index.core.node_parser.relational.hierarchical import get_child_nodes


# --- Minimal stubs for BaseNode and NodeRelationship to allow standalone testing ---
class NodeRelationship:
    # Simulate an Enum for relationships
    CHILD = "child"
    PARENT = "parent"
    SIBLING = "sibling"
    # Add more if needed


class Relationship:
    # Represents a relationship to another node
    def __init__(self, node_id: str):
        self.node_id = node_id


class BaseNode:
    def __init__(self, node_id: str, relationships: Dict[str, List['Relationship']] = None):
        self.node_id = node_id
        # relationships: Dict[relationship_type: str, List[Relationship]]
        self.relationships = relationships if relationships is not None else {}


# ------------------ UNIT TESTS ------------------

# --- Basic Test Cases ---


def test_single_node_with_one_child():
    # Node A has child B; all_nodes contains both
    node_b = BaseNode("B")
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B")]})
    all_nodes = [node_a, node_b]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.41μs -> 1.37μs (2.48% faster)


def test_single_node_with_multiple_children():
    # Node A has children B and C
    node_b = BaseNode("B")
    node_c = BaseNode("C")
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B"), Relationship("C")]})
    all_nodes = [node_a, node_b, node_c]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.18μs -> 1.09μs (8.56% faster)


def test_multiple_nodes_with_children():
    # Node A has child B; Node C has child D
    node_b = BaseNode("B")
    node_d = BaseNode("D")
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B")]})
    node_c = BaseNode("C", relationships={NodeRelationship.CHILD: [Relationship("D")]})
    all_nodes = [node_a, node_b, node_c, node_d]
    codeflash_output = get_child_nodes([node_a, node_c], all_nodes); result = codeflash_output  # 1.32μs -> 1.23μs (6.90% faster)


def test_node_with_no_children():
    # Node A has no children
    node_a = BaseNode("A")
    all_nodes = [node_a]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.06μs -> 1.08μs (1.85% slower)


def test_node_with_unrelated_relationships():
    # Node A has only a PARENT relationship, not CHILD
    node_a = BaseNode("A", relationships={NodeRelationship.PARENT: [Relationship("X")]})
    all_nodes = [node_a]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.05μs -> 1.01μs (4.04% faster)


# --- Edge Test Cases ---


def test_empty_nodes_list():
    # No nodes provided
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B")]})
    node_b = BaseNode("B")
    all_nodes = [node_a, node_b]
    codeflash_output = get_child_nodes([], all_nodes); result = codeflash_output  # 729ns -> 691ns (5.50% faster)


def test_empty_all_nodes_list():
    # No available nodes to match children
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B")]})
    codeflash_output = get_child_nodes([node_a], []); result = codeflash_output  # 1.04μs -> 1.07μs (2.53% slower)


def test_child_not_in_all_nodes():
    # Node A claims child B, but B is not in all_nodes
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B")]})
    node_c = BaseNode("C")
    all_nodes = [node_a, node_c]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.14μs -> 998ns (14.2% faster)


def test_duplicate_child_relationships():
    # Node A lists B as child twice
    node_b = BaseNode("B")
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B"), Relationship("B")]})
    all_nodes = [node_a, node_b]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.09μs -> 1.03μs (5.61% faster)


def test_multiple_nodes_with_overlapping_children():
    # Node A and Node C both have B as a child
    node_b = BaseNode("B")
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship("B")]})
    node_c = BaseNode("C", relationships={NodeRelationship.CHILD: [Relationship("B")]})
    all_nodes = [node_a, node_b, node_c]
    codeflash_output = get_child_nodes([node_a, node_c], all_nodes); result = codeflash_output  # 1.30μs -> 1.21μs (7.27% faster)


def test_nodes_with_no_relationships_key():
    # Node A has relationships=None (should be handled as empty)
    node_a = BaseNode("A", relationships=None)
    node_b = BaseNode("B")
    all_nodes = [node_a, node_b]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.04μs -> 1.01μs (2.36% faster)


def test_nodes_with_empty_relationships_dict():
    # Node A has relationships as empty dict
    node_a = BaseNode("A", relationships={})
    node_b = BaseNode("B")
    all_nodes = [node_a, node_b]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.02μs -> 1.00μs (2.20% faster)


def test_child_with_none_node_id():
    # Node A has a child relationship with node_id None
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: [Relationship(None)]})
    node_b = BaseNode("B")
    all_nodes = [node_a, node_b]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 970ns -> 944ns (2.75% faster)


def test_relationships_with_empty_list():
    # Node A has CHILD relationship but empty list
    node_a = BaseNode("A", relationships={NodeRelationship.CHILD: []})
    node_b = BaseNode("B")
    all_nodes = [node_a, node_b]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.05μs -> 929ns (13.0% faster)


def test_nodes_with_mixed_relationship_types():
    # Node A has both CHILD and SIBLING relationships
    node_b = BaseNode("B")
    node_c = BaseNode("C")
    node_a = BaseNode("A", relationships={
        NodeRelationship.CHILD: [Relationship("B")],
        NodeRelationship.SIBLING: [Relationship("C")]
    })
    all_nodes = [node_a, node_b, node_c]
    codeflash_output = get_child_nodes([node_a], all_nodes); result = codeflash_output  # 1.02μs -> 962ns (5.82% faster)


# --- Large Scale Test Cases ---


def test_large_number_of_nodes_and_relationships():
    # 100 nodes, each node_i has child node_{i+1}
    N = 100
    nodes = []
    all_nodes = []
    for i in range(N):
        node_id = f"node_{i}"
        # Each node except the last has a child
        if i < N - 1:
            relationships = {NodeRelationship.CHILD: [Relationship(f"node_{i+1}")]}
        else:
            relationships = {}
        node = BaseNode(node_id, relationships=relationships)
        nodes.append(node)
        all_nodes.append(node)
    # Get all children of all nodes except the last
    codeflash_output = get_child_nodes(nodes[:-1], all_nodes); result = codeflash_output  # 10.8μs -> 7.88μs (36.4% faster)
    expected_ids = {f"node_{i+1}" for i in range(N - 1)}


def test_large_fanout():
    # One node with 999 children
    parent = BaseNode("parent", relationships={NodeRelationship.CHILD: [Relationship(f"child_{i}") for i in range(999)]})
    children = [BaseNode(f"child_{i}") for i in range(999)]
    all_nodes = [parent] + children
    codeflash_output = get_child_nodes([parent], all_nodes); result = codeflash_output  # 22.0μs -> 1.52μs (1345% faster)


def test_large_number_of_unrelated_nodes():
    # 500 nodes, none are children of any other
    all_nodes = [BaseNode(f"node_{i}") for i in range(500)]
    codeflash_output = get_child_nodes(all_nodes, all_nodes); result = codeflash_output  # 46.5μs -> 33.3μs (39.8% faster)


def test_large_number_of_nodes_with_some_missing_children():
    # 100 nodes, each claims a child node_{i+100}, but only first 50 children exist
    nodes = []
    all_nodes = []
    for i in range(100):
        child_id = f"node_{i+100}"
        relationships = {NodeRelationship.CHILD: [Relationship(child_id)]}
        node = BaseNode(f"node_{i}", relationships=relationships)
        nodes.append(node)
        all_nodes.append(node)
    # Add only first 50 children to all_nodes
    for i in range(100, 150):
        all_nodes.append(BaseNode(f"node_{i}"))
    codeflash_output = get_child_nodes(nodes, all_nodes); result = codeflash_output  # 11.9μs -> 8.00μs (48.4% faster)
    # Only first 50 children should be returned
    expected_ids = {f"node_{i}" for i in range(100, 150)}


def test_large_scale_duplicate_children():
    # 100 parents, each claims same 10 children
    children = [BaseNode(f"child_{i}") for i in range(10)]
    parents = [BaseNode(f"parent_{i}", relationships={NodeRelationship.CHILD: [Relationship(f"child_{j}") for j in range(10)]}) for i in range(100)]
    all_nodes = parents + children
    codeflash_output = get_child_nodes(parents, all_nodes); result = codeflash_output  # 11.4μs -> 8.05μs (41.3% faster)

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, `git checkout codeflash/optimize-get_child_nodes-mhv9jyku` and push.