⚡️ Speed up function find_last_node by 16,160%
#213
📄 16,160% (161.60x) speedup for `find_last_node` in `src/algorithms/graph.py`
⏱️ Runtime: 73.3 milliseconds → 451 microseconds (best of 250 runs)
📝 Explanation and details
The optimized code achieves a 162x speedup by replacing a per-node scan of every edge with a single precomputed set lookup.
Key optimization:
The original code uses `all(e["source"] != n["id"] for e in edges)` inside a generator that iterates through nodes. This creates an O(N×E) algorithm: for each node, it checks ALL edges to verify that none has that node as a source.
The optimized version precomputes `edge_sources = {e["source"] for e in edges}` as a set once (O(E) time), then performs constant-time O(1) set membership checks with `n["id"] not in edge_sources` for each node (O(N) time total). This reduces overall complexity from O(N×E) to O(N+E).
Why this matters:
Impact analysis:
This function finds "terminal" or "sink" nodes in a directed graph (nodes with no outgoing edges). If this is called frequently in graph processing pipelines, especially with larger graphs, this optimization provides substantial performance gains. The improvement is most pronounced when the number of edges is large relative to nodes, which is common in real-world graph applications.
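For reference, the two approaches described above can be sketched side by side. This is a reconstruction from the description, not the code in `src/algorithms/graph.py`: the `next(..., None)` wrapper, the `_original` and `_optimized` function names, and the sample `nodes` and `edges` data are assumptions made for illustration.

```python
def find_last_node_original(nodes, edges):
    # Hypothetical reconstruction of the O(N×E) version: for every node,
    # scan all edges to confirm that none uses the node as its source.
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_optimized(nodes, edges):
    # Hypothetical reconstruction of the O(N+E) version: collect every
    # source id once, then do one set membership check per node.
    edge_sources = {e["source"] for e in edges}
    return next((n for n in nodes if n["id"] not in edge_sources), None)


# Assumed sample data: a simple chain 1 -> 2 -> 3, so node 3 is the sink.
nodes = [{"id": 1}, {"id": 2}, {"id": 3}]
edges = [{"source": 1, "target": 2}, {"source": 2, "target": 3}]

original_result = find_last_node_original(nodes, edges)
optimized_result = find_last_node_optimized(nodes, edges)
```

Both sketches return the first node in `nodes` that never appears as an edge source, or `None` if every node has an outgoing edge. One difference worth noting: the set-based version needs the `source` values to be hashable, whereas the equality-based scan does not.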
✅ Correctness verification report:
To edit these changes, run `git checkout codeflash/optimize-find_last_node-mjicte92` and push.