From 025cfec7efd6e476f1a82947eddf91234d6c6cb9 Mon Sep 17 00:00:00 2001 From: "codeflash-ai[bot]" <148906541+codeflash-ai[bot]@users.noreply.github.com> Date: Tue, 23 Dec 2025 08:59:55 +0000 Subject: [PATCH] Optimize find_last_node MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The optimized code achieves a **162x speedup** by eliminating a nested loop algorithm complexity issue. **Key optimization:** The original code uses `all(e["source"] != n["id"] for e in edges)` inside a generator that iterates through nodes. This creates an O(N×E) algorithm where for each node, it checks against ALL edges to verify none have that node as a source. The optimized version precomputes `edge_sources = {e["source"] for e in edges}` as a set once (O(E) time), then performs constant-time O(1) set membership checks with `n["id"] not in edge_sources` for each node (O(N) time total). This reduces overall complexity from O(N×E) to O(N+E). **Why this matters:** - **Set lookup is O(1) vs list iteration O(E)**: Hash-based set membership is dramatically faster than scanning through all edges repeatedly - **Single pass over edges**: The set is built once upfront rather than scanning edges for every node candidate - **Scales exceptionally well**: Test results show the optimization shines with larger graphs: - Small graphs (2-3 nodes): 60-80% faster - Large linear chain (1000 nodes): **32,000%+ faster** (18.3ms → 56.9μs) - Large fully connected graph (100 nodes, 9900 edges): **8,600%+ faster** (17.1ms → 196μs) **Impact analysis:** This function finds "terminal" or "sink" nodes in a directed graph (nodes with no outgoing edges). If this is called frequently in graph processing pipelines, especially with larger graphs, this optimization provides substantial performance gains. The improvement is most pronounced when the number of edges is large relative to nodes, which is common in real-world graph applications. --- src/algorithms/graph.py | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/algorithms/graph.py b/src/algorithms/graph.py index 777ea3b..e686a34 100644 --- a/src/algorithms/graph.py +++ b/src/algorithms/graph.py @@ -47,7 +47,8 @@ def find_shortest_path(self, start: str, end: str) -> list[str]: def find_last_node(nodes, edges): """This function receives a flow and returns the last node.""" - return next((n for n in nodes if all(e["source"] != n["id"] for e in edges)), None) + edge_sources = {e["source"] for e in edges} + return next((n for n in nodes if n["id"] not in edge_sources), None) def find_leaf_nodes(nodes: list[dict], edges: list[dict]) -> list[dict]: