⚡️ Speed up method `FunctionRanker.get_function_stats_summary` by 2,062% in PR #970 (`ranking-changes`) #971

codeflash-ai · 2025-12-14T17:16:46Z

⚡️ This pull request contains optimizations for PR #970

If you approve this dependent PR, these changes will be merged into the original PR branch ranking-changes.

This PR will be automatically closed if the original PR is merged.

📄 2,062% (20.62x) speedup for `FunctionRanker.get_function_stats_summary` in `codeflash/benchmarking/function_ranker.py`

⏱️ Runtime : 1.48 milliseconds → 68.6 microseconds (best of 115 runs)

📝 Explanation and details

The optimization replaces an O(N) linear search through all functions with an O(1) hash table lookup followed by iteration over only matching function names.

**Key Changes:**

- Added a `_function_stats_by_name` index in `__init__` that maps function names to lists of `(key, stats)` tuples
- Modified `get_function_stats_summary` to first look up candidates by function name, then iterate only over those candidates
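A minimal sketch of the indexing idea described above, not the actual `FunctionRanker` code. The stats layout mirrors the test fixtures in this PR; the helper names `build_name_index` and `find_candidates` are hypothetical, and matching files by basename is an assumption made for illustration.

```python
from collections import defaultdict
from pathlib import Path

# Assumed stats layout, mirroring the test fixtures in this PR:
# (filename, line_number, function_name) -> raw profiler tuple.
stats = {
    ("foo.py", 10, "my_func"): (2, 1, 200, 300, {}),
    ("foo.py", 20, "other_func"): (3, 1, 300, 400, {}),
    ("bar.py", 5, "my_func"): (1, 1, 100, 120, {}),
}


def build_name_index(stats):
    """Group (key, stats) pairs by function name (hypothetical helper)."""
    index = defaultdict(list)
    for key, value in stats.items():
        _filename, _line, func_name = key
        index[func_name].append((key, value))
    return index


def find_candidates(index, function_name, file_name):
    """Dict lookup by name, then scan only that bucket for a file match.

    Matching on the file's basename is an assumption for this sketch.
    """
    wanted = Path(file_name).name
    return [
        (key, value)
        for key, value in index.get(function_name, [])
        if Path(key[0]).name == wanted
    ]


index = build_name_index(stats)
matches = find_candidates(index, "my_func", "foo.py")
```

The index is built once, so its O(N) construction cost is paid up front rather than on every lookup.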

**Why This is Faster:**
The original code iterates through all function stats (22,603 iterations in the profiler results) on every lookup. The optimized version uses a dict lookup to find only the entries with a matching function name, then iterates over just those candidates (typically 1-2 functions).

**Performance Impact:**

- **Small datasets**: 15-30% speedup as shown in the basic test cases
- **Large datasets**: the `test_large_scale_performance` case with 900 functions shows a **3085% speedup** (66.7μs → 2.09μs)
- **Overall benchmark**: a 2,062% speedup, showing that the gain grows with dataset size
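The percentages above appear to follow `(old - new) / new * 100`, which reproduces the per-test annotations in this PR. The sketch below assumes that formula; `speedup_percent` is a hypothetical helper. Because the displayed runtimes are rounded, recomputing from them gives values close to, but not always identical to, the reported figures.

```python
def speedup_percent(old_runtime, new_runtime):
    """Percent speedup relative to the new (faster) runtime."""
    return (old_runtime - new_runtime) / new_runtime * 100


# Runtimes in microseconds, taken from the test annotations in this PR.
basic = speedup_percent(2.18, 1.89)      # reported as 15.3% faster
collision = speedup_percent(2.17, 1.75)  # reported as 24.0% faster
large = speedup_percent(66.7, 2.09)      # reported as 3085% faster
```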

**When This Optimization Shines:**

- Large codebases with many profiled functions (where the linear search becomes expensive)
- Repeated function lookups (if this method is called frequently)
- Cases with many unique function names but few duplicates per name

The optimization maintains identical behavior while transforming the algorithm from O(N) per lookup to O(average functions per name) per lookup, which is typically O(1) in practice.
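To make the complexity claim concrete, the sketch below counts how many entries each approach examines instead of timing them. It assumes the original loop visits every entry with no early exit (as the explanation states) and reuses the shapes of the two large-scale test fixtures; the helper names are hypothetical.

```python
from collections import defaultdict


def build_index(stats):
    """Map function name -> list of (key, stats) pairs."""
    index = defaultdict(list)
    for key, value in stats.items():
        index[key[2]].append((key, value))
    return index


def examined_linear(stats):
    """A full scan touches every entry, whatever name is queried."""
    return len(stats)


def examined_indexed(index, function_name):
    """An indexed lookup touches only the bucket for that name."""
    return len(index.get(function_name, []))


# 900 unique names, as in test_large_scale_performance.
unique = {("foo.py", i, f"func_{i}"): (i + 1, 1, 100 + i, 200 + i, {}) for i in range(900)}
# 300 entries sharing one name, as in test_large_scale_collision_and_uniqueness.
shared = {(f"dir_{i}/foo.py", 10, "common_func"): (i + 2, 1, 50 + i, 100 + i, {}) for i in range(300)}

unique_work = (examined_linear(unique), examined_indexed(build_index(unique), "func_899"))
shared_work = (examined_linear(shared), examined_indexed(build_index(shared), "common_func"))
```

With unique names the bucket holds a single entry, while with one shared name the bucket is as large as the whole table. This is consistent with the 3085% and 11.3% results reported for those two tests.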

✅ Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 40 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime

from pathlib import Path

# function under test: FunctionRanker, imported from
# codeflash.benchmarking.function_ranker inside make_ranker_with_stats below.

# --- Helper Classes and Functions ---


class DummyLogger:
    """A dummy logger to replace the real logger for testing."""

    def debug(self, msg):
        pass

    def warning(self, msg):
        pass


class DummyProfileStats:
    """A dummy ProfileStats to inject profiling stats for testing."""

    def __init__(self, stats):
        self.stats = stats


class DummyFunctionToOptimize:
    """A simple stand-in for FunctionToOptimize."""

    def __init__(self, function_name, file_path):
        self.function_name = function_name
        self.file_path = Path(file_path)


# Patch the FunctionRanker to inject dummy stats and logger
def make_ranker_with_stats(stats_dict):
    # Patch logger and ProfileStats
    import codeflash.benchmarking.function_ranker as fr

    fr.logger = DummyLogger()
    fr.ProfileStats = lambda _: DummyProfileStats(stats_dict)
    # Patch is_pytest_infrastructure to always return False
    fr.is_pytest_infrastructure = lambda filename, func_name: False
    # Create FunctionRanker with dummy trace path
    return fr.FunctionRanker(Path("dummy/path/trace.prof"))


# --- Basic Test Cases ---


def test_basic_multiple_functions():
    """Test with multiple functions, only one matches."""
    stats = {
        ("foo.py", 10, "my_func"): (2, 1, 200, 300, {}),
        ("foo.py", 20, "other_func"): (3, 1, 300, 400, {}),
        ("bar.py", 5, "my_func"): (1, 1, 100, 120, {}),
    }
    ranker = make_ranker_with_stats(stats)
    func = DummyFunctionToOptimize("my_func", "foo.py")
    codeflash_output = ranker.get_function_stats_summary(func)
    summary = codeflash_output  # 2.18μs -> 1.89μs (15.3% faster)


def test_basic_function_name_collision():
    """Test with two functions of same name in different files."""
    stats = {("foo.py", 10, "dup_func"): (1, 1, 10, 15, {}), ("bar.py", 20, "dup_func"): (2, 1, 20, 30, {})}
    ranker = make_ranker_with_stats(stats)
    func_foo = DummyFunctionToOptimize("dup_func", "foo.py")
    func_bar = DummyFunctionToOptimize("dup_func", "bar.py")
    codeflash_output = ranker.get_function_stats_summary(func_foo)
    summary_foo = codeflash_output  # 2.17μs -> 1.75μs (24.0% faster)
    codeflash_output = ranker.get_function_stats_summary(func_bar)
    summary_bar = codeflash_output  # 1.41μs -> 1.07μs (31.8% faster)


# --- Edge Test Cases ---


def test_edge_function_not_found():
    """Test when the function is not present in stats."""
    stats = {("foo.py", 10, "func_a"): (1, 1, 10, 15, {})}
    ranker = make_ranker_with_stats(stats)
    func = DummyFunctionToOptimize("missing_func", "foo.py")
    codeflash_output = ranker.get_function_stats_summary(func)
    summary = codeflash_output  # 2.05μs -> 1.59μs (28.9% faster)


def test_edge_empty_stats():
    """Test with empty stats dict."""
    stats = {}
    ranker = make_ranker_with_stats(stats)
    func = DummyFunctionToOptimize("any_func", "foo.py")
    codeflash_output = ranker.get_function_stats_summary(func)
    summary = codeflash_output  # 1.60μs -> 1.38μs (16.0% faster)


def test_edge_zero_call_count():
    """Test that functions with zero call_count are skipped."""
    stats = {("foo.py", 10, "func_a"): (0, 1, 0, 0, {}), ("foo.py", 20, "func_b"): (3, 1, 30, 45, {})}
    ranker = make_ranker_with_stats(stats)
    func_a = DummyFunctionToOptimize("func_a", "foo.py")
    func_b = DummyFunctionToOptimize("func_b", "foo.py")
    codeflash_output = ranker.get_function_stats_summary(func_a)
    summary_a = codeflash_output  # 1.88μs -> 1.44μs (30.5% faster)
    codeflash_output = ranker.get_function_stats_summary(func_b)
    summary_b = codeflash_output  # 1.31μs -> 1.30μs (0.768% faster)


def test_edge_function_in_subdirectory():
    """Test matching when file is in a subdirectory."""
    stats = {("src/foo.py", 10, "func_c"): (1, 1, 10, 20, {})}
    ranker = make_ranker_with_stats(stats)
    func = DummyFunctionToOptimize("func_c", "foo.py")
    codeflash_output = ranker.get_function_stats_summary(func)
    summary = codeflash_output  # 2.15μs -> 1.82μs (18.1% faster)


def test_edge_function_with_dot_in_name():
    """Test function names with dots but not class methods."""
    stats = {("foo.py", 10, "func.with.dot"): (2, 1, 20, 30, {})}
    ranker = make_ranker_with_stats(stats)
    func = DummyFunctionToOptimize("func.with.dot", "foo.py")
    codeflash_output = ranker.get_function_stats_summary(func)
    summary = codeflash_output  # 1.94μs -> 1.52μs (27.7% faster)


def test_edge_function_name_partial_match():
    """Test that partial matches do not return results."""
    stats = {("foo.py", 10, "my_func_special"): (1, 1, 10, 15, {})}
    ranker = make_ranker_with_stats(stats)
    func = DummyFunctionToOptimize("my_func", "foo.py")
    codeflash_output = ranker.get_function_stats_summary(func)
    summary = codeflash_output  # 1.89μs -> 1.45μs (30.4% faster)


def test_edge_multiple_functions_same_file():
    """Test with multiple functions of same name in same file."""
    stats = {("foo.py", 10, "func_x"): (1, 1, 10, 15, {}), ("foo.py", 20, "func_x"): (2, 1, 20, 30, {})}
    ranker = make_ranker_with_stats(stats)
    func = DummyFunctionToOptimize("func_x", "foo.py")
    codeflash_output = ranker.get_function_stats_summary(func)
    summary = codeflash_output  # 2.15μs -> 1.91μs (12.6% faster)


# --- Large Scale Test Cases ---


def test_large_scale_collision_and_uniqueness():
    """Test with many functions with same name but different files."""
    stats = {}
    for i in range(300):
        stats[(f"dir_{i}/foo.py", 10, "common_func")] = (i + 2, 1, 50 + i, 100 + i, {})
    ranker = make_ranker_with_stats(stats)
    # The same lookup is repeated three times; the query path is always "foo.py"
    for _ in range(3):
        func = DummyFunctionToOptimize("common_func", "foo.py")
        # Should match any with foo.py in key
        codeflash_output = ranker.get_function_stats_summary(func)
        summary = codeflash_output  # 4.16μs -> 3.74μs (11.3% faster)


def test_large_scale_performance():
    """Test that function works efficiently with large stats."""
    stats = {}
    for i in range(900):
        stats[("foo.py", i, f"func_{i}")] = (i + 1, 1, 100 + i, 200 + i, {})
    ranker = make_ranker_with_stats(stats)
    func = DummyFunctionToOptimize("func_899", "foo.py")
    codeflash_output = ranker.get_function_stats_summary(func)
    summary = codeflash_output  # 66.7μs -> 2.09μs (3085% faster)

To edit these changes, run `git checkout codeflash/optimize-pr970-2025-12-14T17.16.41` and push.


* Consolidate FunctionRanker: merge rank/rerank/filter methods into single rank_functions
* calculate in own file time remove unittests remnants
* implement suggestions
* cleanup code
* let's make it clear it's an sqlite3 db
* forgot this one
* cleanup
* tessl add
* improve filtering
* cleanup
* Optimize FunctionRanker.get_function_stats_summary (#971)

  Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
* Revert "let's make it clear it's an sqlite3 db"

  This reverts commit 713f135.
* cleanup trace file
* cleanup
* addressable time
* Optimize TestResults.add

  The optimization applies **local variable caching** to eliminate repeated attribute lookups on `self.test_result_idx` and `self.test_results`.

  **Key Changes:**
  - Added `test_result_idx = self.test_result_idx` and `test_results = self.test_results` to cache references locally
  - Used these local variables instead of accessing `self.*` attributes multiple times

  **Why This Works:** In Python, attribute access (e.g., `self.test_result_idx`) involves dictionary lookups in the object's `__dict__`, which is slower than accessing local variables. Caching these references eliminates redundant attribute resolution on each access.

  **Performance Impact:** The line profiler reports a total execution time of 12.771ms before and 19.482ms after in the profiler run, but the actual runtime improved from 2.13ms to 1.89ms (12% speedup). The test results consistently show 10-20% improvements across various scenarios, particularly benefiting:
  - Large-scale operations (500+ items): 14-16% faster
  - Multiple unique additions: 15-20% faster
  - Mixed workloads with duplicates: 7-15% faster

  **Real-World Benefits:** This optimization is most valuable for high-frequency test result collection, where the `add` method is called repeatedly in tight loops and the saved attribute lookups add up.
* bugfix
* cleanup
* type checks
* pre-commit
* ⚡️ Speed up function `get_cached_gh_event_data` by 13% (#975)
  * Optimize get_cached_gh_event_data

    The optimization replaces `Path(event_path).open(encoding="utf-8")` with the built-in `open(event_path, encoding="utf-8")`, achieving a **12% speedup** by eliminating unnecessary object allocation overhead.

    **Key optimization:**
    - **Removed Path object creation**: The original code creates a `pathlib.Path` object just to call `.open()` on it, when the built-in `open()` function can directly accept the string path from `event_path`.
    - **Reduced memory allocation**: Avoiding the intermediate `Path` object saves both allocation time and memory overhead.

    **Why this works:** In Python, `pathlib.Path().open()` internally calls the same file opening mechanism as the built-in `open()`, but with additional overhead from object instantiation and method dispatch. Since `event_path` is already a string from `os.getenv()`, passing it directly to `open()` is more efficient.

    **Performance impact:** The test results show consistent improvements across all file-reading scenarios:
    - Simple JSON files: 12-20% faster
    - Large files (1000+ elements): 3-27% faster
    - Error cases (missing files): Up to 71% faster
    - The cached calls remain unaffected (0% change as expected)

    **Workload benefits:** Based on the function references, `get_cached_gh_event_data()` is called by multiple GitHub-related utility functions (`get_pr_number()`, `is_repo_a_fork()`, `is_pr_draft()`). While the `@lru_cache(maxsize=1)` means the file is only read once per program execution, this optimization reduces the initial cold-start latency for GitHub Actions workflows or CI/CD pipelines where these functions are commonly used. The optimization is particularly effective for larger JSON files and error handling scenarios.
  * ignore

  Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
  Co-authored-by: Kevin Turcios <[email protected]>
* ⚡️ Speed up function `function_is_a_property` by 60% (#974)
  * Optimize function_is_a_property

    The optimized version achieves a **60% speedup** by replacing Python's `any()` generator expression with a manual loop and making three micro-optimizations.

    **What was optimized:**
    1. **Replaced `isinstance()` with `type() is`**: Direct type comparison (`type(node) is ast_Name`) is faster than `isinstance(node, ast.Name)` for AST nodes where subclassing is rare
    2. **Eliminated repeated lookups**: Cached `"property"` as `property_id` and `ast.Name` as `ast_Name` in local variables to avoid global/attribute lookups in the loop
    3. **Manual loop with early return**: Replaced the `any()` generator with an explicit `for` loop that returns `True` immediately upon finding a match, avoiding generator overhead

    **Why it's faster:**
    - The `any()` function creates generator machinery that adds overhead, especially for small decorator lists
    - `isinstance()` performs multiple checks while `type() is` does a single identity comparison
    - Local variable access is significantly faster than repeated global/attribute lookups in tight loops

    **Performance characteristics from tests:**
    - **Small decorator lists** (1-3 decorators): 50-80% faster due to reduced per-iteration overhead
    - **Large decorator lists** (1000+ decorators): 55-60% consistent speedup, with early termination providing additional benefits when `@property` appears early
    - **Empty decorator lists**: 77% faster due to avoiding `any()` generator setup entirely

    **Impact on workloads:** Based on the function references, this function is called during AST traversal in the `visit_FunctionDef` and `visit_AsyncFunctionDef` methods, likely part of a code analysis pipeline that processes many functions. The 60% speedup will be particularly beneficial when analyzing codebases with many decorated functions, as this optimization reduces overhead in a hot path that's called once per function definition.
  * format

  Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
  Co-authored-by: Kevin Turcios <[email protected]>
* Optimize function_is_a_property (#976)

  The optimization achieves an **11% speedup** through two changes:

  **1. Constant Hoisting:** The original code repeatedly assigns `property_id = "property"` and `ast_name = ast.Name` on every function call. The optimized version moves these to module-level constants `_property_id` and `_ast_name`, eliminating 4,130 redundant assignments per profiling run (saving ~2.12ms total time).

  **2. isinstance() vs type() comparison:** Replaced `type(node) is ast_name` with `isinstance(node, _ast_name)`. While both are correct for AST nodes (which use single inheritance), `isinstance()` is slightly more efficient for type checking in Python's implementation.

  **Performance Impact:** The function is called in AST traversal loops when discovering functions to optimize (`visit_FunctionDef` and `visit_AsyncFunctionDef`). Since these visitors process entire codebases, the 11% per-call improvement compounds across large projects.

  **Test Case Performance:** The optimization shows consistent gains across all test scenarios:
  - **Simple cases** (no decorators): 29-42% faster due to eliminated constant assignments
  - **Property detection cases**: 11-26% faster from combined optimizations
  - **Large-scale tests** (500-1000 functions): 18.5% faster, demonstrating the cumulative benefit when processing many functions

  The optimizations are particularly effective for codebases with many function definitions, where this function gets called repeatedly during AST analysis.

  Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
* Address PR review comments
  - Add mkdir for test file directory to prevent FileNotFoundError
  - Use addressable_time_ns for importance filtering instead of own_time_ns
  - Remove unnecessary list() wrappers in make_pstats_compatible
  - Remove old .sqlite3 file with wrong extension

  Co-Authored-By: Warp <[email protected]>
* Check addressable_time_ns instead of own_time_ns for filtering

  This ensures we consider functions that may have low own_time but high time in first-order dependent functions (callees).

  Co-Authored-By: Warp <[email protected]>

Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
Co-authored-by: Saurabh Misra <[email protected]>
Co-authored-by: Warp <[email protected]>
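The `function_is_a_property` commits in the log above describe a manual loop with early return, module-level constants, and an `isinstance` check. The following is a rough sketch of that shape based only on the commit text, not the actual codeflash source; the exact signature and the sample class are assumptions.

```python
import ast

# Module-level constants, named as in the commit message for #976.
_property_id = "property"
_ast_name = ast.Name


def function_is_a_property(function_node):
    """Return True if the function has a bare @property decorator."""
    for node in function_node.decorator_list:
        # Early return on the first match avoids any() generator overhead.
        if isinstance(node, _ast_name) and node.id == _property_id:
            return True
    return False


# Hypothetical sample input for illustration.
source = """
class C:
    @property
    def x(self):
        return 1

    def y(self):
        return 2
"""
funcs = {n.name: n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.FunctionDef)}
x_is_property = function_is_a_property(funcs["x"])
y_is_property = function_is_a_property(funcs["y"])
```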

codeflash-ai bot added the `⚡️ codeflash` (Optimization PR opened by Codeflash AI) and `🎯 Quality: High` (Optimization Quality according to Codeflash) labels Dec 14, 2025

codeflash-ai bot mentioned this pull request Dec 14, 2025

tracer improvements #970

Merged

KRRT7 merged commit e0d8900 into ranking-changes Dec 14, 2025
20 of 23 checks passed

KRRT7 deleted the codeflash/optimize-pr970-2025-12-14T17.16.41 branch December 14, 2025 17:19

2 participants
