⚡️ Speed up function speedup_critic by 15% in PR #555 (refinement) #557

Open · wants to merge 1 commit into base branch `refinement`

Conversation

codeflash-ai[bot] (Contributor) commented on Jul 17, 2025

⚡️ This pull request contains optimizations for PR #555

If you approve this dependent PR, these changes will be merged into the original PR branch refinement.

This PR will be automatically closed if the original PR is merged.


📄 15% (0.15x) speedup for speedup_critic in codeflash/result/critic.py

⏱️ Runtime: 1.84 milliseconds → 1.60 milliseconds (best of 56 runs)

📝 Explanation and details

Here’s an optimized version that preserves all existing function signatures, logic, and return values but reduces unnecessary overhead, short-circuits early, and eliminates redundant object lookups and function calls.

Key Optimizations:

  • Bind get_cached_gh_event_data to a local variable early in get_pr_number to avoid repeated imports and global lookups.
  • Import get_cached_gh_event_data once at module top level; importing it inside the function on every call is much slower.
  • Use early returns in speedup_critic after fast checks to avoid unnecessary branches and function calls.
  • Remove unneeded bool() wrappers where the result is already bool.
  • Use direct access to already-imported functions instead of accessing via module (inlining env_utils.get_pr_number).
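
Taken together, those bullets suggest a function shaped roughly like the sketch below. This is a reconstruction from the description and the regression tests, not the actual codeflash source; the constant, the noise-floor rules, and the `get_pr_number` stand-in are all assumptions:

```python
from __future__ import annotations

# Hypothetical reconstruction of speedup_critic; names and thresholds are
# assumptions inferred from the PR description, not the real implementation.
MIN_IMPROVEMENT_THRESHOLD = 0.01  # assumed 1% floor


def get_pr_number() -> int | None:
    return None  # stand-in for env_utils.get_pr_number (None = not in GH Actions)


def speedup_critic_sketch(candidate_runtime: int, original_runtime: int,
                          best_runtime_until_now: int | None,
                          disable_gh_action_noise: bool = False) -> bool:
    # Short runtimes are dominated by timing jitter, so use a larger floor.
    noise_floor = (3 * MIN_IMPROVEMENT_THRESHOLD
                   if original_runtime < 10_000 else MIN_IMPROVEMENT_THRESHOLD)
    # GH Actions runners are noisier still: double the floor unless disabled.
    if not disable_gh_action_noise and get_pr_number() is not None:
        noise_floor *= 2
    perf_gain = (original_runtime - candidate_runtime) / candidate_runtime
    if perf_gain <= noise_floor:
        return False  # early return: improvement is within noise
    if best_runtime_until_now is not None:
        return candidate_runtime < best_runtime_until_now
    return True
```

The early `return False` after the noise-floor check is the fast-path short-circuit the bullets describe.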

Summary:
All function return values and signatures are preserved. Redundant lookups are eliminated, external calls are reduced, and fast-path branches short-circuit unnecessary logic to reduce overall runtime and memory allocations. Comments are preserved unless the associated code was optimized.

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 6 Passed |
| 🌀 Generated Regression Tests | 4018 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|--------------------------|-------------|--------------|---------|
| test_critic.py::test_speedup_critic | 3.69μs | 3.48μs | ✅ 6.07% |
🌀 Generated Regression Tests and Runtime
from __future__ import annotations

# imports
import pytest  # used for our unit tests

from codeflash.result.critic import speedup_critic

# Dummy MIN_IMPROVEMENT_THRESHOLD for testing
MIN_IMPROVEMENT_THRESHOLD = 0.01

# Dummy env_utils for testing
class DummyEnvUtils:
    _pr_number = None

    @classmethod
    def set_pr_number(cls, value):
        cls._pr_number = value

    @classmethod
    def get_pr_number(cls):
        return cls._pr_number

# Stand-in for the env_utils module used by these tests. Note: speedup_critic
# reads the real env_utils, so set_pr_number here only takes effect if this
# dummy is patched into that module.
env_utils = DummyEnvUtils

# Dummy OptimizedCandidateResult for testing
class OptimizedCandidateResult:
    def __init__(self, best_test_runtime: int):
        self.best_test_runtime = best_test_runtime


def test_basic_negative_improvement():
    """
    Test case where the optimization is actually slower.
    """
    orig = 100_000
    opt = 120_000
    candidate = OptimizedCandidateResult(opt)
    codeflash_output = speedup_critic(candidate, orig, None) # 1.88μs -> 1.61μs (16.8% faster)
    assert not codeflash_output  # a slower candidate must be rejected

def test_best_runtime_until_now_none_vs_value():
    """
    Test that best_runtime_until_now is handled correctly.
    """
    orig = 100_000
    opt = 80_000
    candidate = OptimizedCandidateResult(opt)
    # best_runtime_until_now is None, so only perf_gain > noise_floor matters
    codeflash_output = speedup_critic(candidate, orig, None) # 1.81μs -> 1.62μs (11.8% faster)
    assert codeflash_output
    # best_runtime_until_now is worse (higher), so should still be True
    codeflash_output = speedup_critic(candidate, orig, 90_000) # 942ns -> 832ns (13.2% faster)
    assert codeflash_output
    # best_runtime_until_now is better (lower), so should be False
    codeflash_output = speedup_critic(candidate, orig, 70_000) # 601ns -> 460ns (30.7% faster)
    assert not codeflash_output

def test_disable_gh_action_noise_flag():
    """
    Test that disabling the GH Action noise disables the noise floor doubling.
    """
    orig = 5_000  # < 10_000, so noise_floor = 3*MIN_IMPROVEMENT_THRESHOLD
    opt = 4_950  # Less than 1% improvement
    candidate = OptimizedCandidateResult(opt)
    env_utils.set_pr_number(123)  # Simulate GH Actions
    # With disable_gh_action_noise, should use lower noise floor
    codeflash_output = speedup_critic(candidate, orig, None, disable_gh_action_noise=True) # 1.79μs -> 1.80μs (0.555% slower)
    # Without disabling, noise floor is doubled, so improvement is not enough
    codeflash_output = speedup_critic(candidate, orig, None, disable_gh_action_noise=False) # 1.44μs -> 1.22μs (18.0% faster)

# ------------------------
# Edge Test Cases
# ------------------------



def test_edge_both_runtimes_zero():
    """
    Test edge case where both runtimes are zero.
    """
    orig = 0
    opt = 0
    candidate = OptimizedCandidateResult(opt)
    codeflash_output = speedup_critic(candidate, orig, None) # 1.67μs -> 1.51μs (10.6% faster)

def test_edge_runtime_just_below_noise_floor():
    """
    Test where improvement is just below the noise floor.
    """
    orig = 10_000
    # MIN_IMPROVEMENT_THRESHOLD = 0.01, so need perf_gain <= 0.01 to fail
    opt = int(orig / (1 + MIN_IMPROVEMENT_THRESHOLD)) + 1  # slightly worse than threshold
    candidate = OptimizedCandidateResult(opt)
    codeflash_output = speedup_critic(candidate, orig, None) # 1.64μs -> 1.45μs (13.1% faster)
    assert not codeflash_output  # improvement is below the noise floor

def test_edge_runtime_just_above_noise_floor():
    """
    Test where improvement is just above the noise floor.
    """
    orig = 10_000
    # MIN_IMPROVEMENT_THRESHOLD = 0.01, so need perf_gain > 0.01 to pass
    opt = int(orig / (1 + MIN_IMPROVEMENT_THRESHOLD))  # just at threshold
    candidate = OptimizedCandidateResult(opt)
    codeflash_output = speedup_critic(candidate, orig, None) # 1.64μs -> 1.30μs (26.1% faster)
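
The integer boundary arithmetic in these two threshold tests can be checked directly; this assumes perf_gain = (original - optimized) / optimized, which is what the comments imply:

```python
# Verifying the "just at" / "just below" threshold boundaries used above.
orig = 10_000
threshold = 0.01  # MIN_IMPROVEMENT_THRESHOLD

opt_at = int(orig / (1 + threshold))  # 9900 -> gain = 100/9900 ≈ 0.0101
opt_below = opt_at + 1                # 9901 -> gain =  99/9901 ≈ 0.0100

gain_at = (orig - opt_at) / opt_at
gain_below = (orig - opt_below) / opt_below
print(gain_at > threshold, gain_below > threshold)  # True False
```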

def test_edge_small_original_runtime_noise_floor():
    """
    Test noise floor logic when original runtime is below 10_000.
    """
    orig = 9_000  # < 10_000
    # noise_floor = 3*MIN_IMPROVEMENT_THRESHOLD = 0.03
    opt = int(orig / (1 + 0.03))  # just at threshold
    candidate = OptimizedCandidateResult(opt)
    codeflash_output = speedup_critic(candidate, orig, None) # 1.80μs -> 1.64μs (9.80% faster)

def test_edge_github_actions_noise_floor():
    """
    Test that noise floor is doubled in GH Actions mode.
    """
    orig = 9_000
    # noise_floor = 3*MIN_IMPROVEMENT_THRESHOLD = 0.03, doubled = 0.06
    env_utils.set_pr_number(42)
    opt = int(orig / (1 + 0.06))  # just at threshold
    candidate = OptimizedCandidateResult(opt)
    codeflash_output = speedup_critic(candidate, orig, None) # 1.84μs -> 1.56μs (18.0% faster)
    # Now, make improvement just below doubled noise floor
    opt = int(orig / (1 + 0.059)) + 1
    candidate = OptimizedCandidateResult(opt)
    codeflash_output = speedup_critic(candidate, orig, None) # 872ns -> 822ns (6.08% faster)
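
Working through the doubled-floor arithmetic in this GH Actions test (same assumed perf_gain formula; 3 × 1% floor, doubled to 6%):

```python
# Checking the "just above" / "just below" boundaries around the doubled floor.
orig = 9_000
doubled_floor = 0.06  # 3 * MIN_IMPROVEMENT_THRESHOLD, doubled in GH Actions mode

opt_at = int(orig / (1 + 0.06))          # 8490 -> gain = 510/8490 ≈ 0.0601
opt_below = int(orig / (1 + 0.059)) + 1  # 8499 -> gain = 501/8499 ≈ 0.0589

gain_at = (orig - opt_at) / opt_at
gain_below = (orig - opt_below) / opt_below
```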

def test_edge_best_runtime_equal_to_candidate():
    """
    Test when best_runtime_until_now equals candidate's best_test_runtime.
    """
    orig = 100_000
    opt = 80_000
    candidate = OptimizedCandidateResult(opt)
    # perf_gain is above threshold, but candidate is not better than best_runtime_until_now
    codeflash_output = speedup_critic(candidate, orig, 80_000) # 1.88μs -> 1.69μs (11.2% faster)
    assert not codeflash_output  # equal to the best runtime so far, so not an improvement

# ------------------------
# Large Scale Test Cases
# ------------------------


def test_large_scale_runtime_near_threshold():
    """
    Test with many candidates whose improvements are near the threshold.
    """
    orig = 1_000_000
    threshold = MIN_IMPROVEMENT_THRESHOLD
    # Candidate runtimes just at and just below the threshold
    opt_just_above = int(orig / (1 + threshold))
    opt_just_below = int(orig / (1 + threshold)) + 2
    candidates = [OptimizedCandidateResult(opt_just_above) for _ in range(500)]
    candidates += [OptimizedCandidateResult(opt_just_below) for _ in range(500)]
    for candidate in candidates:
        codeflash_output = speedup_critic(candidate, orig, None)

def test_large_scale_github_actions_mode():
    """
    Test large scale with GH Actions noise floor enabled.
    """
    env_utils.set_pr_number(100)
    orig = 1_000_000
    # noise_floor = 0.01 * 2 = 0.02
    opt_just_above = int(orig / (1 + 0.021))
    opt_just_below = int(orig / (1 + 0.019)) + 1
    candidates = [OptimizedCandidateResult(opt_just_above) for _ in range(500)]
    candidates += [OptimizedCandidateResult(opt_just_below) for _ in range(500)]
    for candidate in candidates:
        codeflash_output = speedup_critic(candidate, orig, None)



from __future__ import annotations

# imports
import pytest
from codeflash.result.critic import speedup_critic
# NOTE: module path assumed from the optimization notes above
from codeflash.code_utils import env_utils

# --- Begin stubs and constants for isolated testing ---
# Minimal local stand-ins, so the threshold arithmetic in the tests below is explicit

# Simulate the MIN_IMPROVEMENT_THRESHOLD constant
MIN_IMPROVEMENT_THRESHOLD = 0.01  # 1% improvement threshold

# Simulate OptimizedCandidateResult dataclass
class OptimizedCandidateResult:
    def __init__(self, best_test_runtime: int):
        self.best_test_runtime = best_test_runtime

# unit tests

# --- Basic Test Cases ---

def test_edge_disable_gh_action_noise_small_runtime(monkeypatch):
    """Test: disable_gh_action_noise disables doubling even for small runtimes."""
    monkeypatch.setattr(env_utils, "get_pr_number", lambda: 123)
    orig = 9_000
    opt = int(orig * 0.96)  # 4% improvement, noise floor is 3% (not doubled)
    candidate = OptimizedCandidateResult(best_test_runtime=opt)
    codeflash_output = speedup_critic(candidate, orig, None, disable_gh_action_noise=True)
    assert codeflash_output  # 4% gain clears the 3% floor once GH Actions doubling is disabled

# --- Large Scale Test Cases ---

@pytest.mark.parametrize("orig,opt,prev_best,expected", [
    # representative scenarios around the threshold
    (100_000, 90_000, None, True),
    (100_000, 89_000, 90_000, True),
    (100_000, 89_000, 89_500, True),
    (100_000, 95_000, 94_000, False),  # Not better than prev_best
    (100_000, 99_500, None, False),    # Below threshold
])
def test_large_scale_varied(monkeypatch, orig, opt, prev_best, expected):
    """Test: Large scale, multiple scenarios in one parametrized test."""
    monkeypatch.setattr(env_utils, "get_pr_number", lambda: None)
    candidate = OptimizedCandidateResult(best_test_runtime=opt)
    codeflash_output = speedup_critic(candidate, orig, prev_best)
    assert bool(codeflash_output) == expected

def test_large_scale_many_candidates(monkeypatch):
    """Test: Simulate a batch of 500 candidates, all with small random improvements."""
    import random
    monkeypatch.setattr(env_utils, "get_pr_number", lambda: None)
    orig = 100_000
    # Generate 500 candidates with improvement between 0.5% and 2%
    count_pass = 0
    for i in range(500):
        improvement = random.uniform(0.005, 0.02)
        opt = int(orig * (1 - improvement))
        candidate = OptimizedCandidateResult(best_test_runtime=opt)
        codeflash_output = speedup_critic(candidate, orig, None); result = codeflash_output
        # Improvements clearly above MIN_IMPROVEMENT_THRESHOLD (1%) should pass
        if result:
            count_pass += 1
    assert 0 <= count_pass <= 500

To edit these changes, run `git checkout codeflash/optimize-pr555-2025-07-17T21.19.50` and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Jul 17, 2025
@codeflash-ai codeflash-ai bot mentioned this pull request Jul 17, 2025