@codeflash-ai codeflash-ai bot commented Nov 12, 2025

📄 52% (0.52x) speedup for _use_gl in src/bokeh/embed/bundle.py

⏱️ Runtime : 988 microseconds → 650 microseconds (best of 50 runs)

📝 Explanation and details

The optimization achieves a 52% speedup by eliminating Python function call overhead and generator inefficiencies through three key changes:

1. Replaced any() + generator with explicit loop

  • The original any(query(x) for x in objs) creates a generator and delegates to Python's built-in any(), introducing overhead
  • The optimized version uses a direct for loop with early return (return True on first match), eliminating generator creation and function call indirection
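The difference described in point 1 can be sketched in isolation as follows. This is a minimal illustration rather than the actual Bokeh source: the function names `any_with_generator` and `any_with_loop` and the `is_even` predicate are hypothetical, chosen only to show the two shapes side by side.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def any_with_generator(objs: Iterable[T], query: Callable[[T], bool]) -> bool:
    # Builds a generator object and hands it to the built-in any().
    return any(query(x) for x in objs)


def any_with_loop(objs: Iterable[T], query: Callable[[T], bool]) -> bool:
    # Plain loop: no generator object is created; returns on the first match.
    for x in objs:
        if query(x):
            return True
    return False


def is_even(n: int) -> bool:
    # Hypothetical predicate used only for this illustration.
    return n % 2 == 0


values = {1, 2, 3, 5}
print(any_with_generator(values, is_even), any_with_loop(values, is_even))
```

Both forms stop at the first match, so they return the same result for any input; the loop version simply avoids creating and resuming a generator.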

2. Inlined the lambda function in _use_gl

  • Original code called _any() with a lambda that performed isinstance() and attribute access
  • Optimized version inlines this logic directly in the loop, removing lambda creation overhead and the extra function call to _any()

3. Pre-computed local variables

  • Cached Plot type and "webgl" string as local variables to avoid repeated lookups during iteration
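Points 2 and 3 together might look roughly like the sketch below. It assumes a stand-in `Plot` class and hypothetical function names (`use_gl_original`, `use_gl_optimized`); the real `_use_gl` works against Bokeh's own `Plot` model and may differ in detail.

```python
class Plot:
    """Hypothetical stand-in for Bokeh's Plot model (not the real class)."""

    def __init__(self, output_backend: str = "canvas") -> None:
        self.output_backend = output_backend


def _any(objs, query):
    # Helper in the shape the description attributes to the original code.
    return any(query(x) for x in objs)


def use_gl_original(objs) -> bool:
    # A lambda is created on every call and invoked once per object.
    return _any(objs, lambda obj: isinstance(obj, Plot) and obj.output_backend == "webgl")


def use_gl_optimized(objs) -> bool:
    # Local aliases turn repeated global/constant lookups into fast local loads.
    plot_type = Plot
    webgl = "webgl"
    for obj in objs:
        # The predicate is inlined, so there is no per-object function call.
        if isinstance(obj, plot_type) and obj.output_backend == webgl:
            return True
    return False


sample = {Plot("canvas"), Plot("webgl"), object()}
print(use_gl_original(sample), use_gl_optimized(sample))
```

Because `isinstance()` is evaluated before the attribute access in both versions, objects without an `output_backend` attribute are skipped safely in either form.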

Performance Impact Analysis:
Based on the function reference, _use_gl() is called from bundle_for_objs_and_resources(), which appears to be in the critical path for rendering Bokeh visualizations. The 52% speedup is particularly valuable for:

  • Large collections: Test results show 60-70% improvements for 500+ objects, indicating the optimization scales well
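  • Early termination scenarios: When a WebGL plot is found quickly, the explicit loop returns immediately; any() also short-circuits, so the gain here comes from skipping generator and lambda setup rather than from fewer iterations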
  • Mixed object types: The inlined isinstance() check avoids function call overhead for each object

The optimization maintains identical behavior while being especially effective for larger datasets where the function call and generator overhead becomes more pronounced.
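A micro-benchmark along the following lines is one way to check the scaling claim locally. It is a sketch under assumptions: `Plot` is a stand-in class, `use_gl_any` and `use_gl_loop` are simplified hypothetical versions of the two approaches, and `best_of` is a helper invented for this example. Absolute numbers will vary by machine and Python version, so no expected output is shown.

```python
import timeit


class Plot:
    """Hypothetical stand-in for Bokeh's Plot model."""

    def __init__(self, output_backend="canvas"):
        self.output_backend = output_backend


def use_gl_any(objs):
    # Generator expression consumed by the built-in any().
    return any(isinstance(o, Plot) and o.output_backend == "webgl" for o in objs)


def use_gl_loop(objs):
    # Explicit loop with early return.
    for o in objs:
        if isinstance(o, Plot) and o.output_backend == "webgl":
            return True
    return False


def best_of(fn, data, repeat=5, number=200):
    """Return the best observed per-call time in microseconds."""
    timings = timeit.repeat(lambda: fn(data), repeat=repeat, number=number)
    return min(timings) / number * 1e6


# Worst case for both versions: 500 objects and no match, so nothing short-circuits.
objs = {Plot("canvas") for _ in range(500)}
t_any = best_of(use_gl_any, objs)
t_loop = best_of(use_gl_loop, objs)
print(f"any(): {t_any:.1f} us per call, loop: {t_loop:.1f} us per call")
```

Taking the minimum over several repeats reduces the influence of scheduler noise, which matters when the measured calls take only tens of microseconds.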

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 26 Passed |
| 🌀 Generated Regression Tests | 50 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| unit/bokeh/embed/test_bundle.py::Test__use_gl.test_with_gl | 25.5μs | 19.0μs | 33.9%✅ |
| unit/bokeh/embed/test_bundle.py::Test__use_gl.test_without_gl | 100μs | 87.9μs | 14.0%✅ |

🌀 Generated Regression Tests and Runtime
from __future__ import annotations

from typing import Callable

# imports
import pytest  # used for our unit tests
from bokeh.embed.bundle import _use_gl


# Minimal stub for HasProps, as used in Bokeh
class HasProps:
    pass

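# Minimal stub for Plot, as used in Bokeh.
# NOTE: this is a local stand-in, not bokeh.models.Plot, so an isinstance()
# check against Bokeh's own Plot class will not match instances of it.
class Plot(HasProps):
    def __init__(self, output_backend=None):
        self.output_backend = output_backend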

# unit tests

# ---------------------------
# Basic Test Cases
# ---------------------------

def test_empty_set_returns_false():
    # Test with an empty set; should return False
    codeflash_output = _use_gl(set()) # 4.41μs -> 3.61μs (22.2% faster)

def test_set_with_non_plot_objects_returns_false():
    # Test with a set containing only non-Plot HasProps objects
    class Dummy(HasProps):
        pass
    objs = {Dummy(), Dummy()}
    codeflash_output = _use_gl(objs) # 5.02μs -> 3.72μs (35.2% faster)

def test_set_with_plot_not_webgl_returns_false():
    # Test with Plot objects whose output_backend is not "webgl"
    objs = {Plot(output_backend="canvas"), Plot(output_backend="svg")}
    codeflash_output = _use_gl(objs) # 4.44μs -> 3.52μs (26.2% faster)

def test_set_with_one_plot_webgl_returns_true():
    # Test with one Plot object whose output_backend is "webgl"
    objs = {Plot(output_backend="webgl")}
    codeflash_output = _use_gl(objs) # 4.14μs -> 3.15μs (31.5% faster)

def test_set_with_multiple_plot_one_webgl_returns_true():
    # Test with several Plot objects, one of which requests "webgl"
    objs = {Plot(output_backend="canvas"), Plot(output_backend="webgl"), Plot(output_backend="svg")}
    codeflash_output = _use_gl(objs) # 4.37μs -> 3.40μs (28.5% faster)

def test_set_with_plot_webgl_and_non_plot_returns_true():
    # Test with a mix of Plot (with webgl) and non-Plot HasProps objects
    class Dummy(HasProps):
        pass
    objs = {Plot(output_backend="webgl"), Dummy()}
    codeflash_output = _use_gl(objs) # 4.57μs -> 3.43μs (33.0% faster)

def test_set_with_plot_webgl_and_non_plot_and_other_plot_returns_true():
    # Test with a mix of Plot (with webgl), Plot (not webgl), and non-Plot
    class Dummy(HasProps):
        pass
    objs = {Plot(output_backend="canvas"), Plot(output_backend="webgl"), Dummy()}
    codeflash_output = _use_gl(objs) # 4.59μs -> 3.55μs (29.2% faster)

# ---------------------------
# Edge Test Cases
# ---------------------------

def test_plot_output_backend_none_returns_false():
    # Plot with output_backend=None should not trigger webgl
    objs = {Plot(output_backend=None)}
    codeflash_output = _use_gl(objs) # 4.13μs -> 3.04μs (36.0% faster)

def test_plot_output_backend_empty_string_returns_false():
    # Plot with output_backend="" should not trigger webgl
    objs = {Plot(output_backend="")}
    codeflash_output = _use_gl(objs) # 4.04μs -> 2.95μs (37.0% faster)

def test_plot_output_backend_case_sensitive():
    # Plot with output_backend="WebGL" (different case) should not trigger webgl
    objs = {Plot(output_backend="WebGL")}
    codeflash_output = _use_gl(objs) # 4.03μs -> 2.92μs (37.6% faster)

def test_plot_output_backend_extra_spaces():
    # Plot with output_backend=" webgl " (spaces) should not trigger webgl
    objs = {Plot(output_backend=" webgl ")}
    codeflash_output = _use_gl(objs) # 4.02μs -> 2.85μs (40.9% faster)

def test_set_with_duplicate_plot_webgl_objects():
    # Test with the same Plot object listed twice; the set keeps a single element
    plot = Plot(output_backend="webgl")
    objs = {plot, plot}
    codeflash_output = _use_gl(objs) # 3.88μs -> 2.94μs (32.1% faster)

def test_set_with_subclass_of_plot_webgl():
    # Test with a subclass of Plot whose output_backend is "webgl"
    class MyPlot(Plot):
        pass
    objs = {MyPlot(output_backend="webgl")}
    codeflash_output = _use_gl(objs) # 4.29μs -> 3.31μs (29.7% faster)

def test_set_with_plot_missing_output_backend_attribute():
    # Object with no output_backend attribute (BrokenPlot subclasses HasProps, not Plot) should not trigger webgl
    class BrokenPlot(HasProps):
        pass
    objs = {BrokenPlot()}
    # Should not raise, should return False
    codeflash_output = _use_gl(objs) # 4.25μs -> 3.24μs (31.2% faster)

def test_set_with_plot_output_backend_int():
    # Plot object with output_backend as integer should not trigger webgl
    objs = {Plot(output_backend=123)}
    codeflash_output = _use_gl(objs) # 3.95μs -> 3.04μs (30.2% faster)

def test_set_with_plot_output_backend_list():
    # Plot object with output_backend as list should not trigger webgl
    objs = {Plot(output_backend=["webgl"])}
    codeflash_output = _use_gl(objs) # 3.88μs -> 2.94μs (31.7% faster)

def test_set_with_plot_output_backend_webgl_in_middle():
    # Plot object with output_backend containing 'webgl' as substring should not trigger webgl
    objs = {Plot(output_backend="canvas_webgl_svg")}
    codeflash_output = _use_gl(objs) # 3.85μs -> 2.93μs (31.2% faster)

# ---------------------------
# Large Scale Test Cases
# ---------------------------

def test_large_set_all_non_plot_returns_false():
    # Large set of non-Plot HasProps objects
    class Dummy(HasProps):
        pass
    objs = {Dummy() for _ in range(500)}
    codeflash_output = _use_gl(objs) # 45.6μs -> 28.2μs (61.8% faster)

def test_large_set_all_plot_non_webgl_returns_false():
    # Large set of Plot objects with output_backend not "webgl"
    objs = {Plot(output_backend="canvas") for _ in range(500)}
    codeflash_output = _use_gl(objs) # 47.9μs -> 29.0μs (65.0% faster)

def test_large_set_one_plot_webgl_returns_true():
    # Large set with one Plot object requesting webgl
    objs = {Plot(output_backend="canvas") for _ in range(499)}
    objs.add(Plot(output_backend="webgl"))
    codeflash_output = _use_gl(objs) # 47.5μs -> 28.9μs (64.4% faster)

def test_large_set_many_plot_webgl_returns_true():
    # Large set with many Plot objects requesting webgl
    objs = {Plot(output_backend="webgl") for _ in range(250)}
    objs.update({Plot(output_backend="canvas") for _ in range(250)})
    codeflash_output = _use_gl(objs) # 46.8μs -> 28.2μs (66.0% faster)

def test_large_set_mixed_types_returns_true():
    # Large set with a mix of Plot (webgl), Plot (not webgl), and non-Plot objects
    class Dummy(HasProps):
        pass
    objs = {Plot(output_backend="webgl") for _ in range(100)}
    objs.update({Plot(output_backend="canvas") for _ in range(100)})
    objs.update({Dummy() for _ in range(100)})
    codeflash_output = _use_gl(objs) # 30.4μs -> 18.6μs (63.3% faster)

def test_large_set_mixed_types_returns_false():
    # Large set with a mix of Plot (not webgl), and non-Plot objects, but no webgl
    class Dummy(HasProps):
        pass
    objs = {Plot(output_backend="canvas") for _ in range(400)}
    objs.update({Dummy() for _ in range(100)})
    codeflash_output = _use_gl(objs) # 47.8μs -> 29.2μs (63.7% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from typing import Callable, Set

# imports
import pytest
from bokeh.embed.bundle import _use_gl


# Minimal HasProps base class for testing
class HasProps:
    pass

# Minimal Plot class for testing
class Plot(HasProps):
    def __init__(self, output_backend=None):
        self.output_backend = output_backend

# unit tests

# -------------------------------
# Basic Test Cases
# -------------------------------

def test_empty_set_returns_false():
    # Test with an empty set: should return False
    codeflash_output = _use_gl(set()) # 4.55μs -> 3.58μs (27.1% faster)

def test_single_plot_with_webgl():
    # Test with a single Plot with output_backend "webgl"
    objs = {Plot(output_backend="webgl")}
    codeflash_output = _use_gl(objs) # 4.66μs -> 3.70μs (25.9% faster)

def test_single_plot_with_non_webgl():
    # Test with a single Plot with output_backend not "webgl"
    objs = {Plot(output_backend="canvas")}
    codeflash_output = _use_gl(objs) # 4.43μs -> 3.33μs (32.9% faster)

def test_single_non_plot_object():
    # Test with a single object that's not a Plot
    class Dummy(HasProps):
        pass
    objs = {Dummy()}
    codeflash_output = _use_gl(objs) # 4.37μs -> 3.43μs (27.3% faster)

def test_mixed_objects_one_webgl():
    # Test with multiple objects, one Plot with "webgl"
    class Dummy(HasProps):
        pass
    objs = {Dummy(), Plot(output_backend="canvas"), Plot(output_backend="webgl")}
    codeflash_output = _use_gl(objs) # 4.98μs -> 3.57μs (39.5% faster)

def test_mixed_objects_no_webgl():
    # Test with multiple objects, no Plot with "webgl"
    class Dummy(HasProps):
        pass
    objs = {Dummy(), Plot(output_backend="canvas"), Plot(output_backend="svg")}
    codeflash_output = _use_gl(objs) # 4.84μs -> 3.42μs (41.6% faster)

# -------------------------------
# Edge Test Cases
# -------------------------------

def test_plot_with_output_backend_none():
    # Plot with output_backend set to None
    objs = {Plot(output_backend=None)}
    codeflash_output = _use_gl(objs) # 4.28μs -> 3.21μs (33.4% faster)

def test_plot_with_output_backend_empty_string():
    # Plot with output_backend as empty string
    objs = {Plot(output_backend="")}
    codeflash_output = _use_gl(objs) # 4.15μs -> 3.12μs (33.3% faster)

def test_plot_with_output_backend_case_sensitivity():
    # Plot with output_backend "WebGL" (case-sensitive)
    objs = {Plot(output_backend="WebGL")}
    codeflash_output = _use_gl(objs) # 4.03μs -> 3.09μs (30.6% faster)

def test_non_plot_with_output_backend_webgl():
    # Non-Plot object with output_backend "webgl"
    class Dummy(HasProps):
        def __init__(self):
            self.output_backend = "webgl"
    objs = {Dummy()}
    codeflash_output = _use_gl(objs) # 4.32μs -> 3.13μs (38.3% faster)

def test_plot_with_output_backend_webgl_among_many_types():
    # Many objects, only one Plot with "webgl"
    class Dummy1(HasProps): pass
    class Dummy2(HasProps): pass
    objs = {Dummy1(), Dummy2(), Plot(output_backend="svg"), Plot(output_backend="webgl")}
    codeflash_output = _use_gl(objs) # 4.99μs -> 3.82μs (30.6% faster)

def test_plot_with_output_backend_as_int():
    # Plot with output_backend as integer
    objs = {Plot(output_backend=123)}
    codeflash_output = _use_gl(objs) # 4.08μs -> 3.14μs (29.7% faster)

def test_plot_with_output_backend_as_none_and_str():
    # Multiple Plots, one with None, one with "webgl"
    objs = {Plot(output_backend=None), Plot(output_backend="webgl")}
    codeflash_output = _use_gl(objs) # 4.17μs -> 3.21μs (29.6% faster)

def test_plot_with_output_backend_as_boolean():
    # Plot with output_backend as boolean
    objs = {Plot(output_backend=True)}
    codeflash_output = _use_gl(objs) # 3.80μs -> 2.92μs (30.2% faster)

def test_plot_with_output_backend_as_list():
    # Plot with output_backend as a list
    objs = {Plot(output_backend=["webgl"])}
    codeflash_output = _use_gl(objs) # 4.01μs -> 2.88μs (39.0% faster)

def test_plot_with_output_backend_as_dict():
    # Plot with output_backend as a dict
    objs = {Plot(output_backend={"type": "webgl"})}
    codeflash_output = _use_gl(objs) # 3.88μs -> 2.92μs (33.1% faster)

def test_non_plot_no_output_backend_attribute():
    # Non-Plot object without output_backend attribute
    class Dummy(HasProps):
        pass
    objs = {Dummy()}
    codeflash_output = _use_gl(objs) # 4.34μs -> 3.25μs (33.7% faster)

def test_plot_with_output_backend_webgl_and_other_plots():
    # Multiple Plots, some with "webgl", some not
    objs = {Plot(output_backend="canvas"), Plot(output_backend="webgl"), Plot(output_backend="svg")}
    codeflash_output = _use_gl(objs) # 4.41μs -> 3.27μs (34.9% faster)

# -------------------------------
# Large Scale Test Cases
# -------------------------------

def test_large_set_all_non_webgl():
    # Large set of Plots, none with "webgl"
    objs = {Plot(output_backend="canvas") for _ in range(500)}
    codeflash_output = _use_gl(objs) # 47.2μs -> 28.5μs (65.7% faster)

def test_large_set_one_webgl_first():
    # Large set, webgl Plot added first (set iteration order is not insertion order)
    objs = {Plot(output_backend="webgl")}
    objs.update({Plot(output_backend="canvas") for _ in range(499)})
    codeflash_output = _use_gl(objs) # 46.3μs -> 27.8μs (66.4% faster)

def test_large_set_one_webgl_last():
    # Large set, webgl Plot added last (set iteration order is not insertion order)
    objs = {Plot(output_backend="canvas") for _ in range(499)}
    objs.add(Plot(output_backend="webgl"))
    codeflash_output = _use_gl(objs) # 46.3μs -> 28.3μs (63.7% faster)

def test_large_set_mixed_types_and_webgl():
    # Large set, many types, only one Plot with "webgl"
    class Dummy(HasProps): pass
    objs = {Plot(output_backend="canvas") for _ in range(400)}
    objs.update({Dummy() for _ in range(400)})
    objs.add(Plot(output_backend="webgl"))
    codeflash_output = _use_gl(objs) # 70.7μs -> 41.6μs (69.9% faster)

def test_large_set_no_plots():
    # Large set of non-Plot objects
    class Dummy(HasProps): pass
    objs = {Dummy() for _ in range(800)}
    codeflash_output = _use_gl(objs) # 69.7μs -> 41.0μs (69.7% faster)

def test_large_set_all_webgl():
    # Large set of Plots, all with "webgl"
    objs = {Plot(output_backend="webgl") for _ in range(600)}
    codeflash_output = _use_gl(objs) # 54.1μs -> 31.7μs (70.5% faster)

def test_large_set_half_webgl_half_non_webgl():
    # Large set, half "webgl", half not
    objs = {Plot(output_backend="webgl") for _ in range(300)}
    objs.update({Plot(output_backend="canvas") for _ in range(300)})
    codeflash_output = _use_gl(objs) # 53.2μs -> 31.9μs (66.4% faster)

def test_large_set_with_duplicates():
    # Duplicate Plot objects collapse in a set, leaving two distinct elements (should not affect result)
    p_webgl = Plot(output_backend="webgl")
    p_canvas = Plot(output_backend="canvas")
    objs = {p_webgl, p_canvas}
    objs.update({p_canvas for _ in range(500)})
    codeflash_output = _use_gl(objs) # 4.75μs -> 3.48μs (36.7% faster)

def test_large_set_with_varied_output_backend_types():
    # Large set with various output_backend types, only one correct
    objs = {Plot(output_backend="canvas") for _ in range(200)}
    objs.update({Plot(output_backend=None) for _ in range(200)})
    objs.update({Plot(output_backend=123) for _ in range(200)})
    objs.add(Plot(output_backend="webgl"))
    codeflash_output = _use_gl(objs) # 54.4μs -> 32.9μs (65.4% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
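The idea behind comparing the original and optimized outputs can be sketched generically as below. This is not Codeflash's actual mechanism; `outcome`, `same_behavior`, and the two `has_negative_*` functions are hypothetical names used only to illustrate an equivalence check over a set of inputs, including inputs that raise.

```python
from __future__ import annotations

from typing import Any, Callable, Iterable


def outcome(fn: Callable[[Any], Any], arg: Any) -> tuple[str, Any]:
    """Capture either the return value or the exception type of one call."""
    try:
        return ("ok", fn(arg))
    except Exception as exc:
        return ("raised", type(exc))


def same_behavior(original: Callable, optimized: Callable, inputs: Iterable[Any]) -> bool:
    """True if both callables return or raise identically for every input."""
    return all(outcome(original, arg) == outcome(optimized, arg) for arg in inputs)


def has_negative_any(xs):
    # Reference implementation using any() and a generator.
    return any(x < 0 for x in xs)


def has_negative_loop(xs):
    # Candidate replacement using an explicit loop.
    for x in xs:
        if x < 0:
            return True
    return False


# The last input makes both versions raise TypeError, which also counts as matching.
inputs = [[], [1, 2], [1, -2], [None]]
print(same_behavior(has_negative_any, has_negative_loop, inputs))
```

Comparing exception types as well as return values catches the case where an optimization changes which inputs fail, not just what successful calls return.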

To edit these changes, run `git checkout codeflash/optimize-_use_gl-mhw5x97a` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 12, 2025 15:36
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Nov 12, 2025