⚡️ Speed up function `_signature` by 23% #124

codeflash-ai · 2025-11-12T16:20:48Z

📄 23% (0.23x) speedup for `_signature` in `src/bokeh/util/token.py`

⏱️ Runtime : 4.82 milliseconds → 3.91 milliseconds (best of 125 runs)

📝 Explanation and details

The optimization achieves a 23% speedup by replacing higher-level codecs module functions with direct string/bytes method calls, eliminating unnecessary function lookups and indirection.

Key optimizations applied:

1. **Direct string encoding**: Replaced `codecs.encode(secret_key, 'utf-8')` with `secret_key.encode('utf-8')` in `_ensure_bytes()`. This eliminates the overhead of looking up the `codecs.encode` function and its internal dispatch logic.
2. **Direct bytes decoding**: Replaced `codecs.decode(base64.urlsafe_b64encode(...), 'ascii')` with `base64.urlsafe_b64encode(...).decode('ascii')` in `_base64_encode()`. This removes an extra layer of function indirection.
3. **Streamlined type handling**: In `_base64_encode()`, the input conversion is now a single inline conditional expression instead of a call to `_ensure_bytes()`, reducing function call overhead.
4. **Consistent direct encoding**: In `_signature()`, replaced `codecs.encode(base_id, "utf-8")` with `base_id.encode('utf-8')` for consistency.
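The four changes above can be sketched as a side-by-side comparison. This is a reconstruction from the description only, not the actual diff: the function names with `_before`/`_after` suffixes are hypothetical, and the `rstrip("=")` step is an assumption taken from the generated tests, which expect unpadded output.

```python
import base64
import codecs


def ensure_bytes_before(value):
    # Hypothetical "before" helper: str input goes through the generic codecs layer.
    if value is None:
        return None
    if isinstance(value, bytes):
        return value
    return codecs.encode(value, "utf-8")


def ensure_bytes_after(value):
    # Hypothetical "after" helper: str input uses the str.encode method directly.
    if value is None:
        return None
    if isinstance(value, bytes):
        return value
    return value.encode("utf-8")


def base64_encode_before(decoded):
    # Helper call plus codecs.decode, as described for the original code.
    decoded_as_bytes = ensure_bytes_before(decoded)
    encoded = codecs.decode(base64.urlsafe_b64encode(decoded_as_bytes), "ascii")
    return encoded.rstrip("=")  # assumption: padding is stripped, as the tests expect


def base64_encode_after(decoded):
    # Inline conditional replaces the helper call; bytes.decode replaces codecs.decode.
    decoded_as_bytes = decoded if isinstance(decoded, bytes) else decoded.encode("utf-8")
    return base64.urlsafe_b64encode(decoded_as_bytes).decode("ascii").rstrip("=")


# Both variants should agree on every input type exercised by the generated tests.
samples = ["abc123", "héllo世界", "", b"\xff\xfe\xfd", "x" * 1000]
for sample in samples:
    assert ensure_bytes_before(sample) == ensure_bytes_after(sample)
    assert base64_encode_before(sample) == base64_encode_after(sample)
```

The loop at the end is the property that matters for review: the rewrite is only safe if both paths return identical output for `str` and `bytes` inputs alike.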

Why this matters for performance:

- The `codecs` module functions are generic and handle many encoding types, adding dispatch overhead
- Direct method calls on strings/bytes objects are faster as they bypass this generic layer
- These functions are called in **hot paths**: session ID generation, JWT token creation, and signature validation happen frequently in web applications
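The dispatch-overhead claim can be checked locally with a small micro-benchmark. The sketch below is illustrative only: the sample string and iteration count are arbitrary choices, and absolute timings will vary by machine and Python version, so it prints the numbers rather than assuming a particular ratio.

```python
import codecs
import timeit

text = "session-id-abc123"  # arbitrary sample input
iterations = 100_000

# Generic path: module-level function that dispatches on the encoding name.
generic = timeit.timeit(lambda: codecs.encode(text, "utf-8"), number=iterations)

# Direct path: method call on the str object itself.
direct = timeit.timeit(lambda: text.encode("utf-8"), number=iterations)

print(f"codecs.encode: {generic:.4f}s for {iterations} calls")
print(f"str.encode:    {direct:.4f}s for {iterations} calls")
```

Running it several times and comparing the best results gives a steadier picture than a single run, which is also why the PR reports a best-of-125 runtime.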

Impact on workloads:
Based on the function references, _signature() is called during:

- Session ID generation for each browser tab connection
- JWT token creation and validation
- Session signature verification on every request
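For orientation, the value these call sites depend on can be reproduced with the standard library alone. The sketch below mirrors the `expected` values computed in the generated tests (HMAC-SHA256 over the UTF-8 `base_id`, then unpadded URL-safe base64). `reference_signature` and `signature_matches` are hypothetical names introduced for illustration, and the constant-time comparison is an assumption about how a verifier might be written, not a description of Bokeh's code.

```python
import base64
import hashlib
import hmac


def reference_signature(base_id: str, secret_key) -> str:
    # Accept str or bytes keys, as the generated tests do.
    key = secret_key if isinstance(secret_key, bytes) else secret_key.encode("utf-8")
    digest = hmac.new(key, base_id.encode("utf-8"), hashlib.sha256).digest()
    # A 32-byte digest encodes to 44 base64 characters including one '=' pad.
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def signature_matches(base_id: str, secret_key, candidate: str) -> bool:
    # Hypothetical verifier: compare as bytes in constant time.
    expected = reference_signature(base_id, secret_key)
    return hmac.compare_digest(expected.encode("utf-8"), candidate.encode("utf-8"))


sig = reference_signature("abc", "samekey")
print(len(sig), signature_matches("abc", b"samekey", sig))
```

Because the signature is computed once per generated or verified ID, any per-call saving in the encoding helpers is multiplied by the request rate.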

Across the generated tests, successful calls ran roughly 11-48% faster depending on the input. The largest gain was measured on a short ASCII input (48% faster), while Unicode and large inputs improved by about 30-41%. The error path for a `None` secret key improved by more than that (53-117%), though it is not a typical workload. Because session management happens frequently in typical web applications, the optimization is valuable for those workloads.

✅ Correctness verification report:

Test	Status
⚙️ Existing Unit Tests	✅ 22 Passed
🌀 Generated Regression Tests	✅ 1633 Passed
⏪ Replay Tests	🔘 None Found
🔎 Concolic Coverage Tests	✅ 1 Passed
📊 Tests Coverage	100.0%

⚙️ Existing Unit Tests and Runtime

Test File::Test Function	Original ⏱️	Optimized ⏱️	Speedup
`unit/bokeh/util/test_token.py::TestSessionId.test_signature`	16.4μs	14.1μs	16.6%✅

🌀 Generated Regression Tests and Runtime

import base64
import codecs
import hashlib
import hmac

# imports
import pytest  # used for our unit tests
from bokeh.util.token import _signature

# unit tests

# ---------------------------
# Basic Test Cases
# ---------------------------




def test_signature_is_deterministic():
    # The same inputs always produce the same output
    base_id = "abc123"
    secret_key = "key"
    codeflash_output = _signature(base_id, secret_key); sig1 = codeflash_output # 18.8μs -> 15.6μs (20.3% faster)
    codeflash_output = _signature(base_id, secret_key); sig2 = codeflash_output # 4.38μs -> 3.62μs (21.1% faster)

def test_signature_differs_with_different_base_id():
    # Different base_id yields different signature
    base_id1 = "id1"
    base_id2 = "id2"
    secret_key = "key"
    codeflash_output = _signature(base_id1, secret_key); sig1 = codeflash_output # 9.44μs -> 8.08μs (16.9% faster)
    codeflash_output = _signature(base_id2, secret_key); sig2 = codeflash_output # 3.81μs -> 3.27μs (16.5% faster)

def test_signature_differs_with_different_secret_key():
    # Different secret_key yields different signature
    base_id = "id"
    secret_key1 = "key1"
    secret_key2 = "key2"
    codeflash_output = _signature(base_id, secret_key1); sig1 = codeflash_output # 8.82μs -> 7.53μs (17.2% faster)
    codeflash_output = _signature(base_id, secret_key2); sig2 = codeflash_output # 3.69μs -> 3.10μs (19.2% faster)

# ---------------------------
# Edge Test Cases
# ---------------------------




def test_signature_with_none_secret_key_raises():
    # secret_key=None should raise AssertionError
    base_id = "id"
    with pytest.raises(AssertionError):
        _signature(base_id, None) # 2.98μs -> 1.40μs (113% faster)







def test_signature_performance_many_calls():
    # Run 1000 signatures to check performance and determinism
    base_id = "id"
    secret_key = "key"
    sigs = set()
    for i in range(1000):
        codeflash_output = _signature(base_id + str(i), secret_key); sig = codeflash_output # 2.85ms -> 2.32ms (22.9% faster)
        sigs.add(sig)


def test_signature_padding_removed():
    # The output should never have '=' padding at the end
    base_id = "padtest"
    secret_key = "padkey"
    codeflash_output = _signature(base_id, secret_key); sig = codeflash_output # 18.6μs -> 15.5μs (20.0% faster)

def test_signature_output_is_ascii():
    # The signature should only contain URL-safe base64 characters
    base_id = "ascii"
    secret_key = "ascii"
    codeflash_output = _signature(base_id, secret_key); sig = codeflash_output # 10.5μs -> 8.49μs (23.8% faster)
    for c in sig:
        pass

def test_signature_changes_with_case():
    # Changing the case of base_id or secret_key changes the signature
    base_id = "CaseTest"
    secret_key = "Secret"
    codeflash_output = _signature(base_id, secret_key); sig1 = codeflash_output # 9.34μs -> 8.43μs (10.9% faster)
    codeflash_output = _signature(base_id.lower(), secret_key); sig2 = codeflash_output # 4.03μs -> 3.37μs (19.6% faster)
    codeflash_output = _signature(base_id, secret_key.lower()); sig3 = codeflash_output # 2.97μs -> 2.55μs (16.8% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import base64
import codecs
import hashlib
import hmac

# imports
import pytest
from bokeh.util.token import _signature

# unit tests

# --- Basic Test Cases ---

def test_signature_basic_ascii():
    # Basic test with ascii string and ascii key
    base_id = "test"
    secret_key = "key"
    # Compute expected signature manually
    expected = base64.urlsafe_b64encode(
        hmac.new(b"key", b"test", hashlib.sha256).digest()
    ).decode("ascii").rstrip("=")
    codeflash_output = _signature(base_id, secret_key) # 7.13μs -> 4.82μs (48.0% faster)

def test_signature_bytes_key():
    # Test with bytes secret key
    base_id = "hello"
    secret_key = b"mysecret"
    expected = base64.urlsafe_b64encode(
        hmac.new(b"mysecret", b"hello", hashlib.sha256).digest()
    ).decode("ascii").rstrip("=")
    codeflash_output = _signature(base_id, secret_key) # 5.83μs -> 4.20μs (39.0% faster)

def test_signature_unicode_base_id():
    # Test with unicode base_id
    base_id = "héllo世界"
    secret_key = "unicodekey"
    expected = base64.urlsafe_b64encode(
        hmac.new(b"unicodekey", base_id.encode("utf-8"), hashlib.sha256).digest()
    ).decode("ascii").rstrip("=")
    codeflash_output = _signature(base_id, secret_key) # 5.98μs -> 4.31μs (38.6% faster)

def test_signature_empty_base_id():
    # Test with empty base_id
    base_id = ""
    secret_key = "empty"
    expected = base64.urlsafe_b64encode(
        hmac.new(b"empty", b"", hashlib.sha256).digest()
    ).decode("ascii").rstrip("=")
    codeflash_output = _signature(base_id, secret_key) # 5.71μs -> 4.16μs (37.3% faster)

def test_signature_empty_secret_key():
    # Test with empty secret_key
    base_id = "something"
    secret_key = ""
    expected = base64.urlsafe_b64encode(
        hmac.new(b"", b"something", hashlib.sha256).digest()
    ).decode("ascii").rstrip("=")
    codeflash_output = _signature(base_id, secret_key) # 5.85μs -> 4.09μs (43.1% faster)

# --- Edge Test Cases ---

def test_signature_secret_key_none():
    # Should raise AssertionError if secret_key is None
    with pytest.raises(AssertionError):
        _signature("data", None) # 1.67μs -> 1.09μs (53.1% faster)

def test_signature_base_id_non_ascii_bytes_key():
    # base_id is unicode, key is bytes with non-ascii values
    base_id = "emoji😊"
    secret_key = b"\xff\xfe\xfd"
    expected = base64.urlsafe_b64encode(
        hmac.new(b"\xff\xfe\xfd", base_id.encode("utf-8"), hashlib.sha256).digest()
    ).decode("ascii").rstrip("=")
    codeflash_output = _signature(base_id, secret_key) # 5.95μs -> 4.22μs (41.0% faster)

def test_signature_long_secret_key():
    # Very long secret key
    base_id = "short"
    secret_key = "a" * 512
    expected = base64.urlsafe_b64encode(
        hmac.new(b"a" * 512, b"short", hashlib.sha256).digest()
    ).decode("ascii").rstrip("=")
    codeflash_output = _signature(base_id, secret_key) # 6.64μs -> 4.81μs (37.9% faster)

def test_signature_long_base_id():
    # Very long base_id
    base_id = "b" * 512
    secret_key = "key"
    expected = base64.urlsafe_b64encode(
        hmac.new(b"key", b"b" * 512, hashlib.sha256).digest()
    ).decode("ascii").rstrip("=")
    codeflash_output = _signature(base_id, secret_key) # 6.25μs -> 4.47μs (39.8% faster)

def test_signature_secret_key_bytes_vs_str_equivalence():
    # str and bytes keys with same value should produce same result
    base_id = "abc"
    key_str = "samekey"
    key_bytes = b"samekey"
    codeflash_output = _signature(base_id, key_str) # 8.28μs -> 6.85μs (20.8% faster)

def test_signature_base_id_with_null_bytes():
    # base_id contains null bytes
    base_id = "abc\x00def"
    secret_key = "key"
    expected = base64.urlsafe_b64encode(
        hmac.new(b"key", b"abc\x00def", hashlib.sha256).digest()
    ).decode("ascii").rstrip("=")
    codeflash_output = _signature(base_id, secret_key) # 5.81μs -> 4.14μs (40.2% faster)

def test_signature_secret_key_with_null_bytes():
    # secret_key contains null bytes
    base_id = "test"
    secret_key = b"key\x00withnull"
    expected = base64.urlsafe_b64encode(
        hmac.new(b"key\x00withnull", b"test", hashlib.sha256).digest()
    ).decode("ascii").rstrip("=")
    codeflash_output = _signature(base_id, secret_key) # 5.40μs -> 3.99μs (35.3% faster)

# --- Large Scale Test Cases ---

def test_signature_large_base_id_and_key():
    # Large base_id and key (up to 1000 chars)
    base_id = "x" * 1000
    secret_key = "y" * 1000
    expected = base64.urlsafe_b64encode(
        hmac.new(b"y" * 1000, b"x" * 1000, hashlib.sha256).digest()
    ).decode("ascii").rstrip("=")
    codeflash_output = _signature(base_id, secret_key) # 7.49μs -> 5.75μs (30.2% faster)

def test_signature_many_unique_inputs():
    # Test many unique combinations to check determinism and uniqueness
    key = "fixedkey"
    results = set()
    for i in range(100):
        base_id = f"id_{i}"
        codeflash_output = _signature(base_id, key); sig = codeflash_output # 292μs -> 238μs (22.8% faster)
        results.add(sig)

def test_signature_performance_large_batch():
    # Test that function can handle a batch of 500 signatures quickly
    key = "batchkey"
    for i in range(500):
        base_id = "item" + str(i)
        codeflash_output = _signature(base_id, key); sig = codeflash_output # 1.42ms -> 1.15ms (23.5% faster)

def test_signature_output_length():
    # The output length should be 43 chars (base64url of 32 bytes is 44 chars with one '=' pad, which is stripped)
    base_id = "sample"
    secret_key = "lenkey"
    codeflash_output = _signature(base_id, secret_key); sig = codeflash_output # 11.6μs -> 9.84μs (17.5% faster)

def test_signature_output_charset():
    # Output should only contain base64url-safe chars
    base_id = "check"
    secret_key = "safe"
    codeflash_output = _signature(base_id, secret_key); sig = codeflash_output # 8.47μs -> 7.26μs (16.6% faster)
    allowed = set("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_")

# --- Determinism Test Case ---

def test_signature_determinism():
    # The same input always produces the same output
    base_id = "repeat"
    secret_key = "repeatkey"
    codeflash_output = _signature(base_id, secret_key); sig1 = codeflash_output # 8.38μs -> 7.41μs (13.0% faster)
    codeflash_output = _signature(base_id, secret_key); sig2 = codeflash_output # 3.78μs -> 3.09μs (22.3% faster)

# --- Mutation Testing Catchers ---

def test_signature_changes_with_base_id():
    # Changing base_id should change the signature
    key = "mutkey"
    codeflash_output = _signature("foo", key); sig1 = codeflash_output # 8.34μs -> 7.10μs (17.4% faster)
    codeflash_output = _signature("bar", key); sig2 = codeflash_output # 3.95μs -> 3.40μs (16.2% faster)

def test_signature_changes_with_secret_key():
    # Changing secret_key should change the signature
    base_id = "foobar"
    codeflash_output = _signature(base_id, "key1"); sig1 = codeflash_output # 8.18μs -> 7.07μs (15.7% faster)
    codeflash_output = _signature(base_id, "key2"); sig2 = codeflash_output # 3.86μs -> 3.21μs (20.1% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

from bokeh.util.token import _signature
import pytest

def test__signature():
    with pytest.raises(TypeError, match="a\\ bytes\\-like\\ object\\ is\\ required,\\ not\\ 'SymbolicBytes'"):
        _signature('', b'')

def test__signature_2():
    with pytest.raises(AssertionError):
        _signature('', None)

🔎 Concolic Coverage Tests and Runtime

Test File::Test Function	Original ⏱️	Optimized ⏱️	Speedup
`codeflash_concolic_sstvtaha/tmpoxdl72nn/test_concolic_coverage.py::test__signature_2`	2.97μs	1.37μs	117%✅

To edit these changes, run `git checkout codeflash/optimize-_signature-mhw7idzv` and push.

codeflash-ai bot requested a review from mashraf-222 November 12, 2025 16:20

codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Nov 12, 2025
