@codeflash-ai codeflash-ai bot commented Nov 12, 2025

📄 33% (0.33x) speedup for check_session_id_signature in src/bokeh/util/token.py

⏱️ Runtime: 470 microseconds → 354 microseconds (best of 50 runs)

📝 Explanation and details

The optimized code achieves a roughly 33% speedup (470 μs to 354 μs) through three optimizations that reduce function call overhead and unnecessary work:

1. Conditional Secret Key Processing
The original code always called _ensure_bytes(secret_key) at the function start, even for unsigned sessions (the majority case). The optimized version only processes the secret key when signed=True, eliminating this overhead for ~83% of calls based on the test patterns.

2. String Encoding Optimization
Replaced codecs.encode(string, 'utf-8') with string.encode('utf-8') in both _ensure_bytes and _signature. The .encode() method is a direct C-level operation that's significantly faster than the more generic codecs.encode() function call.

3. Inlined Base64 Encoding
The original code called _base64_encode(signer.digest()) which had additional overhead from function calls and used codecs.decode(). The optimized version inlines this logic using direct base64.urlsafe_b64encode() and .decode('ascii'), eliminating the extra function call and codec overhead.

Performance Impact by Use Case:

  • Unsigned sessions (most common): 50-90% faster due to eliminating unnecessary secret key processing
  • Invalid signed sessions: 15-40% faster from encoding optimizations
  • Valid signed sessions: Moderate gains from string encoding and base64 inlining
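The encoding claims behind these numbers can be checked locally with a small harness like the one below. It is a sketch under stated assumptions: the sample string, the stand-in 32-byte digest, and the `per_call_seconds` helper are all hypothetical, and the timings it prints depend on the machine and Python version, so no particular result is implied.

```python
import base64
import codecs
import timeit

text = "session-id-0123456789"  # hypothetical sample input
digest = bytes(range(32))       # stand-in for a 32-byte SHA-256 digest

# Both spellings must produce identical results for the swap to be safe.
assert codecs.encode(text, "utf-8") == text.encode("utf-8")
old_b64 = codecs.decode(base64.urlsafe_b64encode(digest), "ascii")
new_b64 = base64.urlsafe_b64encode(digest).decode("ascii")
assert old_b64 == new_b64


def per_call_seconds(func, number=20_000, repeat=5):
    """Best-of-`repeat` average time for a single call of `func`."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


timings = {
    "codecs.encode": per_call_seconds(lambda: codecs.encode(text, "utf-8")),
    "str.encode": per_call_seconds(lambda: text.encode("utf-8")),
    "codecs.decode": per_call_seconds(lambda: codecs.decode(base64.urlsafe_b64encode(digest), "ascii")),
    "bytes.decode": per_call_seconds(lambda: base64.urlsafe_b64encode(digest).decode("ascii")),
}

for name, seconds in timings.items():
    print(f"{name:>14}: {seconds * 1e9:8.1f} ns per call")
```

Taking the minimum over several repeats mirrors the "best of 50 runs" convention used in the report and reduces noise from other processes.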

Hot Path Context:
Given that check_session_id_signature is called from check_token_signature (which validates both tokens and their contained session IDs), this optimization directly benefits Bokeh server authentication workflows where session validation occurs frequently during client-server interactions.
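To illustrate why the call sits on a hot path, the sketch below shows a token check that verifies the token's own signature and then verifies the session ID carried inside it, so each token costs two HMAC computations. The token layout used here (base64-encoded JSON payload, a dot, then a signature) and every function name are assumptions made for illustration; they are not taken from Bokeh's source.

```python
import base64
import hashlib
import hmac
import json


def sign(value: str, key: bytes) -> str:
    """HMAC-SHA256 of value, URL-safe base64 with padding stripped."""
    mac = hmac.new(key, value.encode("utf-8"), hashlib.sha256)
    return base64.urlsafe_b64encode(mac.digest()).decode("ascii").rstrip("=")


def check_session_id(session_id: str, key: bytes) -> bool:
    """Verify a '<base_id>.<signature>' session ID."""
    base_id, sep, provided = session_id.partition(".")
    if not sep:
        return False
    return hmac.compare_digest(sign(base_id, key).encode("ascii"), provided.encode("utf-8"))


def make_token(session_id: str, key: bytes) -> str:
    """Hypothetical token: base64(JSON payload) + '.' + signature of that payload."""
    raw = json.dumps({"session_id": session_id}).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw).decode("ascii")
    return f"{payload}.{sign(payload, key)}"


def check_token(token: str, key: bytes) -> bool:
    """First check the token signature, then the embedded session ID."""
    payload, sep, provided = token.partition(".")
    if not sep:
        return False
    if not hmac.compare_digest(sign(payload, key).encode("ascii"), provided.encode("utf-8")):
        return False
    session_id = json.loads(base64.urlsafe_b64decode(payload))["session_id"]
    return check_session_id(session_id, key)
```

A token whose outer signature is valid but whose embedded session ID is not correctly signed is still rejected, which is the reason the inner check runs on every validation.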

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 1522 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 2 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import base64
import hashlib
import hmac
import random
import string

# imports
import pytest
from bokeh.util.token import check_session_id_signature

# function to test
# (Function code is provided above and assumed present here.)

# Helper to generate a valid session_id and its signature
def make_signed_session_id(base_id: str, secret_key: bytes) -> str:
    base_id_encoded = base_id.encode('utf-8')
    signer = hmac.new(secret_key, base_id_encoded, hashlib.sha256)
    signature = base64.urlsafe_b64encode(signer.digest()).decode('ascii').rstrip('=')
    return f"{base_id}.{signature}"

# Helper to generate a random string
def random_string(length: int) -> str:
    return ''.join(random.choices(string.ascii_letters + string.digits, k=length))

# Basic Test Cases



def test_unsigned_session_id_returns_true():
    """Test that unsigned session id always returns True."""
    secret_key = b'mysecret'
    session_id = "session123"
    codeflash_output = check_session_id_signature(session_id, secret_key, signed=False) # 1.19μs -> 777ns (52.8% faster)


def test_session_id_with_no_dot_returns_false():
    """Test that session id without a dot returns False when signed=True."""
    secret_key = b'mysecret'
    session_id = "session123"  # no dot
    codeflash_output = check_session_id_signature(session_id, secret_key, signed=True) # 1.71μs -> 1.24μs (37.5% faster)


def test_empty_session_id_signed():
    """Test that empty session_id returns False if signed=True."""
    secret_key = b'mysecret'
    codeflash_output = check_session_id_signature("", secret_key, signed=True) # 1.68μs -> 1.28μs (31.0% faster)

def test_empty_session_id_unsigned():
    """Test that empty session_id returns True if signed=False."""
    secret_key = b'mysecret'
    codeflash_output = check_session_id_signature("", secret_key, signed=False) # 962ns -> 507ns (89.7% faster)


def test_none_secret_key_unsigned():
    """Test that None secret key is allowed if signed=False."""
    session_id = "session123"
    codeflash_output = check_session_id_signature(session_id, None, signed=False) # 1.09μs -> 778ns (40.7% faster)





def test_session_id_with_dot_at_end():
    """Test session id with dot at the end (empty signature) returns False."""
    secret_key = b'mysecret'
    session_id = "session123."
    codeflash_output = check_session_id_signature(session_id, secret_key, signed=True) # 21.2μs -> 18.2μs (16.3% faster)

def test_session_id_with_only_dot():
    """Test session id that is just a dot returns False."""
    secret_key = b'mysecret'
    session_id = "."
    codeflash_output = check_session_id_signature(session_id, secret_key, signed=True) # 13.1μs -> 11.2μs (17.3% faster)








def test_many_unsigned_session_ids():
    """Test many unsigned session ids for scalability."""
    for i in range(500):
        session_id = f"session{i}"
        codeflash_output = check_session_id_signature(session_id, b"irrelevant", signed=False) # 118μs -> 78.5μs (50.9% faster)

def test_random_invalid_formats():
    """Test random session_id formats that should return False when signed=True."""
    secret_key = b"largekey"
    bad_ids = [
        "",  # empty
        ".",  # only dot
        "abc",  # no dot
        "abc.",  # dot at end, empty signature
        ".abc",  # dot at start, empty base_id
        "abc.def.ghi",  # multiple dots, only first split matters
        "abc..def",  # double dot
    ]
    for session_id in bad_ids:
        codeflash_output = check_session_id_signature(session_id, secret_key, signed=True) # 37.9μs -> 33.2μs (14.2% faster)
import base64
import codecs
import hashlib
import hmac

# imports
import pytest
# _signature is needed by make_signed_session_id below
from bokeh.util.token import _signature, check_session_id_signature

# --- Function under test (copied from bokeh/util/token.py, with settings mocked for testability) ---

class DummySettings:
    def __init__(self, secret_key_bytes=None, sign_sessions=None):
        self._secret_key_bytes = secret_key_bytes
        self._sign_sessions = sign_sessions
    def secret_key_bytes(self):
        return self._secret_key_bytes
    def sign_sessions(self):
        return self._sign_sessions

# We'll use these dummy settings in tests instead of the real bokeh.settings.settings
settings = DummySettings(secret_key_bytes=b'supersecret', sign_sessions=True)


# --- Helper for tests: generate a valid session_id with signature ---
def make_signed_session_id(base_id, secret_key):
    sig = _signature(base_id, secret_key)
    return f"{base_id}.{sig}"

# --- Pytest fixtures for settings isolation ---
@pytest.fixture
def signed_true_secret_key():
    # Use a fixed key for deterministic tests
    return b"test-secret-key"

# --- 1. Basic Test Cases ---



def test_unsigned_session_id(signed_true_secret_key):
    # If signed=False, any session_id should be accepted
    session_id = "anything.really"
    codeflash_output = check_session_id_signature(session_id, secret_key=signed_true_secret_key, signed=False) # 1.26μs -> 777ns (62.2% faster)
    codeflash_output = check_session_id_signature("no.dot", secret_key=signed_true_secret_key, signed=False) # 354ns -> 224ns (58.0% faster)

def test_none_secret_key_unsigned():
    # If signed=False, secret_key can be None
    codeflash_output = check_session_id_signature("foo.bar", secret_key=None, signed=False) # 745ns -> 480ns (55.2% faster)

def test_none_secret_key_signed():
    # If signed=True, secret_key=None should raise AssertionError in _signature
    with pytest.raises(AssertionError):
        check_session_id_signature("foo.bar", secret_key=None, signed=True) # 4.11μs -> 2.62μs (57.1% faster)

# --- 2. Edge Test Cases ---

def test_session_id_without_dot(signed_true_secret_key):
    # No dot: should return False if signed=True
    session_id = "no_dot_separator"
    codeflash_output = check_session_id_signature(session_id, secret_key=signed_true_secret_key, signed=True) # 1.34μs -> 970ns (38.2% faster)


def test_empty_string_session_id(signed_true_secret_key):
    # Empty session_id: should return False if signed=True
    codeflash_output = check_session_id_signature("", secret_key=signed_true_secret_key, signed=True) # 1.75μs -> 1.18μs (47.9% faster)


def test_empty_signature(signed_true_secret_key):
    # Empty signature part
    session_id = "abc."
    codeflash_output = check_session_id_signature(session_id, secret_key=signed_true_secret_key, signed=True) # 20.8μs -> 18.3μs (13.4% faster)







def test_signature_with_none_signature(signed_true_secret_key):
    # Empty signature after the dot (a None signature cannot occur through this interface)
    session_id = "abc123."
    codeflash_output = check_session_id_signature(session_id, secret_key=signed_true_secret_key, signed=True) # 20.8μs -> 18.1μs (15.2% faster)

# --- 3. Large Scale Test Cases ---





def test_performance_many_unsigned_session_ids():
    # Test that unsigned session ids are always accepted (performance)
    for i in range(1000):
        session_id = f"session{i}"
        codeflash_output = check_session_id_signature(session_id, secret_key=None, signed=False) # 218μs -> 163μs (33.3% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from bokeh.util.token import check_session_id_signature
import pytest

def test_check_session_id_signature():
    with pytest.raises(TypeError, match="a\\ bytes\\-like\\ object\\ is\\ required,\\ not\\ 'SymbolicBytes'"):
        check_session_id_signature('.', secret_key=b'', signed=True)

def test_check_session_id_signature_2():
    check_session_id_signature('', secret_key=b'', signed=True)

def test_check_session_id_signature_3():
    check_session_id_signature('', secret_key=b'', signed=None)
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
| --- | --- | --- | --- |
| codeflash_concolic_sstvtaha/tmpu1rg1kdi/test_concolic_coverage.py::test_check_session_id_signature_2 | 1.64μs | 1.30μs | 25.9%✅ |
| codeflash_concolic_sstvtaha/tmpu1rg1kdi/test_concolic_coverage.py::test_check_session_id_signature_3 | 932ns | 623ns | 49.6%✅ |

To edit these changes, run git checkout codeflash/optimize-check_session_id_signature-mhw6whva, commit your edits, and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 12, 2025 16:03
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: Medium Optimization Quality according to Codeflash labels Nov 12, 2025