
Conversation

@Varahiskillhub

…p pre-filter, memory compression

  • Add string-similarity pre-filter to vulnerability deduplication to limit LLM comparisons to the top 10 most similar reports instead of all reports (see the sketch after this list)
  • Replace per-request httpx.AsyncClient with persistent connection pool per sandbox, eliminating repeated TCP/TLS handshake overhead
  • Execute independent tools concurrently via asyncio.gather while keeping state-modifying tools sequential
  • Lower memory compression threshold from 100K to 60K tokens and cache token counts to avoid redundant litellm.token_counter calls
  • Double compression chunk size from 10 to 20 messages to halve LLM calls
  • Replace asyncio.sleep(0.5) polling with event-based wake signaling in agent state for immediate response to state changes
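
A minimal sketch of the deduplication pre-filter described above, using Python's difflib; the function name and report representation are assumptions, not the actual strix/llm/dedupe.py API:

```python
from difflib import SequenceMatcher


def top_similar_reports(new_report: str, existing_reports: list[str], k: int = 10) -> list[str]:
    """Return the k existing reports most similar to the new one.

    Only these candidates are then compared against the new report via the LLM,
    instead of issuing one LLM comparison per existing report.
    """
    return sorted(
        existing_reports,
        key=lambda report: SequenceMatcher(None, new_report, report).ratio(),
        reverse=True,
    )[:k]
```

SequenceMatcher.ratio() is quadratic in string length in the worst case, but it is still far cheaper than one LLM call per report pair, which is the point of the pre-filter.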

https://claude.ai/code/session_012JYGtxVh4zRbzXKarNmb11

greptile-apps bot commented Feb 5, 2026

Greptile Overview

Greptile Summary

This PR implements several performance optimizations to reduce latency and LLM API costs:

  • Connection pooling: Persistent HTTP clients per sandbox eliminate repeated TCP/TLS handshakes (executor.py:30-52)
  • Parallel tool execution: Independent tools run concurrently via asyncio.gather while state-modifying tools remain sequential (executor.py:336-406); see the sketch after this list
  • Event-based signaling: Replaced asyncio.sleep(0.5) polling with asyncio.Event for immediate wake on state changes (state.py:46, base_agent.py:278)
  • Deduplication pre-filter: String similarity limits LLM comparisons to top 10 most similar vulnerability reports instead of all reports (dedupe.py:144-194)
  • Memory compression tuning: Lowered threshold from 100K to 60K tokens, doubled chunk size from 10 to 20 messages, and cached token counts (memory_compressor.py:12-62)
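
To make the parallel-execution point concrete, here is a minimal sketch of splitting tool calls into concurrent and sequential groups. The tool-call representation and the STATE_MODIFYING_TOOLS set are illustrative assumptions, not the actual executor.py code:

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import Any

# Hypothetical shape of a pending tool call: (tool_name, zero-arg async callable).
ToolCall = tuple[str, Callable[[], Awaitable[Any]]]

# Assumed set of tools that mutate agent/sandbox state and must stay ordered.
STATE_MODIFYING_TOOLS = {"write_file", "run_command", "finish_scan"}


async def run_tool_calls(calls: list[ToolCall]) -> list[Any]:
    results: list[Any] = []
    batch: list[Callable[[], Awaitable[Any]]] = []

    async def flush_batch() -> None:
        # Independent (read-only) tools run concurrently in one gather.
        if batch:
            results.extend(await asyncio.gather(*(fn() for fn in batch)))
            batch.clear()

    for name, fn in calls:
        if name in STATE_MODIFYING_TOOLS:
            await flush_batch()          # preserve ordering around state changes
            results.append(await fn())   # state-modifying tool runs alone, in order
        else:
            batch.append(fn)

    await flush_batch()
    return results
```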

Issues found:

  • close_sandbox_client function defined but never called, causing connection pool resource leaks when sandboxes are torn down
  • _token_cache in memory_compressor.py grows unbounded without eviction strategy

Confidence Score: 3/5

  • Generally safe performance improvements with two resource leak issues that need resolution before production use
  • The optimizations are well-designed and properly implement connection pooling, parallelization, and event-based signaling. However, the missing cleanup mechanism for HTTP connection pools and the unbounded token cache growth are production-readiness concerns that could cause memory or connection leaks in long-running systems.
  • Pay close attention to strix/tools/executor.py (connection pool cleanup) and strix/llm/memory_compressor.py (cache eviction)

Important Files Changed

| Filename | Overview |
| --- | --- |
| strix/tools/executor.py | Added HTTP connection pooling per sandbox and parallel tool execution. Missing cleanup mechanism for persistent connections. |
| strix/agents/state.py | Replaced polling with event-based signaling using asyncio.Event. Properly excluded from serialization. |
| strix/llm/dedupe.py | Added string similarity pre-filtering to limit LLM comparisons to top 10 candidates. Efficient and well-implemented. |
| strix/llm/memory_compressor.py | Added token count caching and increased compression chunk size. Cache grows unbounded without eviction strategy. |
| strix/agents/base_agent.py | Replaced sleep polling with event-based wait. Clean, minimal change that improves responsiveness. |
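
As a sketch of the polling-to-event change noted in the state.py and base_agent.py rows (class, attribute, and method names here are illustrative, not the actual strix classes), an asyncio.Event lets a waiter wake as soon as state changes instead of sleeping in 0.5 s increments:

```python
import asyncio


class AgentState:
    def __init__(self) -> None:
        self.status = "running"
        # Runtime-only; excluded from serialization and recreated on load.
        self._wake_event = asyncio.Event()

    def set_status(self, status: str) -> None:
        self.status = status
        self._wake_event.set()  # wake any waiter immediately

    async def wait_for_change(self, timeout: float | None = None) -> None:
        """Replaces a `while True: await asyncio.sleep(0.5)` polling loop."""
        try:
            await asyncio.wait_for(self._wake_event.wait(), timeout)
        except asyncio.TimeoutError:
            pass  # no change within the timeout; caller re-checks state
        self._wake_event.clear()
```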

greptile-apps bot left a comment

5 files reviewed, 2 comments


Comment on lines +48 to +52
async def close_sandbox_client(sandbox_id: str) -> None:
    """Close and remove the HTTP client for a sandbox when it's torn down."""
    client = _sandbox_clients.pop(sandbox_id, None)
    if client:
        await client.aclose()

close_sandbox_client is defined but never called in the codebase. Connection pool clients accumulate without cleanup when sandboxes are torn down, leading to resource leaks.
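
One concise resolution would be to call close_sandbox_client from whatever path tears a sandbox down. The function below is a hypothetical placeholder for that path, not an existing call site:

```python
from strix.tools.executor import close_sandbox_client


async def teardown_sandbox(sandbox_id: str) -> None:
    """Hypothetical teardown hook; the real shutdown path may look different."""
    try:
        ...  # existing sandbox shutdown steps (stop container, clean up volumes, etc.)
    finally:
        # Release the pooled HTTP client so connections don't accumulate.
        await close_sandbox_client(sandbox_id)
```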


Comment on lines +47 to +62
_token_cache: dict[int, int] = {}


def _count_tokens(text: str, model: str) -> int:
    cache_key = hash(text)
    if cache_key in _token_cache:
        return _token_cache[cache_key]

    try:
        count = int(litellm.token_counter(model=model, text=text))
    except Exception:
        logger.exception("Failed to count tokens")
        count = len(text) // 4  # Rough estimate

    _token_cache[cache_key] = count
    return count

_token_cache grows unbounded. For long-running agents with many unique messages, this will consume increasing memory. Consider adding LRU eviction or size limits.
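
A lightweight option, sketched here with an OrderedDict-based cap (the limit and helper name are arbitrary choices for illustration, not the project's actual fix):

```python
from collections import OrderedDict

_MAX_CACHE_ENTRIES = 10_000
_token_cache: OrderedDict[int, int] = OrderedDict()


def _cache_token_count(cache_key: int, count: int) -> None:
    # Insert or refresh the entry, then evict the oldest one once over the cap.
    _token_cache[cache_key] = count
    _token_cache.move_to_end(cache_key)
    if len(_token_cache) > _MAX_CACHE_ENTRIES:
        _token_cache.popitem(last=False)
```

For true LRU behavior, cache hits in _count_tokens would also need to call move_to_end before returning.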

