Parallel web_search calls hang in Amplifier's async tool executor #219

@Joi

Summary

When the LLM fires multiple web_search tool calls in parallel (e.g., 9-10 simultaneous searches), all of them hang indefinitely: no results return, no timeout fires, and the session dies. The hang recurs when the session is resumed.

Affected module: microsoft/amplifier-module-tool-web

Environment

  • macOS (Apple Silicon)
  • Python 3.11
  • ddgs 9.10.0 + primp 1.0.0
  • Amplifier CLI latest (as of 2026-02-15)

Root Cause Analysis

The WebSearchTool._real_search() method at __init__.py:103-123 has three issues:

  1. No concurrency limit: Each search creates a new DDGS() instance (which internally creates a primp Rust HTTP client) and dispatches it to the default ThreadPoolExecutor via run_in_executor(None, ...). When 9-10 run simultaneously, they appear to deadlock — likely due to thread pool starvation or contention in primp's Rust runtime.

  2. No timeout: If run_in_executor hangs, it hangs forever. There's no asyncio.wait_for() wrapper, so the session becomes permanently stuck.

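  3. Discouraged API: Uses asyncio.get_event_loop() instead of asyncio.get_running_loop(). Inside a coroutine, get_running_loop() is the recommended call; the implicit loop handling in get_event_loop() is deprecated in recent Python versions.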
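
To illustrate the starvation hypothesis from item 1 (this is not a confirmed diagnosis of the Amplifier hang), here is a minimal sketch. It assumes a hypothetical two-worker pool in place of the default executor and a threading.Event in place of a stuck primp call; demo_starvation is a made-up name, not something from the module.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


async def demo_starvation() -> str:
    """Show a queued job timing out while every worker thread is blocked."""
    release = threading.Event()
    # Hypothetical tiny pool standing in for the default executor.
    pool = ThreadPoolExecutor(max_workers=2)
    loop = asyncio.get_running_loop()

    # Two jobs occupy both workers until `release` is set.
    blockers = [loop.run_in_executor(pool, release.wait) for _ in range(2)]

    try:
        # The third job is queued behind the blockers and cannot start.
        await asyncio.wait_for(
            loop.run_in_executor(pool, lambda: "done"), timeout=0.2
        )
        outcome = "completed"
    except asyncio.TimeoutError:
        outcome = "starved"
    finally:
        release.set()  # unblock the workers so the pool can shut down
        await asyncio.gather(*blockers)
        pool.shutdown(wait=True)
    return outcome


result = asyncio.run(demo_starvation())
print(result)
```

Without the wait_for() wrapper, the third await would simply never return while the blockers hold the workers, which matches the "tool:pre fires, tool:post never does" symptom.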

Reproduction

  • Have an LLM session make 9+ parallel web_search tool calls (e.g., looking up phone numbers for multiple businesses)
  • The tool:pre events fire but zero tool:post events are recorded
  • Session becomes unresponsive

Note: The same 9 parallel searches work fine in a standalone asyncio.run() test script. The hang is specific to Amplifier's runtime event loop / thread pool context.
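
For anyone who wants to repeat the standalone comparison, a script of roughly this shape should do. It is a sketch only: fake_search is a hypothetical stand-in that sleeps instead of calling DDGS, so it runs without network access. Swap in the real blocking search call to compare against the Amplifier runtime.

```python
import asyncio
import time
from typing import List


def fake_search(query: str) -> List[str]:
    """Hypothetical stand-in for the blocking DDGS call; sleeps briefly."""
    time.sleep(0.05)
    return [f"result for {query}"]


async def run_parallel(queries: List[str]) -> List[List[str]]:
    """Dispatch every query to the default executor at once, as the tool does."""
    loop = asyncio.get_running_loop()
    futures = [loop.run_in_executor(None, fake_search, q) for q in queries]
    return await asyncio.gather(*futures)


queries = [f"business {i} phone number" for i in range(9)]
results = asyncio.run(run_parallel(queries))
print(len(results))
```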

Suggested Fix

import asyncio
import logging

from ddgs import DDGS

logger = logging.getLogger(__name__)


class WebSearchTool:
    # Class-level, so the concurrency cap is shared by all instances.
    _search_semaphore: asyncio.Semaphore | None = None
    _SEARCH_TIMEOUT = 30  # seconds
    _MAX_CONCURRENT = 3

    def __init__(self, config):
        ...
        if WebSearchTool._search_semaphore is None:
            WebSearchTool._search_semaphore = asyncio.Semaphore(self._MAX_CONCURRENT)

    async def _real_search(self, query):
        try:
            def search_sync():
                ddgs = DDGS()
                return [...]

            # Hold a semaphore slot for the duration of the search.
            async with self._search_semaphore:
                loop = asyncio.get_running_loop()
                results = await asyncio.wait_for(
                    loop.run_in_executor(None, search_sync),
                    timeout=self._SEARCH_TIMEOUT,
                )
            return results

        # asyncio.TimeoutError is the builtin TimeoutError on Python 3.11+
        # and is also what wait_for() raises on older versions.
        except asyncio.TimeoutError:
            logger.warning(f"DuckDuckGo search timed out after {self._SEARCH_TIMEOUT}s: {query}")
            return await self._mock_search(query)
        except Exception as e:
            logger.warning(f"DuckDuckGo search failed: {e}, falling back to mock")
            return await self._mock_search(query)

Key changes:

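  • Semaphore (max 3 concurrent) caps the number of in-flight searches, which should avoid the suspected thread pool starvation
  • asyncio.wait_for() with a 30s timeout keeps the session from hanging forever and falls back to mock. Note that the timeout only unblocks the awaiting coroutine; it cannot interrupt the worker thread, which keeps its pool slot until the underlying call returns
  • asyncio.get_running_loop() replaces get_event_loop()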
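
A self-contained sketch of the semaphore-plus-timeout pattern, for checking the behaviour outside Amplifier. Everything here is illustrative: blocking_search is a hypothetical stand-in for the DDGS call, guarded_search is a made-up name, and the timeout is shortened to 0.5s so the demo finishes quickly.

```python
import asyncio
import threading
import time
from typing import List

MAX_CONCURRENT = 3    # mirrors _MAX_CONCURRENT in the suggested fix
SEARCH_TIMEOUT = 0.5  # shortened from 30 s so the demo finishes quickly

_lock = threading.Lock()
_active = 0
peak_active = 0


def blocking_search(query: str, delay: float) -> str:
    """Hypothetical stand-in for the DDGS call; records peak concurrency."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    try:
        time.sleep(delay)
        return f"real:{query}"
    finally:
        with _lock:
            _active -= 1


async def guarded_search(sem: asyncio.Semaphore, query: str, delay: float) -> str:
    loop = asyncio.get_running_loop()
    try:
        async with sem:
            return await asyncio.wait_for(
                loop.run_in_executor(None, blocking_search, query, delay),
                timeout=SEARCH_TIMEOUT,
            )
    except asyncio.TimeoutError:
        # The worker thread keeps running; only the awaiting coroutine gives up.
        return f"mock:{query}"


async def main() -> List[str]:
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    # Nine fast searches plus one that outlasts the timeout.
    jobs = [guarded_search(sem, f"q{i}", 0.05) for i in range(9)]
    jobs.append(guarded_search(sem, "slow", 1.0))
    return await asyncio.gather(*jobs)


results = asyncio.run(main())
print(results[-1], peak_active)
```

One design caveat: because a timed-out search releases its semaphore slot while its thread is still busy, repeated timeouts could still fill the default executor. If that turns out to matter, a dedicated ThreadPoolExecutor for searches might be worth considering.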

Additional Note: primp/ddgs version mismatch

There's also a cosmetic issue: ddgs 9.10.0 requests browser impersonation profiles (e.g., chrome_126) that primp 1.0.0 doesn't recognize, producing Impersonate 'chrome_126' does not exist, using 'random' warnings. The random fallback works fine — this is noisy but not the cause of the hang. The uv.lock pins ddgs==9.9.2 + primp==0.15.0 (compatible pair) but the installed versions have drifted.
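
A quick way to confirm the drift on a given machine, sketched with the standard library's importlib.metadata. The pinned versions are the ones quoted above; find_drift and installed_version are hypothetical helper names, not part of Amplifier.

```python
from importlib import metadata
from typing import Callable, Dict, Optional, Tuple

# Pins quoted from the uv.lock description above.
PINNED = {"ddgs": "9.9.2", "primp": "0.15.0"}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_drift(
    pinned: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each drifted package to (pinned version, installed version)."""
    drift = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            drift[name] = (wanted, found)
    return drift


for name, (wanted, found) in find_drift(PINNED).items():
    print(f"{name}: pinned {wanted}, installed {found}")
```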


🤖 Generated with Amplifier
