Summary
In standalone interactive mode (openspace without --query), the first user request can appear to hang for a long time with no visible output. In some runs, the process then exits after cleanup and prints Goodbye!, which looks like an unexpected shutdown rather than a handled task failure.
This is not a single root cause. There are two issues layered together:
- The first-request path performs LLM-based skill selection before normal task execution, and this step can be cancelled or take too long in some environments.
- When that early phase fails, tool_layer.execute() can throw a secondary exception during cleanup/final result assembly, which hides the original failure.
Environment
- Project: OpenSpace
- Mode: standalone CLI, interactive mode
- Launch command: openspace (interactive mode, no --query)
- The user then enters a first prompt (a short usage question, as shown in the logs below)
- Local model config in openspace/.env used an OpenAI-compatible Zhipu endpoint:
OPENSPACE_MODEL=openai/glm-5.1
OPENSPACE_LLM_API_BASE=https://open.bigmodel.cn/api/coding/paas/v4
OPENSPACE_LLM_API_KEY=...
OPENSPACE_WORKSPACE=/Users/javazhao/OpenSpace
OPENSPACE_BACKEND_SCOPE=shell,mcp,system
Expected behavior
- The first prompt should either:
- respond normally, or
- fail fast with a clear error message
- The CLI should not appear stuck with no feedback for an extended period
- Cleanup should not mask the original exception
Actual behavior
- After entering the first prompt, the CLI can sit in a running state for a long time with no visible answer
- In failure cases, the original error is obscured by a secondary exception in cleanup/finalization
- The final Goodbye! message makes the shutdown look intentional, even though the task path failed
Relevant logs
Example failure sequence from interactive mode:
- skill selection starts before task execution
- LLM request stalls/cancels
- cleanup path throws secondary exception
Relevant stack trace:
asyncio.exceptions.CancelledError
...
File "openspace/skill_engine/registry.py", line 466, in select_skills_with_llm
resp = await llm_client.complete(prompt, **llm_kwargs)
...
File "openspace/tool_layer.py", line 584, in execute
"execution_time": execution_time,
UnboundLocalError: cannot access local variable 'execution_time' where it is not associated with a value
Relevant log behavior:
UI monitoring started
Task: 如何使用 opensapce... ("how to use opensapce...")
...
Error: cannot access local variable 'execution_time' where it is not associated with a value
Root cause analysis
1. First-request skill selection is on the critical path
OpenSpace.execute() runs _select_and_inject_skills() before the main task execution path when a skill registry is present.
That means the very first user prompt blocks on LLM-based skill selection before the user sees a real answer.
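The ordering can be illustrated with a minimal sketch (function names are simplified placeholders, not the actual OpenSpace.execute() body):

```python
import asyncio

async def _select_and_inject_skills(prompt: str) -> list:
    # Stand-in for the LLM-based skill selection call; in the failing
    # runs, this is where the first request stalls.
    await asyncio.sleep(0.01)
    return ["demo-skill"]

async def execute(prompt: str) -> str:
    # Skill selection runs BEFORE the main task path, so the very first
    # prompt blocks here with no visible output to the user.
    skills = await _select_and_inject_skills(prompt)
    return f"answer (skills: {skills})"

print(asyncio.run(execute("how do I use openspace?")))
```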
Relevant code:
openspace/tool_layer.py
openspace/skill_engine/registry.py
2. Cancellation/timeout in skill selection is not handled explicitly
select_skills_with_llm() catches Exception, but the failure observed here is asyncio.CancelledError.
In practice, this can escape the normal "selection failed, proceed without skills" path depending on runtime behavior.
Relevant code:
resp = await llm_client.complete(prompt, **llm_kwargs)
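The gap can be demonstrated in isolation: since Python 3.8, asyncio.CancelledError derives from BaseException, so an `except Exception` handler around the awaited call does not absorb a cancellation (illustrative sketch, not the registry code):

```python
import asyncio

async def select_skills():
    try:
        await asyncio.sleep(10)      # stands in for the LLM request
    except Exception:
        return []                    # intended "proceed without skills" path

async def main() -> str:
    task = asyncio.create_task(select_skills())
    await asyncio.sleep(0)           # let the task reach the await
    task.cancel()
    try:
        await task
        return "handled"
    except asyncio.CancelledError:
        return "escaped"             # cancellation bypassed `except Exception`

print(asyncio.run(main()))           # → escaped
```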
3. Cleanup/final result path can hide the original error
In tool_layer.execute(), execution_time and result are used later in the function. If an exception is raised early enough, the cleanup/final result path can raise a second exception, which hides the original cause from the user.
Observed secondary exception:
UnboundLocalError: cannot access local variable 'execution_time'
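The masking pattern itself is easy to reproduce in isolation (hypothetical code, not the actual tool_layer.execute() body):

```python
def execute(task):
    try:
        raise ValueError("original failure")   # early exception
        execution_time = 1.0                   # never reached
    finally:
        # execution_time was never bound, so this raises
        # UnboundLocalError and replaces the ValueError the
        # user actually needed to see.
        summary = {"execution_time": execution_time}

try:
    execute(None)
except Exception as exc:
    print(type(exc).__name__)                  # → UnboundLocalError
```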
Why this is confusing for users
- The terminal appears hung because the first visible work is a hidden skill-selection LLM call
- The user gets little or no progress feedback in interactive mode
- The final Goodbye! output suggests a normal exit, even though the real issue happened earlier
Suggested fixes
Fix 1: make skill selection fail fast
Use a dedicated short-timeout LLM client for skill selection instead of reusing the main task LLM client.
This keeps the first prompt responsive.
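One way to sketch this with plain asyncio.wait_for (the default timeout and function names are assumptions, not OpenSpace's API):

```python
import asyncio

async def select_skills_fast(llm_call, prompt, timeout: float = 5.0):
    """Run skill selection with a hard deadline; fall back to no skills."""
    try:
        return await asyncio.wait_for(llm_call(prompt), timeout)
    except asyncio.TimeoutError:
        # Fail fast: the first prompt stays responsive even if the
        # skill-selection LLM call stalls.
        return []

async def demo():
    async def stalled(prompt):
        await asyncio.sleep(60)      # simulates a hung LLM request
    return await select_skills_fast(stalled, "hi", timeout=0.05)

print(asyncio.run(demo()))           # → []
```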
Fix 2: explicitly handle asyncio.CancelledError
Treat cancellation/timeout in skill selection as:
- log a warning
- skip skills
- continue with tool-only execution
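A hedged sketch of that policy (illustrative names, not OpenSpace internals; note that swallowing CancelledError is only safe when the cancellation is local, e.g. from a timeout, and not when the enclosing task itself is being torn down):

```python
import asyncio
import logging

logger = logging.getLogger("skill_selection")

async def select_skills_safe(llm_call, prompt):
    try:
        return await llm_call(prompt)
    except asyncio.CancelledError:
        logger.warning("skill selection cancelled; continuing without skills")
        return []
    except Exception as exc:
        logger.warning("skill selection failed (%s); continuing without skills", exc)
        return []

async def demo():
    async def hung(prompt):
        await asyncio.sleep(60)      # simulates a stalled LLM request
    task = asyncio.create_task(select_skills_safe(hung, "hi"))
    await asyncio.sleep(0)
    task.cancel()                    # cancellation is absorbed, not fatal
    return await task

print(asyncio.run(demo()))           # → []
```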
Fix 3: stabilize cleanup/final result assembly
Initialize result and execution_time before the main try block so cleanup never throws a secondary exception.
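A minimal sketch of the pre-binding pattern (run_task is a hypothetical stand-in for the real task runner):

```python
import time

def run_task(task):
    # Stand-in for the real task runner; raises to simulate an early failure.
    raise ValueError("original failure")

def execute(task):
    # Pre-bind everything the finalization path touches, so cleanup can
    # never raise UnboundLocalError and mask the original error.
    result = None
    execution_time = 0.0
    start = time.monotonic()
    try:
        result = run_task(task)
    except Exception as exc:
        result = {"error": str(exc)}     # surface the ORIGINAL failure
    finally:
        execution_time = time.monotonic() - start
    return {"result": result, "execution_time": execution_time}

print(execute("demo")["result"])         # → {'error': 'original failure'}
```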
Fix 4: improve interactive-mode feedback
Before skill selection starts, print a short status such as:
Selecting relevant skills...
If selection fails:
Skill selection failed, continuing without skills.
Minimal reproduction
- Configure standalone OpenSpace with a real LLM endpoint in openspace/.env
- Launch openspace in interactive mode (no --query)
- Enter any first prompt
- Observe long no-output period on the first prompt
- In failure runs, observe cleanup exit and secondary exception instead of a clear task failure
Additional note
This issue is distinct from provider/model routing issues.
For example, a separate configuration issue exists when using a bare model name like glm-5.1 with LiteLLM against an OpenAI-compatible endpoint. That problem is about provider-qualified model naming. This issue is about interactive-mode robustness and error handling after startup succeeds.
Affected files
- openspace/tool_layer.py
- openspace/skill_engine/registry.py