
Bug: interactive mode can appear hung on first prompt, then exit after cleanup with misleading Goodbye #54

@zhaojava42-dot

Description


Summary

In standalone interactive mode (openspace without --query), the first user request can appear to hang for a long time with no visible output. In some runs, the process then exits after cleanup and prints Goodbye!, which looks like an unexpected shutdown rather than a handled task failure.

There is no single root cause; two issues are layered together:

  1. The first-request path performs LLM-based skill selection before normal task execution, and this step can be cancelled or take too long in some environments.
  2. When that early phase fails, tool_layer.execute() can throw a secondary exception during cleanup/final result assembly, which hides the original failure.

Environment

  • Project: OpenSpace
  • Mode: standalone CLI, interactive mode
  • Launch command:
openspace
  • User then enters a first prompt such as:
如何使用 opensapce ("How to use opensapce" in English; the misspelling of "openspace" is part of the original prompt)
  • Local model config in openspace/.env used an OpenAI-compatible Zhipu endpoint:
OPENSPACE_MODEL=openai/glm-5.1
OPENSPACE_LLM_API_BASE=https://open.bigmodel.cn/api/coding/paas/v4
OPENSPACE_LLM_API_KEY=...
OPENSPACE_WORKSPACE=/Users/javazhao/OpenSpace
OPENSPACE_BACKEND_SCOPE=shell,mcp,system

Expected behavior

  • The first prompt should either:
    • respond normally, or
    • fail fast with a clear error message
  • The CLI should not appear stuck with no feedback for an extended period
  • Cleanup should not mask the original exception

Actual behavior

  • After entering the first prompt, the CLI can sit in a running state for a long time with no visible answer
  • In failure cases, the original error is obscured by a secondary exception in cleanup/finalization
  • The final Goodbye! message makes the shutdown look intentional, even though the task path failed

Relevant logs

Example failure sequence from interactive mode:

  • skill selection starts before task execution
  • LLM request stalls/cancels
  • cleanup path throws secondary exception

Relevant stack trace:

asyncio.exceptions.CancelledError
...
File "openspace/skill_engine/registry.py", line 466, in select_skills_with_llm
    resp = await llm_client.complete(prompt, **llm_kwargs)
...
File "openspace/tool_layer.py", line 584, in execute
    "execution_time": execution_time,
UnboundLocalError: cannot access local variable 'execution_time' where it is not associated with a value

Relevant log behavior:

UI monitoring started
Task: 如何使用 opensapce...
...
Error: cannot access local variable 'execution_time' where it is not associated with a value

Root cause analysis

1. First-request skill selection is on the critical path

OpenSpace.execute() runs _select_and_inject_skills() before the main task execution path when a skill registry is present.

That means the very first user prompt blocks on LLM-based skill selection before the user sees a real answer.

Relevant code:

  • openspace/tool_layer.py
  • openspace/skill_engine/registry.py

2. Cancellation/timeout in skill selection is not handled explicitly

select_skills_with_llm() catches Exception, but the failure observed here is asyncio.CancelledError.

Since Python 3.8, asyncio.CancelledError subclasses BaseException rather than Exception, so a plain `except Exception` handler never catches it. The cancellation therefore escapes the normal "selection failed, proceed without skills" path.

Relevant code:

resp = await llm_client.complete(prompt, **llm_kwargs)
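The escape can be demonstrated in isolation. In this sketch (hypothetical names; `select_skills` stands in for `select_skills_with_llm`, and `asyncio.sleep` stands in for the stalled `llm_client.complete` call), the `except Exception` fallback never runs:

```python
import asyncio

async def select_skills():
    # Stand-in for select_skills_with_llm(): the await is cancelled
    # while pending, as in the reported stall.
    try:
        await asyncio.sleep(60)  # stands in for llm_client.complete(...)
    except Exception:
        return []  # the "selection failed, proceed without skills" path
    return ["some_skill"]

async def main():
    task = asyncio.create_task(select_skills())
    await asyncio.sleep(0)  # let the task start and reach its await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        # On Python 3.8+ CancelledError subclasses BaseException, so the
        # `except Exception` above never fires and cancellation escapes.
        return "escaped"
    return "handled"

print(asyncio.run(main()))  # → escaped
```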

3. Cleanup/final result path can hide the original error

In tool_layer.execute(), execution_time and result are used later in the function. If an exception is raised early enough, the cleanup/final result path can raise a second exception, which hides the original cause from the user.

Observed secondary exception:

UnboundLocalError: cannot access local variable 'execution_time'
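The masking pattern can be reduced to a few lines (a hypothetical simplification of `tool_layer.execute()`, not the actual code): `execution_time` is only bound on the success path, but finalization reads it unconditionally, so the secondary exception replaces the original one in the traceback shown to the user:

```python
import time

def execute(fail: bool):
    # Hypothetical simplification of tool_layer.execute().
    try:
        start = time.monotonic()
        if fail:
            raise RuntimeError("original failure")  # e.g. cancelled skill selection
        execution_time = time.monotonic() - start
    finally:
        # When fail=True this raises UnboundLocalError, which replaces
        # the RuntimeError above in the user-visible traceback.
        result = {"execution_time": execution_time}
    return result
```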

Why this is confusing for users

  • The terminal appears hung because the first visible work is a hidden skill-selection LLM call
  • The user gets little or no progress feedback in interactive mode
  • The final Goodbye! output suggests a normal exit, even though the real issue happened earlier

Suggested fixes

Fix 1: make skill selection fail fast

Use a dedicated short-timeout LLM client for skill selection instead of reusing the main task LLM client.

This keeps the first prompt responsive.
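One possible shape for this, assuming the `llm_client.complete()` call seen in the trace (the wrapper name and default timeout value are illustrative, not existing OpenSpace API):

```python
import asyncio

async def select_skills_with_timeout(llm_client, prompt, timeout=10.0, **llm_kwargs):
    """Run LLM-based skill selection under a hard deadline so the first
    interactive prompt cannot block indefinitely on a hidden call.
    Returns None on timeout, meaning 'proceed without skills'."""
    try:
        return await asyncio.wait_for(
            llm_client.complete(prompt, **llm_kwargs),
            timeout=timeout,
        )
    except asyncio.TimeoutError:
        # asyncio.wait_for cancels the inner call on timeout, so the
        # stalled request does not linger in the background.
        return None
```

A dedicated client instance with its own shorter connect/read timeouts would serve the same goal; the key point is that the selection deadline is much shorter than the main task's.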

Fix 2: explicitly handle asyncio.CancelledError

Treat cancellation/timeout in skill selection as:

  • log warning
  • skip skills
  • continue with tool-only execution
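A sketch of that policy (the wrapper name is hypothetical; note that in production code you may want to re-raise if the enclosing task itself is being cancelled, since swallowing an externally delivered CancelledError keeps a cancelled task alive):

```python
import asyncio
import logging

logger = logging.getLogger("openspace.skill_engine")

async def select_skills_safely(select_coro):
    """Await a skill-selection coroutine, degrading cancellation and
    errors to 'no skills' instead of letting them escape into cleanup."""
    try:
        return await select_coro
    except asyncio.CancelledError:
        logger.warning("Skill selection cancelled; continuing without skills.")
        return []
    except Exception as exc:
        logger.warning("Skill selection failed (%s); continuing without skills.", exc)
        return []
```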

Fix 3: stabilize cleanup/final result assembly

Initialize result and execution_time before the main try block so cleanup never throws a secondary exception.
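A sketch of the stabilized shape (simplified; variable names follow the trace). With defensive initialization, the finalization path can never throw, so the original exception surfaces:

```python
import time

def execute(fail: bool):
    # result and execution_time exist before any code that can raise,
    # so the finally block can never hit UnboundLocalError.
    result = {"status": "error"}
    execution_time = 0.0
    start = time.monotonic()
    try:
        if fail:
            raise RuntimeError("original failure")  # stands in for the real task
        result = {"status": "ok"}
    finally:
        execution_time = time.monotonic() - start
        result["execution_time"] = execution_time
    return result
```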

Fix 4: improve interactive-mode feedback

Before skill selection starts, print a short status such as:

Selecting relevant skills...

If selection fails:

Skill selection failed, continuing without skills.

Minimal reproduction

  1. Configure standalone OpenSpace with a real LLM endpoint in openspace/.env
  2. Launch:
openspace
  3. Enter:
如何使用 opensapce
  4. Observe the long no-output period on the first prompt
  5. In failure runs, observe the cleanup exit and secondary exception instead of a clear task failure

Additional note

This issue is distinct from provider/model routing issues.

For example, a separate configuration issue exists when using a bare model name like glm-5.1 with LiteLLM against an OpenAI-compatible endpoint. That problem is about provider-qualified model naming. This issue is about interactive-mode robustness and error handling after startup succeeds.

Affected files

  • openspace/tool_layer.py
  • openspace/skill_engine/registry.py