model_context_window_exceeded is treated as retriable transient error

## Bug Description

When a worker hits the model's context window limit (`model_context_window_exceeded`), Spacebot treats it as a transient API error and retries with exponential backoff (5, 10, 20, 40, 80 seconds). Since the context is the same size on each retry, every attempt fails, wasting about 155 seconds (roughly 2.5 minutes) of backoff and consuming API credits.

## Reproduction

1. Spawn a worker that performs commands with large output (e.g., journalctl with hundreds of lines)
2. Worker accumulates enough tool results to exceed the model context window
3. The API returns `stop_reason: model_context_window_exceeded` with empty content
4. Spacebot retries 5 times with the same bloated context, failing each time

## Expected Behavior

- `model_context_window_exceeded` should be detected as **non-retriable** — this is a context length problem, not a transient API error
- On context overflow, either: (a) truncate/summarize conversation history and retry once, or (b) fail fast
- No exponential backoff for a condition that won't resolve by waiting
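
A minimal sketch of the classification described above, assuming the worker can see the provider's `stop_reason` string. `FailureKind`, `classify_stop_reason`, and `should_retry_with_backoff` are hypothetical names for illustration, not Spacebot's actual API, and treating every other stop reason as transient is a simplification that stands in for the existing handling:

```rust
/// How the worker should react to a failed or empty completion.
/// Hypothetical type, not taken from the Spacebot codebase.
#[derive(Debug, PartialEq, Eq)]
enum FailureKind {
    /// Waiting may help (rate limits, overload, network hiccups).
    Transient,
    /// The request itself is too large; retrying it unchanged cannot succeed.
    ContextOverflow,
}

/// Classify a failure by the provider's `stop_reason` string.
/// Assumption: anything other than the overflow marker keeps the
/// existing transient handling.
fn classify_stop_reason(stop_reason: &str) -> FailureKind {
    match stop_reason {
        "model_context_window_exceeded" => FailureKind::ContextOverflow,
        _ => FailureKind::Transient,
    }
}

/// Only transient failures should enter the exponential backoff loop.
fn should_retry_with_backoff(kind: &FailureKind) -> bool {
    matches!(kind, FailureKind::Transient)
}
```

With this split, a `ContextOverflow` result can either fail fast or trigger a single truncate-and-retry, without ever entering the backoff loop.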

## Actual Behavior

- Treated as transient error, retried 5 times with delays of 5, 10, 20, 40, 80 seconds
- All retries fail with the same error
- Total wasted time: ~155 seconds
- Total wasted API calls: 5
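
The ~155 second figure follows from the delays above. A small sketch of the arithmetic, assuming a base delay of 5 seconds that doubles on each attempt (`backoff_delays` is a hypothetical helper, not Spacebot's code):

```rust
/// Delays for a doubling backoff: base, 2*base, 4*base, ...
/// Hypothetical helper that reproduces the schedule listed in this issue.
fn backoff_delays(base_secs: u64, retries: u32) -> Vec<u64> {
    (0..retries)
        .map(|attempt| base_secs * 2u64.pow(attempt))
        .collect()
}

/// Total time spent sleeping across all retries.
fn total_backoff_secs(base_secs: u64, retries: u32) -> u64 {
    backoff_delays(base_secs, retries).iter().sum()
}
```

For `base_secs = 5` and `retries = 5`, the delays are 5, 10, 20, 40, 80, which sum to 155 seconds.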

## Log Evidence

```
WARN spacebot::llm::model: unexpected empty assistant_content from Anthropic stop_reason="model_context_window_exceeded"
WARN spacebot::agent::worker: transient provider error, backing off and retrying attempt=1..5 delay_secs=5..80
ERROR spacebot::agent::worker: worker transient error retries exhausted retries=5
```

## Impact

- Wastes API credits on guaranteed-to-fail retries
- Adds about 2.5 minutes (~155 seconds) of delay before the user gets the failure
- Can cascade: the main agent then tries to retrigger with the same bloated context

## Suggested Fix

1. Match on `stop_reason == "model_context_window_exceeded"` and classify as non-retriable
2. Optionally: implement automatic context truncation (summarize oldest messages) and retry once
3. Alternatively: add a `max_input_tokens` check before API calls to proactively prevent overflow
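
Options 2 and 3 could be combined roughly as follows. This is a sketch under stated assumptions: it drops the oldest messages rather than summarizing them, and it estimates tokens with a crude 4-characters-per-token heuristic instead of the provider's real tokenizer. `estimate_tokens` and `truncate_oldest` are hypothetical names, and messages are simplified to plain strings:

```rust
/// Rough token estimate, assuming ~4 characters per token.
/// This is a heuristic only; a real check would use the model's tokenizer.
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Drop the oldest messages until the estimated total fits within
/// `max_input_tokens`. Always keeps at least the most recent message.
/// Returns how many messages were dropped.
fn truncate_oldest(messages: &mut Vec<String>, max_input_tokens: usize) -> usize {
    let mut total: usize = messages.iter().map(|m| estimate_tokens(m)).sum();
    let mut dropped = 0;
    while total > max_input_tokens && messages.len() > 1 {
        // Remove from the front: index 0 is the oldest message.
        let removed = messages.remove(0);
        total -= estimate_tokens(&removed);
        dropped += 1;
    }
    dropped
}
```

Running this before each API call would act as the proactive `max_input_tokens` check. Running it once after a `model_context_window_exceeded` response would give the single truncate-and-retry path. A real implementation would also need to keep tool calls paired with their results when dropping messages.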

## Environment

- Spacebot version: 0.3.3
- Model: zai_anthropic/glm-5-turbo
- Observed when workers run journalctl with large output

model_context_window_exceeded is treated as retriable transient error #503