feat(restart): add automatic restart policies for crash recovery #436

Draft
uran0sH wants to merge 1 commit into boxlite-ai:main from uran0sH:restart-dev

@uran0sH commented Apr 9, 2026

Implement restart policies (No, Always, OnFailure, UnlessStopped) with exponential backoff and crash detection via health check.
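
To make the policy semantics concrete, here is a minimal sketch of how the four policies and the backoff could be evaluated. Only the names `RestartPolicy`, `StopCause`, and the four variants come from this PR; the `StopCause` variants, the `at_startup` flag, the functions `should_restart` and `backoff_delay`, and the Docker-like semantics are assumptions for illustration and may not match the actual implementation.

```rust
use std::time::Duration;

/// Variant names follow the PR description; the semantics below are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RestartPolicy {
    No,
    Always,
    OnFailure,
    UnlessStopped,
}

/// Hypothetical simplification of the PR's `StopCause`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StopCause {
    UserRequested,
    Exited { code: i32 },
    Crashed,
}

/// Decide whether a stopped box should be restarted.
/// `at_startup` is true when evaluating persisted state during startup
/// recovery. Assumption: a user-requested stop is only overridden by
/// `Always`, and only at startup (Docker-like behavior).
pub fn should_restart(policy: RestartPolicy, cause: StopCause, at_startup: bool) -> bool {
    match policy {
        RestartPolicy::No => false,
        RestartPolicy::OnFailure => match cause {
            StopCause::Crashed => true,
            StopCause::Exited { code } => code != 0,
            StopCause::UserRequested => false,
        },
        RestartPolicy::Always => cause != StopCause::UserRequested || at_startup,
        RestartPolicy::UnlessStopped => cause != StopCause::UserRequested,
    }
}

/// Exponential backoff: base * 2^attempt, clamped to `cap`.
/// Shifts and multiplications saturate instead of overflowing.
pub fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    base.saturating_mul(factor).min(cap)
}
```

With a hypothetical base of 1 second and a cap of 60 seconds, attempts 0, 1, 2, 3 would wait 1, 2, 4, and 8 seconds, and every attempt from 6 onward would wait the full 60 seconds.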

Key changes:

  • Add RestartPolicy enum, StopCause, StopInfo to state model
  • Crash handler task (per-Runtime) evaluates policy on shim death
  • Startup recovery evaluates persisted crash state on Runtime::new()
  • pending_crashes HashSet prevents duplicate concurrent crash handling
  • Health check only detects and notifies (no state mutation)
  • Per-box file lock ensures mutual exclusion (callers must hold it before calling restart())
  • Auto-enable health check when restart_policy is set without health_check
  • SDK changes (Python, Node, C) for new status strings and tokio runtime context
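
As an illustration of the pending_crashes idea from the list above, the following sketch shows one way a set can deduplicate concurrent crash notifications for the same box. The `CrashTracker` type and its `try_begin` and `finish` methods are hypothetical names, and the use of a `std::sync::Mutex` with `String` box IDs is an assumption; the PR may structure this differently.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical wrapper around a `pending_crashes` set keyed by box ID.
#[derive(Default)]
pub struct CrashTracker {
    pending_crashes: Mutex<HashSet<String>>,
}

impl CrashTracker {
    pub fn new() -> Self {
        Self::default()
    }

    /// Claim the crash for `box_id`. Returns true only for the first caller;
    /// later callers see false until `finish` is called, because
    /// `HashSet::insert` returns false when the value is already present.
    pub fn try_begin(&self, box_id: &str) -> bool {
        self.pending_crashes
            .lock()
            .expect("pending_crashes mutex poisoned")
            .insert(box_id.to_owned())
    }

    /// Release the claim once the crash has been handled (restarted or not).
    pub fn finish(&self, box_id: &str) {
        self.pending_crashes
            .lock()
            .expect("pending_crashes mutex poisoned")
            .remove(box_id);
    }
}
```

In this sketch a health check can notify as often as it likes. Only the caller that gets `true` from `try_begin` goes on to evaluate the restart policy, and it calls `finish` afterwards so that a later crash of the same box is handled again.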

issue: #32

@uran0sH marked this pull request as draft April 10, 2026 01:47
@uran0sH force-pushed the restart-dev branch 2 times, most recently from 4147452 to a61d047 April 10, 2026 06:17

Signed-off-by: Wenyu Huang <huangwenyuu@outlook.com>