fix(lifeline): self-heal Telegram.Lifeline degradations on successful send #241

Open
brianjones-v4n wants to merge 1 commit into JKHeadley:main from brianjones-v4n:fix/lifeline-self-heal

@brianjones-v4n
Contributor

Summary

When the agent successfully sends a message to its configured lifeline topic, that's positive proof the lifeline is reachable. Today, Telegram.Lifeline degradation events are reported on first failure but never cleared automatically, so they sit stale even after the lifeline recovers.

This adds a self-heal pathway:

  • DegradationReporter gains a public clearByFeature(feature: string): number method that removes events in-memory and from degradations.json on disk.
  • TelegramAdapter.apiCall calls clearByFeature('Telegram.Lifeline') after a successful sendMessage/sendChatAction to lifelineTopicId.
  • Drive-by fix in DegradationReporter.render: dedupe leading "Using" so a fallback string that already starts with "Using" doesn't render as "Using Using ...".
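
To make the self-heal pathway above concrete, here is a minimal sketch rather than the code in this PR. The event shape, the `SketchDegradationReporter` class, and the `afterSuccessfulCall` helper are all hypothetical stand-ins; only the `clearByFeature(feature: string): number` signature, the `'Telegram.Lifeline'` feature name, the `sendMessage`/`sendChatAction` methods, and the `degradations.json` file name come from the description.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical event shape; the real DegradationReporter fields are not shown in this PR.
interface DegradationEvent {
  feature: string;
  fallback: string;
}

// Illustrative stand-in for DegradationReporter, not the project's class.
class SketchDegradationReporter {
  private events: DegradationEvent[] = [];
  private readonly filePath: string;

  constructor(filePath: string) {
    this.filePath = filePath;
    if (fs.existsSync(filePath)) {
      this.events = JSON.parse(fs.readFileSync(filePath, "utf8")) as DegradationEvent[];
    }
  }

  report(event: DegradationEvent): void {
    this.events.push(event);
    this.persist();
  }

  // Removes matching events in memory and on disk; returns how many were cleared.
  clearByFeature(feature: string): number {
    const before = this.events.length;
    this.events = this.events.filter((e) => e.feature !== feature);
    const cleared = before - this.events.length;
    if (cleared > 0) this.persist(); // skip the disk write when nothing changed
    return cleared;
  }

  private persist(): void {
    fs.writeFileSync(this.filePath, JSON.stringify(this.events, null, 2));
  }
}

// Hypothetical hook, meant to run only after an API call has already succeeded.
// It clears lifeline degradations when the call was a send to the lifeline topic.
function afterSuccessfulCall(
  reporter: SketchDegradationReporter,
  method: string,
  topicId: number,
  lifelineTopicId: number,
): number {
  const isSend = method === "sendMessage" || method === "sendChatAction";
  if (!isSend || topicId !== lifelineTopicId) return 0;
  return reporter.clearByFeature("Telegram.Lifeline");
}

// Usage: report a lifeline degradation, then let a successful send clear it.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "degradations-"));
const demoReporter = new SketchDegradationReporter(path.join(demoDir, "degradations.json"));
demoReporter.report({ feature: "Telegram.Lifeline", fallback: "Using last known state" });
const clearedCount = afterSuccessfulCall(demoReporter, "sendMessage", 42, 42);
console.log(`cleared ${clearedCount} stale event(s)`);
```

Returning the cleared count lets callers and tests tell "nothing to heal" apart from "healed N events" without inspecting reporter state.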

Test plan

  • Unit tests assert clearByFeature exists, filters events, persists to disk
  • Test asserts TelegramAdapter.apiCall calls clearByFeature('Telegram.Lifeline') in its success path
  • Test asserts the render dedupe regex is present
  • npx tsc --noEmit — no new errors (9 pre-existing errors remain, all from missing optional deps: js-yaml, @scure/bip39, moltbridge, marked, AgentMdJobLoader implicit any)
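
For reference, the leading-"Using" dedupe that the render test checks for could be sketched as below. The actual regex and the real output format of DegradationReporter.render are not shown in this PR, so `renderFallback` and its pattern are assumptions for illustration only.

```typescript
// Hypothetical helper: prefixes a fallback description with "Using" exactly once.
function renderFallback(fallback: string): string {
  // Strip one leading "Using " (case-insensitive) so the prefix is not doubled.
  const withoutPrefix = fallback.replace(/^\s*using\s+/i, "");
  return `Using ${withoutPrefix}`;
}

// Both inputs render with a single "Using" prefix.
console.log(renderFallback("polling fallback"));
console.log(renderFallback("Using polling fallback"));
```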

🤖 Generated with Claude Code

… send

When the agent successfully sends a message to its configured lifeline
topic, that's positive proof the lifeline is reachable. Clear any
existing Telegram.Lifeline degradation events automatically rather
than leaving them stale.

- DegradationReporter: new public clearByFeature(feature) method that
  removes events in-memory and from disk, returning the count cleared
- DegradationReporter.render: dedupe leading "Using" so we don't render
  "Using Using ..." when a fallback string already starts with "Using"
- TelegramAdapter.apiCall: after a successful sendMessage/sendChatAction
  to the lifeline topic, call clearByFeature('Telegram.Lifeline')

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented May 16, 2026

@brianjones-v4n is attempting to deploy a commit to the sagemind Team on Vercel.

A member of the Team first needs to authorize it.

@brianjones-v4n
Contributor Author

CI failure note: this appears to be an unrelated flake.

The failing assertion is in tests/e2e/phase4-multi-machine-coordination.test.ts:851:

expected null to be truthy

…on machineB.claimManager.tryClaim('daily-sync'). This is a multi-machine claim-leadership race; this PR only touches DegradationReporter and TelegramAdapter, with no path through the claim manager or any shared multi-machine state.

Evidence it's a flake:

  • The same E2E suite passed on the immediately prior CI run (25973796885) and the one before that (25970080683), with no related code changes in between.
  • 2,089 of 2,090 E2E tests passed in this run; the single failure is in a known timing-sensitive claim test, not a deterministic assertion against my changes.

Would you mind re-running the failed job? Happy to push an empty commit to retrigger CI if that's easier on your end.
