Skip to content

fix(agent): prune excess images from history to prevent session deadlock#2613

Open
iceymoss wants to merge 3 commits into
charmbracelet:mainfrom
iceymoss:fix/chat-unusable-exceed-image-limit
Open

fix(agent): prune excess images from history to prevent session deadlock#2613
iceymoss wants to merge 3 commits into
charmbracelet:mainfrom
iceymoss:fix/chat-unusable-exceed-image-limit

Conversation

@iceymoss
Copy link
Copy Markdown
Contributor

@iceymoss iceymoss commented Apr 13, 2026

Summary

  • Fix session becoming permanently unusable when conversation image count exceeds the provider's per-request limit (e.g. Gemini's 10-image cap)
  • Add automatic oldest-first image pruning before sending requests to the provider
  • Add user-configurable max_images field in crush.json to override provider defaults

Problem

When using image-capable models with per-request image limits (like Google Gemini which caps at 10 images), the chat session becomes completely unusable once the conversation
accumulates more images than the limit allows.

Root cause: preparePrompt() in agent.go converts all historical messages — including every image ever sent — into the outgoing API request. No layer between the database
and the provider checks or limits the image count. Once exceeded, the provider returns an error like:

too many images: maximum allowed for model gemini-3.1-pro-preview is 10, got 11

The error is caught and displayed, but because the images are permanently stored in the database, every subsequent request (even text-only follow-ups) re-sends the same
oversized history and hits the same error — creating an unrecoverable loop.

Bug flow:
User sends message → →→ Load history (11 images) →→ → Send to Gemini →→ → "too many images" error
↑ ↓
└──────────── User sends another message ←───────────────────────────────┘

Solution

Add pruneExcessImages() which runs in PrepareStep (right after workaroundProviderMediaLimitations). It:

  1. Counts all image FileParts across the outgoing message history
  2. If the count exceeds the provider's limit, replaces the oldest images with a text placeholder [Earlier image removed to stay within model limits]
  3. Only modifies the sent history — persisted messages in the database are untouched

The image limit is resolved with a two-tier priority:

Priority Source Example
1st User config (max_images in crush.json) "max_images": 20
2nd Built-in provider default Gemini/VertexAI → 10, others → unlimited

This means users can override the default if their model supports more images:

{
  "models": {
    "large": {
      "model": "gemini-2.5-pro",
      "provider": "gemini",
      "max_images": 20
    }
  }
}

Changes

  • internal/agent/agent.go — Add pruneExcessImages(), countImagesInMessages(), isImageFilePart(), maxImagesForProvider() with defaultMaxImages map; call pruning in PrepareStep
  • internal/config/config.go — Add MaxImages field to SelectedModel struct (json:"max_images,omitempty")
  • internal/agent/prune_images_test.go — 25 new tests (see below)

Test plan

Unit tests (12) — core logic

  • maxImagesForProvider returns correct limits per provider
  • maxImagesForProvider respects user config override
  • isImageFilePart distinguishes image vs text file parts
  • countImagesInMessages counts correctly across messages
  • pruneExcessImages no-ops for unlimited providers
  • pruneExcessImages no-ops when under/at limit
  • pruneExcessImages removes oldest images first when over limit
  • pruneExcessImages preserves text files, message roles, and structure

Integration tests (3) — DB → preparePrompt → prune pipeline

  • 12 images in DB history → pruned to 10 after pipeline
  • 5 images in DB history → no pruning
  • 11 images + text-only follow-up → pruned, session stays usable

E2E test (1) — full sessionAgent.Run() with mock Gemini model

  • Mock LanguageModel enforces 10-image limit (returns error if exceeded)
  • 12 rounds of image messages → mock never receives >10 images
  • Text-only message after exceeding limit → succeeds (session not stuck)
  • All 26 messages persisted correctly in database

Regression

  • All existing agent tests pass
  • All config package tests pass
  • Full project go build ./... succeeds

Fixes #2604


  When a conversation accumulates more images than a provider's per-request
  limit (e.g. Gemini's 10-image cap), every subsequent request — including
  text-only follow-ups — fails with "too many images" because the full
  message history (with all historical images) is re-sent each time. This
  makes the session permanently unusable.

  Root cause: preparePrompt() converts all historical messages including
  their image attachments into the outgoing request, and no layer between
  the database and the provider checks or caps the image count. Once the
  limit is exceeded, the error persists in a loop because the offending
  images remain in the stored history.

  Fix: add pruneExcessImages() which runs in PrepareStep right after
  workaroundProviderMediaLimitations(). It counts image FileParts across
  the outgoing messages and, when the total exceeds the provider's known
  limit, replaces the oldest images with a text placeholder. Only the sent
  history is modified; persisted messages in the database are untouched.

  The limit is determined by:
  1. User-configured `max_images` in crush.json (takes priority), or
  2. A built-in provider default (Gemini/VertexAI: 10, others: unlimited)

  This lets users override the default when a model supports more (or
  fewer) images than the hard-coded fallback.

  Fixes charmbracelet#2604
@iceymoss iceymoss requested a review from a team as a code owner April 13, 2026 06:51
@iceymoss iceymoss requested review from aymanbagabas and meowgorithm and removed request for a team April 13, 2026 06:51
…ialect

  The CI runs tests with -race. Our DB pipeline and E2E tests call
  testEnv() which invokes goose.SetDialect(), a write to a package-level
  global. Running these tests in parallel triggers a data race.

  Also fix gofumpt formatting: add required blank lines between method
  declarations on geminiMockModel.
@iceymoss
Copy link
Copy Markdown
Contributor Author


A note on the impact of image pruning

Only the raw image data of the oldest images is removed from the sent history — the assistant's prior text responses (which describe what was in those images) are fully preserved.

Example: Suppose a conversation has 12 images and we need to prune the 2 oldest:

Before pruning (12 images — exceeds Gemini's 10-image limit)

User: [image-1] Can you check this error?
Assistant: I see a NullPointerException in Main.java at line 42... ← kept
User: [image-2] Here's the result after my fix
Assistant: The error is gone, but there's a new warning on line 55... ← kept
User: [image-3] ...
...
User: [image-12] Final version
Assistant: All issues are resolved. ← kept

After pruning (10 images — within limit)

User: [placeholder] Can you check this error?
Assistant: I see a NullPointerException in Main.java at line 42... ← kept
User: [placeholder] Here's the result after my fix
Assistant: The error is gone, but there's a new warning on line 55... ← kept
User: [image-3] ... ← kept as-is
...
User: [image-12] Final version ← kept as-is
Assistant: All issues are resolved. ← kept

The model loses the raw pixels of images 1–2, but retains its own earlier analysis of them in the assistant turns. For the vast majority of workflows this is sufficient — users
almost always care about the most recent images, not about re-examining old ones in detail.

The one edge case where this matters is if the user asks the model to re-inspect a pruned image (e.g. "Go back to screenshot 1 — what color is the button in the top-right
corner?"). The model can no longer see the original pixels and would have to rely on whatever it said before. In practice this is rare in coding-assistant workflows.

Worth noting: nothing is lost on disk — the database retains all original images. Only the copy sent to the provider is trimmed.


@iceymoss
Copy link
Copy Markdown
Contributor Author

Hi @meowgorithm @aymanbagabas @andreynering — gentle ping on this one when you get a chance 🙏

Flagging this mostly because the underlying bug makes sessions permanently
unrecoverable once the provider's image limit is hit (user has to discard the entire conversation to continue), so it's a fairly painful UX for
anyone on Gemini with image-heavy workflows.

Happy to rebase, split into smaller commits, reduce scope, or adjust the
approach if any of that would make review easier. No rush on timing — just
wanted to make sure it's on the radar.

Copy link
Copy Markdown
Member

@andreynering andreynering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @iceymoss,

I feel that max_images (or max_attachments maybe) should actually be a model field on Catwalk.

Then, overriding should also work automatically without extra code on the Crush side (in theory).

What do you think?

@iceymoss
Copy link
Copy Markdown
Contributor Author

Hi @iceymoss,

I feel that max_images (or max_attachments maybe) should actually be a model field on Catwalk.

Then, overriding should also work automatically without extra code on the Crush side (in theory).

What do you think?

Yes, yes, I also found this problem, which will make it more confusing

@iceymoss
Copy link
Copy Markdown
Contributor Author

@andreynering Good point, I agree that max_images (or max_attachments) belongs on the Catwalk Model struct rather than being a Crush-specific config field. That's the right architectural home since it's a model-level capability, similar to how ContextWindow and SupportsImages already live there.

Here's how I see the transition:

Current state (this PR):

  • defaultMaxImages is a hard-coded map in agent.go — fragile, requires a Crush release to update limits
  • MaxImages is a field on Crush's SelectedModel config — lives in the wrong layer, duplicates what Catwalk should own
  • maxImagesForProvider() bridges the two with a priority chain

Target state (after Catwalk adds the field):

  • Catwalk Model gets a MaxAttachments int field (per model, not per provider — since different models from the same provider can have different limits)
  • Catwalk's built-in model definitions already include the value (e.g. gemini-2.5-pro: MaxAttachments: 10)
  • Users override via the existing models config in crush.json (same way they override max_tokens or context_window today) — no new field needed on Crush's SelectedModel
  • maxImagesForProvider() and the defaultMaxImages map in Crush are deleted entirely
  • pruneExcessImages() simply reads the limit from the resolved Catwalk model

What I'd like to do:

  1. Keep the pruning logic (pruneExcessImages, countImagesInMessages, isImageFilePart) in this PR — that's the actual bug fix and it's Crush-specific logic regardless of where the limit comes from
  2. Remove the MaxImages field from SelectedModel and the defaultMaxImages map
  3. File an issue on Catwalk to add MaxAttachments (or MaxImages) to the Model struct with per-model defaults
  4. Once Catwalk ships the field, wire pruneExcessImages to read from it instead

Does this approach work for you? I can update this PR now to remove the Crush-side MaxImages / defaultMaxImages and hard-code a temporary Gemini-only limit (since that's the only provider with this bug today), keeping it minimal until Catwalk provides the proper field.

…ve MaxImages config field

Remove MaxImages from SelectedModel and the defaultMaxImages provider
map. Replace with a simpler maxImagesForModel switch that hard-codes
known limits (Gemini/VertexAI: 10) with a TODO referencing the Catwalk
issue for per-model MaxAttachments.

Also enhance the prune placeholder to include the original filename so
the model retains awareness of what was removed
@andreynering
Copy link
Copy Markdown
Member

andreynering commented Apr 24, 2026

@iceymoss Yep, that sounds like a good plan. Thank you!

@iceymoss
Copy link
Copy Markdown
Contributor Author

@andreynering The changes we discussed are now pushed — I've removed the MaxImages field from SelectedModel and the defaultMaxImages map, replacing them with a minimal maxImagesForModel() switch that hard-codes the Gemini/VertexAI limit with a TODO referencing the Catwalk issue. I've also filed charmbracelet/catwalk#258 to track the upstream MaxAttachments field.

@iceymoss
Copy link
Copy Markdown
Contributor Author

iceymoss commented May 9, 2026

Hi @andreynering, gentle ping on this PR! 🏓
Just following up since I've pushed the exact refactoring we agreed upon (removed the config fields and added the temporary hardcoded limit for Gemini).
Also, just wanted to let you know that I've already opened the upstream issue (charmbracelet/catwalk#258) and submitted the corresponding PR for the permanent fix (charmbracelet/catwalk#259).
All CI checks are green here. Let me know if this temporary fix looks good to merge while we review the Catwalk PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Chat becomes unusable after exceeding image limit

3 participants