Fix previous mm data copy #67

Open
cdreetz wants to merge 1 commit into main from fix/mm_copy

cdreetz commented May 27, 2026

Summary:

dict(previous_multi_modal_data.mm_items) copies only the outer dictionary. The per-modality values are still the same list objects, so calling .extend(...) on merged_items["image"] also mutates previous_multi_modal_data.mm_items["image"].
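
The aliasing can be reproduced with plain dicts and lists. This is a minimal sketch, not the renderer code: `previous_items` is a hypothetical stand-in for `previous_multi_modal_data.mm_items`, and the string values are placeholders for real items.

```python
# Hypothetical stand-in for previous_multi_modal_data.mm_items.
previous_items = {"image": ["img_0"]}

# dict(...) builds a new outer dict, but its values are the same list objects.
merged_items = dict(previous_items)
assert merged_items is not previous_items
assert merged_items["image"] is previous_items["image"]

# Extending the "merged" list therefore also changes the previous turn's data.
merged_items["image"].extend(["img_1"])
print(previous_items["image"])  # ['img_0', 'img_1']
```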


Note

Low Risk
Localized fix to multimodal merge logic in two renderers plus a regression test; no auth, API, or inference-path changes.

Overview
Fixes a shallow-copy bug in bridge_to_next_turn for Qwen3.5 and Qwen3-VL when merging prior-turn MultiModalData with the new turn.

Merging used dict(...) on mm_hashes, mm_placeholders, and mm_items, so per-modality lists stayed aliased to previous_multi_modal_data. Appending with .extend() mutated the caller's prior snapshot, which is a problem when trainers keep per-step RenderedTokens for loss reconstruction.

The bridge now builds fresh outer dicts and copies each modality’s list (list(vals) per key) before extending with new-turn entries.
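
A rough sketch of that copy-then-extend pattern follows. The helper name `merge_per_modality` and the sample values are assumptions for illustration; the actual change lives in the renderers' bridge_to_next_turn and may be structured differently.

```python
def merge_per_modality(previous, new):
    """Hypothetical helper: merge two {modality: list} mappings without aliasing."""
    # Fresh outer dict, and a fresh list per modality (list(vals) per key).
    merged = {key: list(vals) for key, vals in previous.items()}
    # Extend the copies with the new turn's entries; new modalities get a new list.
    for key, vals in new.items():
        merged.setdefault(key, []).extend(vals)
    return merged


previous_items = {"image": ["img_0"]}
new_items = {"image": ["img_1"], "video": ["vid_0"]}
merged_items = merge_per_modality(previous_items, new_items)

print(merged_items)    # {'image': ['img_0', 'img_1'], 'video': ['vid_0']}
print(previous_items)  # {'image': ['img_0']}
```

A per-key `list(vals)` copy is enough here because only the lists themselves are extended; the items inside them are still shared between turns.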

Adds test_multimodal_bridge_does_not_mutate_previous_mm_data across the multimodal matrix: prior lists unchanged after bridge, and bridged inner lists are not the same objects as the prior turn’s.
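
The two assertions described above (no mutation, no aliasing) might look roughly like this. Everything here is a stand-in: `FakeMultiModalData`, `fake_bridge`, and the sample hashes and `(offset, length)` placeholders are hypothetical, and the real test is parametrized over the multimodal matrix rather than using a single hand-built case.

```python
import copy
from dataclasses import dataclass, field

FIELDS = ("mm_hashes", "mm_placeholders", "mm_items")


@dataclass
class FakeMultiModalData:
    """Stand-in for MultiModalData with only the three fields named in this PR."""
    mm_hashes: dict = field(default_factory=dict)
    mm_placeholders: dict = field(default_factory=dict)
    mm_items: dict = field(default_factory=dict)


def fake_bridge(previous, new):
    """Hypothetical merge that copies each per-modality list before extending."""
    merged = FakeMultiModalData()
    for name in FIELDS:
        out = {k: list(v) for k, v in getattr(previous, name).items()}
        for k, v in getattr(new, name).items():
            out.setdefault(k, []).extend(v)
        setattr(merged, name, out)
    return merged


def check_bridge_does_not_mutate_previous():
    previous = FakeMultiModalData({"image": ["h0"]}, {"image": [(0, 4)]}, {"image": ["img_0"]})
    new = FakeMultiModalData({"image": ["h1"]}, {"image": [(9, 4)]}, {"image": ["img_1"]})
    snapshot = copy.deepcopy(previous)  # independent record of the prior state

    bridged = fake_bridge(previous, new)

    for name in FIELDS:
        prior, after = getattr(previous, name), getattr(bridged, name)
        # Prior lists are unchanged after the bridge.
        assert prior == getattr(snapshot, name)
        # Bridged inner lists are not the same objects as the prior turn's.
        for modality, vals in prior.items():
            assert after[modality] is not vals
    return bridged


bridged = check_bridge_does_not_mutate_previous()
```

Checking identity with `is not` as well as equality matters: an equality check alone would pass right after the bridge even if the lists were aliased and simply not yet extended again.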

Reviewed by Cursor Bugbot for commit 2a8fba0.

Note

Fix mutation of previous turn's multimodal data in bridge_to_next_turn

  • Replaces shallow dict(...) copies of mm_hashes, mm_placeholders, and mm_items with dict comprehensions that copy each per-modality list in qwen35.py and qwen3_vl.py.
  • Without this fix, merging new turn data would mutate the previous MultiModalData's lists due to shared list references.
  • Adds a parametrized regression test in tests/test_multimodal.py that asserts previous turn mm data is neither mutated nor aliased after bridging.

Macroscope summarized 2a8fba0.

macroscopeapp Bot commented May 27, 2026

Approvability

Verdict: Approved

Straightforward bug fix that replaces shallow dict copies with per-modality list copies to prevent unintended mutation of multimodal data during turn bridging. The change is small, isolated, and includes a comprehensive regression test.
