fix: memory not released after indexing (20GB+ RSS for 5MB data)#831
Closed
fxfxfx123 wants to merge 3 commits into
Closed
fix: memory not released after indexing (20GB+ RSS for 5MB data)#831fxfxfx123 wants to merge 3 commits into
fxfxfx123 wants to merge 3 commits into
Conversation
- Enable mi_option_abandoned_thread_purge so mimalloc returns memory from exited worker thread arenas to the OS. Without this, 20 worker threads each retain 100-500MB after indexing completes. - Lower DEFAULT_RAM_FRACTION from 0.5 to 0.25 to reduce memory pressure threshold on systems with 32GB+ RAM.
Memory scales linearly with worker count (each gets its own mimalloc arena + 8MB stack). Diminishing returns past 8 workers. On a 20-core CPU this reduces peak memory by up to 60% with negligible speed loss.
Replace mi_option_abandoned_thread_purge (not available in vendored mimalloc version) with two equivalent options: - mi_option_arena_purge_mult=1 (default 10): purge arenas with no extra delay beyond purge_delay - mi_option_page_reclaim_on_free=1: reclaim pages from abandoned worker thread heaps on free This fixes the CI build error.
Author
|
Fixed CI build error: Replaced with two equivalent options that exist in the vendored version: mi_option_set(mi_option_arena_purge_mult, 1); // default 10 → purge arenas aggressively
mi_option_set(mi_option_page_reclaim_on_free, 1); // reclaim pages from abandoned worker threadsBoth options achieve the same goal: when worker threads exit after indexing, their mimalloc arenas are purged and memory is returned to the OS. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
After indexing a small project (65 files, 1.3MB, 2509 nodes), codebase-memory-mcp retains 20GB+ RSS on a 32GB Windows machine. Memory grows monotonically and is never released to the OS.
Root Cause (confirmed from source code)
1. mimalloc abandoned-thread arenas not purged
mem.c sets mi_option_purge_decommits and mi_option_purge_delay but does NOT set mi_option_abandoned_thread_purge. When 20 worker threads are created for parallel indexing, each gets its own mimalloc arena. After workers complete and are joined, their arenas become abandoned but are not purged. Memory stays in the process working set indefinitely.
20 workers x 100-500MB per arena = 2-10GB retained.
2. Default worker count too high
system_info.c returns info.total_cores (20 on i7-12700) for initial indexing. Each worker gets an 8MB stack + its own mimalloc arena. Memory scales linearly with worker count with diminishing returns past 8.
3. RAM fraction too high
DEFAULT_RAM_FRACTION is 0.5 (16GB budget on 32GB machine), so mimalloc feels no pressure to release until 16GB is consumed.
Fix
This PR makes three changes:
Benchmark
Tested on Windows 11, i7-12700, 32GB RAM, 65-file project.
Workaround (no rebuild needed)
Set CBM_WORKERS=1 in MCP env and run: codebase-memory-mcp config set auto_index false