feat: Implement episodic memory layer and experiment gating with LLM resolution by dishafaujdar · Pull Request #341 · karpathy/autoresearch

dishafaujdar · 2026-03-19T16:12:45Z

Key Changes

Persistent Storage (memory.py): Added SQLite-backed read/write tracking that links each run to its val_bpb and confidence bounds.
Entry Gating (should_run_experiment): The agent now gates experiment proposals before they run. It blocks proposals that fall in high-confidence REJECT regions while allowing refinement in ACCEPT regions or low-confidence regions.
Geometric Retrieval: Standardized hyperparameters via Z-score scaling (compute_stats/normalize) so that values on very different scales, such as batch_size=64 and lr=0.001, do not distort distances. Added cosine_similarity for nearest-neighbor threshold checks.
Write-Back Orchestration (record_verdict): Centralizes the logic that updates the confidence bounds of past experiments whenever a new evaluation completes.
LLM Conflict Resolution (resolve_with_llm): Built a gpt-4o wrapper that acts as a judge when highly similar hyperparameter sets yield contradicting verdicts, deciding which verdict becomes the new ground truth based on their val_bpb scores.
Test Coverage (test.py): Added a self-contained test against an in-memory (:memory:) SQLite database that validates the three-case gating logic, metric bounding, and the API fallback states.
Documentation (memory.md): Wrote an architecture overview in Markdown covering the workflow, geometry mapping, and quick-start setup.
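
For readers who want a concrete picture of the retrieval and gating steps listed above, here is a minimal sketch. The function names compute_stats, normalize, cosine_similarity, and should_run_experiment come from this description, but their signatures, the (config, verdict, confidence) history layout, and the 0.9 / 0.8 thresholds are assumptions made for illustration and may not match memory.py.

```python
import math

def compute_stats(configs):
    """Per-hyperparameter (mean, std) over a list of config dicts."""
    stats = {}
    for key in configs[0]:
        vals = [c[key] for c in configs]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        # Fall back to 1.0 when a hyperparameter never varies (std == 0).
        stats[key] = (mean, math.sqrt(var) or 1.0)
    return stats

def normalize(config, stats):
    """Z-score each hyperparameter so scale differences do not dominate."""
    return [(config[k] - mean) / std for k, (mean, std) in stats.items()]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (na * nb)

def should_run_experiment(proposal, history,
                          sim_threshold=0.9, conf_threshold=0.8):
    """history: list of (config, verdict, confidence). Thresholds are hypothetical."""
    if not history:
        return True
    stats = compute_stats([cfg for cfg, _, _ in history])
    p = normalize(proposal, stats)
    scored = [(cosine_similarity(p, normalize(cfg, stats)), verdict, conf)
              for cfg, verdict, conf in history]
    sim, verdict, conf = max(scored, key=lambda t: t[0])
    if sim < sim_threshold:
        return True   # no close neighbor: unexplored region
    if verdict == "REJECT" and conf >= conf_threshold:
        return False  # close to a high-confidence REJECT: block
    return True       # ACCEPT neighbor or low-confidence verdict: allow
```

Note that cosine similarity compares direction only, so two normalized configs that point the same way but differ in magnitude still count as near neighbors in this sketch.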

python test.py
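
As a rough illustration of what an in-memory SQLite check of this kind can look like, the sketch below uses only the standard library. The experiments table, its columns, and the helpers open_memory, record_run, update_confidence, and fetch_run are hypothetical; clamping confidence to [0, 1] is likewise an assumed reading of "metric bounding", not something taken from test.py.

```python
import json
import sqlite3

def open_memory(path=":memory:"):
    """Open a database (in-memory by default) with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS experiments (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               config TEXT NOT NULL,
               val_bpb REAL NOT NULL,
               verdict TEXT NOT NULL,
               confidence REAL NOT NULL
           )"""
    )
    return conn

def record_run(conn, config, val_bpb, verdict, confidence):
    """Insert one run and return its row id."""
    cur = conn.execute(
        "INSERT INTO experiments (config, val_bpb, verdict, confidence) "
        "VALUES (?, ?, ?, ?)",
        (json.dumps(config, sort_keys=True), val_bpb, verdict, confidence),
    )
    conn.commit()
    return cur.lastrowid

def update_confidence(conn, run_id, confidence):
    """Write back a new confidence, clamped to [0, 1] (assumed bound)."""
    clamped = max(0.0, min(1.0, confidence))
    conn.execute(
        "UPDATE experiments SET confidence = ? WHERE id = ?", (clamped, run_id)
    )
    conn.commit()

def fetch_run(conn, run_id):
    """Return the stored run as a dict, or None if the id is unknown."""
    row = conn.execute(
        "SELECT config, val_bpb, verdict, confidence "
        "FROM experiments WHERE id = ?",
        (run_id,),
    ).fetchone()
    if row is None:
        return None
    config, val_bpb, verdict, confidence = row
    return {"config": json.loads(config), "val_bpb": val_bpb,
            "verdict": verdict, "confidence": confidence}
```

Because the database lives in memory, each test run starts from an empty table and leaves nothing on disk.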

dishafaujdar added 5 commits March 19, 2026 00:57

memory_schema

8f67a18

implemented functions

9539b80

implemented test.py

5ac1609

added memory.md

88ee4be

updated test and readme

f9b254a