This repo expects you to place benchmark files and raw memory artifacts under data/.
- Benchmark QA:
  - data/atm-bench/atm-bench.json
  - data/atm-bench/atm-bench-hard.json
  - data/atm-bench/niah/ (NIAH pool files)
- Raw personal memory (user-provided):
  - data/raw_memory/image/ (raw images)
  - data/raw_memory/video/ (raw videos)
  - data/raw_memory/email/emails.json (optional; see schema below)
- Generated artifacts:
  - output/image/qwen3vl2b/batch_results.json (generated)
  - output/video/qwen3vl2b/batch_results.json (generated)
  - data/processed_memory/ (optional; for any future normalized memory store)
  - output/ (all run outputs; always safe to delete/re-generate)
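The layout above can be created in one step. A minimal sketch, assuming the directory list reflects the layout described here (the `scaffold` helper and `EXPECTED_DIRS` name are illustrative, not part of the repo):

```python
from pathlib import Path

# Directories the scripts expect; data/ and output/ are gitignored.
EXPECTED_DIRS = [
    "data/atm-bench/niah",
    "data/raw_memory/image",
    "data/raw_memory/video",
    "data/raw_memory/email",
    "data/processed_memory",
    "output/image/qwen3vl2b",
    "output/video/qwen3vl2b",
]

def scaffold(root: str = ".") -> None:
    """Create the expected directory layout if it does not exist yet."""
    for rel in EXPECTED_DIRS:
        Path(root, rel).mkdir(parents=True, exist_ok=True)

if __name__ == "__main__":
    scaffold()
```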
Note: data/ and output/ are gitignored.
Current status:
- Paper/code release is available.
- ATM-Bench dataset release is available on Hugging Face:
https://huggingface.co/datasets/Jingbiao/ATM-Bench
Release channel:
- Hugging Face (dataset artifacts and versioned files).
Release metadata checklist:
- HF dataset link: https://huggingface.co/datasets/Jingbiao/ATM-Bench
- Versioning scheme (tag/date + git commit)
- sha256 checksums for released files
- Minimal download instructions (curl / huggingface_hub)
- License + citation block
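For the sha256-checksums item, a manifest in the format `sha256sum -c` accepts can be generated with the standard library. A sketch, not a repo script (`write_checksums` is a hypothetical helper name):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large artifacts never load fully into memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def write_checksums(release_dir: str, out_name: str = "sha256checksums.txt") -> Path:
    """Write '<digest>  <filename>' lines for every file in release_dir."""
    root = Path(release_dir)
    lines = [
        f"{sha256_of(p)}  {p.name}"
        for p in sorted(root.iterdir())
        if p.is_file() and p.name != out_name
    ]
    out = root / out_name
    out.write_text("\n".join(lines) + "\n")
    return out
```

Consumers can then verify downloads with `sha256sum -c sha256checksums.txt` in the release directory.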
The QA files are JSON arrays (or a dict with a qas list) of entries containing:
- id (string)
- question (string)
- answer (string)
- evidence_ids (list of strings; ground-truth evidence IDs)
For NIAH pool files, each entry additionally contains:
- niah_evidence_ids (list of strings; fixed evidence pool, superset of evidence_ids)
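A loader that accepts both shapes (plain array or dict with a qas list) can be sketched as follows; `load_qa` is an illustrative name, not a repo function:

```python
import json
from pathlib import Path

def load_qa(path: str) -> list[dict]:
    """Load a QA file that is either a JSON array or a dict with a 'qas' list."""
    data = json.loads(Path(path).read_text())
    entries = data["qas"] if isinstance(data, dict) else data
    for qa in entries:
        # Base schema check; NIAH pool files additionally carry niah_evidence_ids.
        assert {"id", "question", "answer", "evidence_ids"} <= qa.keys()
        if "niah_evidence_ids" in qa:
            # The pool is documented as a superset of the ground-truth IDs.
            assert set(qa["evidence_ids"]) <= set(qa["niah_evidence_ids"])
    return entries
```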
If a QA item includes email evidence IDs (IDs beginning with the email prefix), the Oracle/MMRAG
scripts may load email evidence from a JSON list with entries like:
[
  {
    "id": "email202401010001",
    "timestamp": "2024-01-01 12:34:56",
    "short_summary": "One-line summary",
    "detail": "Longer email content or extracted body"
  }
]

If your released benchmark does not include emails, you can omit this file.
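Assuming this schema, a small loader might index records by id and resolve a QA item's email evidence by prefix; `load_email_evidence` and `email_evidence_for` are hypothetical helper names, not repo functions:

```python
import json
from pathlib import Path

def load_email_evidence(path: str) -> dict[str, dict]:
    """Index email evidence records by their 'id' field for O(1) lookup."""
    records = json.loads(Path(path).read_text())
    return {rec["id"]: rec for rec in records}

def email_evidence_for(qa: dict, emails: dict[str, dict]) -> list[dict]:
    """Resolve the email-typed evidence IDs of one QA item to full records."""
    return [emails[eid] for eid in qa["evidence_ids"] if eid.startswith("email")]
```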
Text-only evidence for images/videos is read from batch_results.json files.
Scripts index entries by Path(image_path).stem / Path(video_path).stem.
Each entry typically contains:
- image_path / video_path (string path; used to derive the evidence ID stem)
- timestamp (string)
- location_name (string)
- short_caption (string)
- caption (string)
- ocr_text (string)
- tags (list of strings)
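Indexing by stem, as the scripts do, can be sketched like this (`index_batch_results` is an illustrative name; the repo's own indexing code may differ):

```python
import json
from pathlib import Path

def index_batch_results(path: str) -> dict[str, dict]:
    """Index batch_results.json entries by the media filename stem.

    The stem (filename without directory or extension) is what image/video
    evidence IDs refer to.
    """
    entries = json.loads(Path(path).read_text())
    index = {}
    for entry in entries:
        media_path = entry.get("image_path") or entry.get("video_path")
        index[Path(media_path).stem] = entry
    return index
```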
location_name is derived from GPS coordinates via reverse geocoding (default: OpenStreetMap Nominatim).
Public geocoding endpoints are rate-limited (often a strict per-IP requests-per-minute cap) and can
become a bottleneck for large archives, especially if you run the processors with high concurrency.
The processors cache reverse-geocoding results as JSON files under <output_dir>/cache/:
<media_filename_stem>_location_name.json
The cache key is the media filename stem, not a hash of the file contents. For example:
- 20220430_132212_location_name.json
- 20220502_172850_location_name.json
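The naming convention above maps directly from a media path to its cache file; a minimal sketch (`location_cache_path` is an illustrative helper, not repo code):

```python
from pathlib import Path

def location_cache_path(media_path: str, cache_dir: str) -> Path:
    """Cache file for a media item: <cache_dir>/<stem>_location_name.json.

    Keyed by the filename stem, so renaming a file invalidates its cache
    entry even if the bytes are unchanged.
    """
    return Path(cache_dir) / f"{Path(media_path).stem}_location_name.json"
```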
If you have a pre-extracted GPS cache bundle, place it under:
- data/raw_memory/geocoding_cache/image
- data/raw_memory/geocoding_cache/video
Then copy those cache files into your processor cache directory before running the processors so geocoding calls are skipped. The bundle must match the current image/video filenames:
python memqa/utils/copy_gps_info.py data/raw_memory/geocoding_cache/image output/image/qwen3vl2b/cache
python memqa/utils/copy_gps_info.py data/raw_memory/geocoding_cache/video output/video/qwen3vl2b/cache

You can also use the convenience wrappers:
- scripts/memory_processor/image/copy_gps_cache.sh
- scripts/memory_processor/video/copy_gps_cache.sh
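Because the bundle must match the current media filenames, it can help to check coverage before running the processors. A sketch under the cache-naming convention above; `missing_cache_stems` and its extension list are assumptions, not repo code:

```python
from pathlib import Path

def missing_cache_stems(media_dir: str, cache_dir: str,
                        exts=(".jpg", ".jpeg", ".png", ".mp4")) -> list[str]:
    """Return media filename stems that have no matching geocoding cache file."""
    cached = {p.name.removesuffix("_location_name.json")
              for p in Path(cache_dir).glob("*_location_name.json")}
    return sorted(
        p.stem for p in Path(media_dir).iterdir()
        if p.suffix.lower() in exts and p.stem not in cached
    )
```

An empty return value means every media file will hit the cache and no geocoding calls should be needed.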
If you have an MMRAG run that produced retrieval_recall_details.json, you can
build/validate NIAH pools via:
python scripts/QA_Agent/NIAH/build_niah_pools.py \
--qa-file data/atm-bench/atm-bench-hard.json \
--retrieval-details <PATH_TO>/retrieval_recall_details.json \
--pool-sizes 25 50 100 200
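After building pools, a sanity check can confirm each pool is well-formed. This sketch assumes a pool should contain exactly the requested number of IDs and, per the schema above, include the ground-truth evidence_ids; `validate_niah_pool` is a hypothetical helper, not the repo's validator:

```python
def validate_niah_pool(qa_entries: list[dict], pool_size: int) -> None:
    """Assert each NIAH pool has the expected size and contains its ground truth."""
    for qa in qa_entries:
        pool = qa["niah_evidence_ids"]
        assert len(pool) == pool_size, f"{qa['id']}: pool has {len(pool)} items"
        assert set(qa["evidence_ids"]) <= set(pool), f"{qa['id']}: ground truth missing"
```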