Short study of dialogue-only vs belief-conditioned LLM agents on AIWolf 5-player logs to see if explicit opponent modeling helps identify the Werewolf. Using a small local model (Qwen2.5-0.5B-Instruct), dialogue-only outperformed the heuristic belief prior on an 8-game sample.
- Dialogue-only hit rate: 0.50 (4/8); belief-conditioned: 0.38 (3/8).
- Belief priors based purely on accusation/divination counts never corrected a dialogue-only miss.
- Heuristic priors likely overweight early mass accusations; richer belief calibration is needed.
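As a rough illustration of the count-based heuristic prior described above, the sketch below scores suspicion from accusation and divination counts. This is an assumption-laden toy, not the repo's actual implementation (see `src/research_workspace/experiment.py` for that); the function name, event tuple shapes, and weights are all hypothetical.

```python
from collections import Counter

def belief_prior(accusations, divinations):
    """Toy count-based suspicion prior (illustrative sketch, not the repo's code).

    accusations: list of (accuser, target) pairs parsed from dialogue
    divinations: list of (target, result) pairs, result in {"HUMAN", "WEREWOLF"}
    Returns a dict mapping player -> unnormalized suspicion score.
    """
    # Each accusation received bumps suspicion by 1 -- this is exactly the kind
    # of weighting that lets early mass accusations dominate the prior.
    scores = Counter(target for _, target in accusations)
    for target, result in divinations:
        # A claimed WEREWOLF divination raises suspicion; HUMAN lowers it.
        # The +/-3 weight is an arbitrary choice for illustration.
        scores[target] += 3 if result == "WEREWOLF" else -3
    return dict(scores)
```

Under a prior like this, a player who drew two early accusations outscores a quietly suspicious one, which matches the failure mode noted above.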
- Create environment (already configured here): `uv venv && source .venv/bin/activate`
- Install deps: `uv sync` (uses `pyproject.toml`)
- Run experiment: `python -m research_workspace.experiment` (downloads Qwen2.5-0.5B if no API keys; uses GPT-4.1 if `OPENAI_API_KEY`/`OPENROUTER_API_KEY` is set)
- Outputs:
  - Metrics: `results/metrics.json`
  - Raw generations: `results/llm_outputs.json`
  - Plot: `results/plots/accuracy_bar.png`
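After a run, the metrics file can be inspected with a couple of lines of Python. The field layout shown in the comment is an assumption based on the summary at the top of this README; check the actual file your run produces.

```python
import json
from pathlib import Path

def load_metrics(path="results/metrics.json"):
    """Load the metrics JSON written by the experiment.

    Assumed (unverified) shape: condition name -> stats dict, e.g.
    {"dialogue_only": {"hit_rate": 0.5}, "belief_conditioned": {"hit_rate": 0.38}}.
    """
    return json.loads(Path(path).read_text())

if __name__ == "__main__":
    for condition, stats in load_metrics().items():
        print(condition, stats)
```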
- `planning.md` – research plan
- `src/research_workspace/experiment.py` – data parsing, prompting, evaluation
- `datasets/` – AIWolf logs + README
- `results/` – experiment outputs and plots
- `REPORT.md` – full report with methodology and analysis
See REPORT.md for detailed methods, limitations, and next steps.