Research workspace exploring whether LMs can predict their own reasoning-token budgets (“thinking time”) on GSM8K math problems using a local Qwen2.5-0.5B-Instruct model.
- Prediction-aware prompting produced weak calibration: MAE ≈ 198 tokens and a correlation of ≈ 0.14 between predicted and actual reasoning tokens.
- 83% of cases exceeded the stated budget; some predictions were extreme (e.g., 1880 tokens).
- Task accuracy was 0 on a 20-example GSM8K subset, so efficiency gains were not observable with this small model.
- Visuals and metrics: see results/plots/ and results/metrics.json. A sketch of how these calibration metrics can be recomputed follows this list.
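
The headline numbers above are straightforward to recompute from a table of predicted and actual token counts. Below is a minimal sketch assuming NumPy arrays of per-example counts; the function name and the example numbers are illustrative, not the repo's code or reported results:

```python
# Illustrative sketch: MAE, Pearson correlation, and over-budget rate
# between predicted and actual reasoning-token counts.
import numpy as np

def calibration_metrics(predicted: np.ndarray, actual: np.ndarray) -> dict:
    """Compare per-example predicted token budgets with actual token usage."""
    mae = float(np.mean(np.abs(predicted - actual)))          # mean absolute error in tokens
    pearson_r = float(np.corrcoef(predicted, actual)[0, 1])   # linear correlation
    over_budget = float(np.mean(actual > predicted))          # fraction of runs exceeding the stated budget
    return {"mae_tokens": mae, "pearson_r": pearson_r, "over_budget_rate": over_budget}

# Made-up numbers for demonstration only:
pred = np.array([120, 200, 1880, 150])
act = np.array([310, 260, 240, 400])
print(calibration_metrics(pred, act))
```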
Setup and run:

```bash
uv venv
source .venv/bin/activate
uv sync
python -m research_workspace.experiment
```

Outputs: results/metrics.json, results/raw/token_prediction_runs.parquet, and plots in results/plots/.
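
For a quick look at these outputs, something like the following works. The parquet column names are not documented here, so treat the schema as an assumption and inspect the file before relying on it:

```python
# Hypothetical snippet for loading the experiment outputs.
import json
import pandas as pd

with open("results/metrics.json") as f:
    metrics = json.load(f)   # aggregate metrics written by the harness
print(metrics)

runs = pd.read_parquet("results/raw/token_prediction_runs.parquet")
print(runs.head())           # per-run records; check the columns before use
```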
Repository layout:

- planning.md – research plan.
- src/research_workspace/experiment.py – experiment harness (prompts, parsing, metrics, plots); a prompting sketch follows this list.
- datasets/ – GSM8K JSONL files (local).
- results/ – metrics, raw run data, plots.
- REPORT.md – full report with analysis and next steps.
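
For orientation, here is a minimal sketch of the prediction-aware prompting idea using Hugging Face transformers with the local Qwen2.5-0.5B-Instruct checkpoint. The prompt wording, the BUDGET/ANSWER parsing, and the token accounting are illustrative assumptions, not the exact logic in src/research_workspace/experiment.py:

```python
# Hedged sketch: ask the model to state its own token budget, then compare
# that prediction with the number of tokens it actually generates.
import re
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "Qwen/Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# A GSM8K-style question (placeholder, not necessarily from the local JSONL files).
question = "Natalia sold clips to 48 friends in April and half as many in May. How many clips did she sell in total?"
messages = [{
    "role": "user",
    "content": (
        "Before solving, state how many tokens of reasoning you expect to use as 'BUDGET: <n>'. "
        "Then solve the problem step by step and end with 'ANSWER: <number>'.\n\n" + question
    ),
}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=512, do_sample=False)

# Strip the prompt tokens, keep only the generated completion.
completion_ids = output[0][inputs["input_ids"].shape[1]:]
completion = tokenizer.decode(completion_ids, skip_special_tokens=True)

budget_match = re.search(r"BUDGET:\s*(\d+)", completion)
predicted_budget = int(budget_match.group(1)) if budget_match else None
actual_tokens = len(completion_ids)   # crude proxy for reasoning tokens actually used
print(predicted_budget, actual_tokens)
```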
See REPORT.md for methodology, full metrics tables, limitations, and suggested follow-ups with stronger models.