A speculative attempt at reverse-engineering OpenAI's "Strawberry" (now o1) before its public release. Built on September 11, 2024 — one day before o1-preview was announced.
The idea: if you make an LLM spend more compute at inference time by critiquing and refining its own outputs, you get better answers. This implements a generate → critique → grade → refine loop using Llama 3.1 70B on AWS Bedrock.
- Generate 3 independent candidate answers to a prompt
- Critique each answer using the same LLM
- Grade each on a -100 to 100 scale based on the critique
- Select the best, refine it
- Repeat for N iterations
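The steps above can be sketched as a single Python function. This is an illustrative sketch, not the script's actual interface: the function names, prompt wording, and the `llm` callable are stand-ins (the real script calls Llama 3.1 70B through Bedrock, and its prompts and score parsing may differ).

```python
import re
from typing import Callable


def best_of_n_refine(prompt: str, llm: Callable[[str], str],
                     n_candidates: int = 3, n_iters: int = 2) -> str:
    """Generate -> critique -> grade -> refine loop.

    `llm` is any text-in/text-out model call (e.g. a Bedrock invocation).
    """
    # Generate independent candidate answers.
    candidates = [llm(prompt) for _ in range(n_candidates)]
    best = candidates[0]
    for _ in range(n_iters):
        scored = []
        for cand in candidates:
            # Critique each answer using the same LLM.
            critique = llm(f"Critique this answer:\n{cand}")
            # Grade on a -100..100 scale based on the critique.
            reply = llm(f"Given this critique, grade the answer "
                        f"from -100 to 100:\n{critique}")
            m = re.search(r"-?\d+", reply)
            score = max(-100, min(100, int(m.group()))) if m else -100
            scored.append((score, cand))
        # Select the best candidate, then refine it for the next round.
        best = max(scored, key=lambda s: s[0])[1]
        candidates = [llm(f"Improve this answer:\n{best}")]
    return best
```

Any callable that takes a prompt string and returns a completion string can be plugged in as `llm`, so the loop itself stays independent of the Bedrock client code.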
Requires boto3 and AWS CLI credentials configured.
python3 redfruit.py

Convert the output to readable HTML:

python3 htmlize.py