### Student Name

Giang Nguyen

### Model Length

256

### Accuracy

55.76%

### Improvement Description

Multi-round training, with checkpoint selection and optimizer reset; warm-up with Dr.GRPO.

### Detailed Write-up

_No response_

### GPU Hours

_No response_

### Submission Agreement

- [x] I confirm that these results are from my own work
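
The improvement description above is terse, so the following is only a minimal sketch of what "multi-round training with checkpoint selection and optimizer reset" might look like. It is not the submitter's code. All names (`multi_round_training`, `train_round`, `evaluate`, `make_optimizer`) are hypothetical, and resuming each round from the best checkpoint so far is an assumption. The helper `dr_grpo_advantages` illustrates one commonly described feature of Dr. GRPO: group-relative advantages computed without dividing by the group standard deviation.

```python
import copy


def dr_grpo_advantages(rewards):
    """Group-relative advantages: reward minus group mean, with no std division."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]


def multi_round_training(init_params, train_round, evaluate, make_optimizer, n_rounds):
    """Run several training rounds, keeping the best-scoring checkpoint.

    Assumptions (hypothetical interface, not from the submission):
      - train_round(params, optimizer) returns updated params
      - evaluate(params) returns a score where higher is better
      - make_optimizer() returns a fresh optimizer with empty state
    """
    best_params = copy.deepcopy(init_params)
    best_score = evaluate(best_params)
    for _ in range(n_rounds):
        optimizer = make_optimizer()  # optimizer reset: no state carried across rounds
        candidate = train_round(copy.deepcopy(best_params), optimizer)
        score = evaluate(candidate)
        if score > best_score:  # checkpoint selection: keep only improvements
            best_params, best_score = candidate, score
    return best_params, best_score
```

In this sketch a round that fails to improve the evaluation score is discarded, and the next round restarts from the last selected checkpoint with a new optimizer.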