Commit

Update eval_details.md
Fixed a typo
Vasanthrs-dev authored Jul 26, 2024
1 parent 18f515a commit c7c6c2e
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion eval_details.md
@@ -3,7 +3,7 @@ This document contains additional context on the settings and parameters for how
### Auto-eval benchmark notes
#### MMLU
- We are reporting macro averages for MMLU benchmarks. The micro average numbers for MMLU are: 65.4 and 67.4 for the 8B pre-trained and instruct-aligned models, 78.9 and 82.0 for the 70B pre-trained and instruct-aligned models
-- The pre-trained models are evaluated in the standard way by calualting the likelihood of each choice character. For the instruct-aligned models, we use a dialogue prompt (*user/assistant*) for the shots and ask the model to generate the best choice character as answer.
+- The pre-trained models are evaluated in the standard way by calculating the likelihood of each choice character. For the instruct-aligned models, we use a dialogue prompt (*user/assistant*) for the shots and ask the model to generate the best choice character as answer.
#### AGI English
- We use the default few-shot and prompt settings as specified [here](https://github.com/ruixiangcui/AGIEval). The score is averaged over the english subtasks.
#### CommonSenseQA
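For readers unfamiliar with the macro/micro distinction mentioned in the MMLU notes of the changed file, here is a minimal sketch. It assumes "macro" means averaging per-subject accuracies and "micro" means pooling all questions; the function name `macro_and_micro_accuracy` and the toy data are hypothetical and are not taken from the repository's evaluation code.

```python
from collections import defaultdict


def macro_and_micro_accuracy(results):
    """Compute (macro, micro) accuracy from (subject, is_correct) pairs.

    Assumption: macro = mean of per-subject accuracies,
                micro = accuracy over all questions pooled together.
    """
    per_subject = defaultdict(lambda: [0, 0])  # subject -> [correct, total]
    for subject, is_correct in results:
        per_subject[subject][0] += int(is_correct)
        per_subject[subject][1] += 1

    # Macro: every subject counts equally, regardless of its size.
    macro = sum(c / t for c, t in per_subject.values()) / len(per_subject)

    # Micro: every question counts equally, so large subjects dominate.
    total_correct = sum(c for c, _ in per_subject.values())
    total_questions = sum(t for _, t in per_subject.values())
    micro = total_correct / total_questions
    return macro, micro


# Toy data (invented for illustration): a small subject and a larger one.
results = (
    [("algebra", True)] * 1 + [("algebra", False)] * 1
    + [("law", True)] * 6 + [("law", False)] * 2
)
macro, micro = macro_and_micro_accuracy(results)
print(f"macro={macro:.3f} micro={micro:.3f}")
```

The two numbers differ whenever subjects have different sizes and different accuracies, which is why the notes report both.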
