some issues for MMLU benchmark when loading local dataset #2469

1436033631 · 2024-11-08T03:17:05Z

Hi,
I ran into some issues with the MMLU benchmark when loading a local dataset. Could you help take a look?
1. Environment: Python 3.12 + lm-evaluation-harness 0.4.5
2. Run command:

lm_eval --model hf --model_args pretrained=${local_model_path} --tasks mmlu_human_aging ........

3. YAML: I changed `dataset_path` and `dataset_kwargs` to load the local dataset:

dataset_path: {PRIVATE_LOCAL_PATH}/hails/mmlu_no_train/ # a copy of `cais/mmlu` with no auxiliary_train split
test_split: test
fewshot_split: dev
fewshot_config:
  sampler: first_n
output_type: multiple_choice
doc_to_text: "{{question.strip()}}\nA. {{choices[0]}}\nB. {{choices[1]}}\nC. {{choices[2]}}\nD. {{choices[3]}}\nAnswer:"
doc_to_choice: ["A", "B", "C", "D"]
doc_to_target: answer
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
dataset_kwargs:
  data_dir: all

4. Error: the dataset config name is not found:

ValueError: BuilderConfig 'human_aging' not found. Available: ['default']
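
The `Available: ['default']` part of the message suggests the local copy is being treated as a single generic config rather than one config per subject, so `human_aging` cannot be resolved as a config name. One possible workaround, sketched here under assumptions, is to skip config names and point the loader at explicit per-split files. The directory layout (`<root>/<subject>/<split>-*.parquet`) and the helper names `find_split_files` and `load_subject` are hypothetical and not part of lm-evaluation-harness; adjust them to how the local copy is actually stored.

```python
from pathlib import Path
from typing import Dict, List

# Splits the task YAML above refers to (test_split and fewshot_split).
REQUIRED_SPLITS = ("test", "dev")


def find_split_files(root: str, subject: str) -> Dict[str, List[str]]:
    """Map each required split to the Parquet files found for one subject.

    Assumes a hypothetical layout: <root>/<subject>/<split>-*.parquet
    """
    subject_dir = Path(root) / subject
    data_files: Dict[str, List[str]] = {}
    for split in REQUIRED_SPLITS:
        matches = sorted(str(p) for p in subject_dir.glob(f"{split}-*.parquet"))
        if not matches:
            raise FileNotFoundError(
                f"no '{split}' files for subject '{subject}' under {subject_dir}"
            )
        data_files[split] = matches
    return data_files


def load_subject(root: str, subject: str):
    """Load one subject with explicit split files instead of a config name."""
    import datasets  # imported lazily so find_split_files works without it

    return datasets.load_dataset(
        "parquet", data_files=find_split_files(root, subject)
    )
```

Naming the split files explicitly means the loader no longer has to infer either the config or the split names, and a missing split fails early with a readable message.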

I also tried modifying the code as shown below to bypass this check, but then ran into another error:

lm_eval/api/task.py
@@ -922,11 +922,14 @@ class ConfigurableTask(Task):
                     )

     def download(self, dataset_kwargs: Optional[Dict[str, Any]] = None) -> None:
         self.dataset = datasets.load_dataset(
             path=self.DATASET_PATH,
-            name=self.DATASET_NAME,
+            # name=self.DATASET_NAME,
             **dataset_kwargs if dataset_kwargs is not None else {},
         )

The second error:

KeyError: 'dev'
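
This `KeyError` is consistent with the YAML's `fewshot_split: dev` being looked up in a dataset that, once loaded without a config name, exposes no split called `dev`. A minimal sketch for checking this before running an evaluation follows; `missing_splits` is a hypothetical helper, and the plain dict stands in for whatever `datasets.load_dataset` returned.

```python
from typing import Iterable, List, Mapping


def missing_splits(dataset: Mapping[str, object], required: Iterable[str]) -> List[str]:
    """Return the required split names that the loaded dataset does not contain."""
    return [split for split in required if split not in dataset]


# Stand-in for a loaded dataset that only exposes a 'test' split (hypothetical).
loaded = {"test": ["..."]}
print(missing_splits(loaded, ["test", "dev"]))  # ['dev']
```

If the list is non-empty, the split names in the task YAML and the splits in the local copy do not line up, which is worth fixing before patching the harness itself.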


1436033631 · 2024-11-12T02:19:44Z

This issue was resolved by using the online datasets. Let's close this issue now, thanks.

1436033631 closed this as completed Nov 12, 2024
