Update README.md
frankaging authored Apr 12, 2024
1 parent 9b95453 commit 528d504
Showing 1 changed file with 4 additions and 3 deletions.
7 changes: 4 additions & 3 deletions examples/loreft/README.md
@@ -6,13 +6,13 @@ This directory contains all the files needed to reproduce our paper results. We

## Datasets

-To load the datasets run:
+To load all of the datasets we used, run:

```bash
bash load_datasets.sh
```

-We copy everything from [LLM-Adapters](https://github.com/AGI-Edgerunners/LLM-Adapters/tree/main) for the dataset setup. Specifically, we get:
+We copy everything from [LLM-Adapters](https://github.com/AGI-Edgerunners/LLM-Adapters/tree/main) for the commonsense and math reasoning dataset setup. We use a parsed version of the [Ultrafeedback dataset](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned) for instruction tuning. Specifically, we get:

- Training data for commonsense and math reasoning:
- [`commonsense_170k.json`](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/ft-training_set/commonsense_170k.json)
@@ -21,7 +21,8 @@ We copy everything from [LLM-Adapters](https://github.com/AGI-Edgerunners/LLM-Ad
- Evaluation data for commonsense and math reasoning are included in:
- [`LLM-Adapters/dataset`](https://github.com/AGI-Edgerunners/LLM-Adapters/tree/main/dataset)

-- For instrution following training and evaluation, everything is done through HuggingFace hub. Note that we did not create our own dataset, instead we took previous ones to ensure a fair comparison.
+- For instruction following training:
+  - [`train.json`](https://github.com/frankaging/ultrafeedback-dataset/blob/main/train.json)
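The training files listed above are plain JSON, so a quick sanity check after running `load_datasets.sh` can be sketched as follows. This is a minimal sketch: the field names (`"instruction"`, `"output"`) are assumptions based on common instruction-tuning formats and are not confirmed by this README, and the stand-in sample file only mimics the shape of the real downloads.

```python
import json

def load_examples(path):
    """Load a JSON training file and return its list of records."""
    with open(path) as f:
        return json.load(f)

# Stand-in file mimicking the assumed record shape; the real files
# (e.g. commonsense_170k.json, train.json) come from load_datasets.sh.
sample = [{"instruction": "What is 2+2?", "output": "4"}]
with open("sample.json", "w") as f:
    json.dump(sample, f)

examples = load_examples("sample.json")
print(len(examples), examples[0]["instruction"])
```

Swapping `"sample.json"` for one of the downloaded paths should confirm the record count before launching training.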

## Commonsense reasoning tasks

