Commit 4373fa4

finish updating gpt2- --> mistral-

1 parent f8bb469, commit 4373fa4
18 files changed: +31 -110 lines

README.md (2 additions, 2 deletions)

@@ -40,7 +40,7 @@ Environments and non-Python dependencies can be managed with conda, and Python d
 
 #### Prerequisites
 
-First, make sure to update `conf/tutorial-gpt2-micro.yaml` with the directories you want to store the Hugging Face
+First, make sure to update `conf/mistral-micro.yaml` with the directories you want to store the Hugging Face
 cache and model runs.
 
 ```
@@ -59,7 +59,7 @@ For single-node single-gpu training, run:
 ```bash
 conda activate mistral
 cd mistral
-CUDA_VISIBLE_DEVICES=0 python train.py --config conf/tutorial-gpt2-micro.yaml --nnodes 1 --nproc_per_node 1 --training_arguments.fp16 true --training_arguments.per_device_train_batch_size 2 --run_id tutorial-gpt2-micro
+CUDA_VISIBLE_DEVICES=0 python train.py --config conf/mistral-micro.yaml --nnodes 1 --nproc_per_node 1 --training_arguments.fp16 true --training_arguments.per_device_train_batch_size 2 --run_id tutorial-gpt2-micro
 ```
 
 #### Multi-node multi-GPU training with DeepSpeed
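Scripts or notes still pointing at the old config path can be updated mechanically with the same substitution this commit makes. A minimal sketch (illustrative only — it rewrites a command string rather than editing files in place; the launch command is the one from the diff above):

```shell
# Rewrite a stale reference to the pre-rename config path.
old="conf/tutorial-gpt2-micro.yaml"
new="conf/mistral-micro.yaml"
echo "python train.py --config $old --run_id tutorial-gpt2-micro" \
  | sed "s|$old|$new|"
```

The `|` delimiter in the `sed` expression avoids having to escape the slashes in the paths.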

conf/archive/v1/gpt2-debug-config.yaml (1 addition, 1 deletion)

@@ -6,7 +6,7 @@
 inherit:
 - datasets/openwebtext.yaml
 - models/gpt2-small.yaml
-- trainers/gpt2-small-short.yaml
+- trainers/debug.yaml
 
 # Run ID -- defaults to `null`; override as you like!
 run_id: null
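These configs are stacked via an `inherit:` list of YAML fragments. A minimal sketch of how such list-based inheritance can work — this is not the repo's actual loader, and the sample dicts standing in for the parsed YAML files are hypothetical — is a recursive dict merge where later files win:

```python
# Sketch of list-based config inheritance (hypothetical, not the
# actual mistral loader). Each inherited file parses to a dict and is
# deep-merged in order; later values override earlier ones.
def deep_merge(base: dict, override: dict) -> dict:
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # recurse on nested maps
        else:
            merged[key] = value  # scalars/lists are replaced outright
    return merged

# Illustrative stand-ins for datasets/openwebtext.yaml,
# models/gpt2-small.yaml, and trainers/debug.yaml.
inherited = [
    {"dataset": {"id": "openwebtext"}},
    {"model": {"id": "gpt2-small"}},
    {"trainer": {"max_steps": 10}},
]

config = {"run_id": None}  # top-level file's own keys
for layer in inherited:
    config = deep_merge(config, layer)

print(config["model"]["id"])  # -> gpt2-small
```

Replacing `trainers/gpt2-small-short.yaml` with `trainers/debug.yaml`, as this hunk does, simply swaps which dict supplies the `trainer` keys.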

conf/mistral-medium.yaml (2 additions, 2 deletions)

@@ -1,5 +1,5 @@
-# gpt2-mistral-small-config.yaml
-# Full Mistral GPT-2 Small Training Config, currently working with the OpenWebText Dataset, GPT-2 Small Architecture,
+# mistral-medium-config.yaml
+# Full Mistral GPT-2 Medium Training Config, currently working with the OpenWebText Dataset, GPT-2 Small Architecture,
 # and full batch size (512). Runs with DeepSpeed ZeRO-2, with a per-device BSZ of 16.
 #
 # Inheritance and core paths can all be overridden from the command line or by re-writing these files.

conf/mistral-micro.yaml (1 addition, 1 deletion)

@@ -1,4 +1,4 @@
-# tutorial-gpt2-micro.yaml
+# mistral2-micro.yaml
 # Demo GPT-2 Micro Training Config, currently working with the WikiText103 Dataset, GPT-2 Micro Architecture,
 # and batch size of 2. Runs with DeepSpeed ZeRO-2, with a per-device BSZ of 2.
 #

conf/mistral-small.yaml (1 addition, 1 deletion)

@@ -1,4 +1,4 @@
-# gpt2-mistral-small-config.yaml
+# mistral-small-config.yaml
 # Full Mistral GPT-2 Small Training Config, currently working with the OpenWebText Dataset, GPT-2 Small Architecture,
 # and full batch size (512). Runs with DeepSpeed ZeRO-2, with a per-device BSZ of 16.
 #

conf/models/mistral-medium.yaml (1 addition, 1 deletion)

@@ -1,4 +1,4 @@
-# gpt2-medium-config.yaml
+# mistral-medium-config.yaml
 # Configuration for the GPT-2 Medium Model.
 ---
 model:

conf/models/mistral-micro.yaml (1 addition, 1 deletion)

@@ -1,4 +1,4 @@
-# gpt2-micro-config.yaml
+# mistral-micro-config.yaml
 # Configuration for the GPT-2 Micro Model.
 ---
 model:

conf/models/mistral-small.yaml (1 addition, 1 deletion)

@@ -1,4 +1,4 @@
-# gpt2-small-config.yaml
+# mistral-small.yaml
 # Configuration for the GPT-2 Small Model.
 ---
 model:

conf/trainers/gpt2-medium.yaml (1 addition, 1 deletion)

@@ -1,4 +1,4 @@
-# gpt2-small.yaml
+# gpt2-medium.yaml
 # Trainer config for Full GPT-2 Medium, with the full fixed batch size of 512 (with gradient accumulation).
 # This contract exactly follows that of HF.TrainingArguments so we can pass as a simple **kwargs -- make sure this
 # continues to stay valid!

conf/tutorial-shakespeare-gpt2-micro.yaml (1 addition, 1 deletion)

@@ -7,7 +7,7 @@
 # Inherit Dataset, Tokenization, Model, and Training Details
 inherit:
 - datasets/shakespeare.yaml
-- models/gpt2-micro.yaml
+- models/mistral-micro.yaml
 - trainers/gpt2-small-short.yaml
 
 # Run ID -- make sure to override!

0 commit comments