4 changes: 2 additions & 2 deletions examples/qwen3/README.md
@@ -111,7 +111,7 @@ OUTPUT_BASEPATH=${27} # Path for training output logs
```

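#### Pretraining example
Use the following command to start continued pretraining of qwen2
Use the following command to start continued pretraining of qwen3
Note: when `AC=offload` or `full`, you can set the `MP_AC_LAYERS` environment variable to control the number of TransformerLayer layers to which checkpointing or offload is applied (default: `1`).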
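
As a minimal sketch of the note above, the variable could be exported in the shell before the launch command runs. The value `2` is purely illustrative, and holding `AC` in a plain shell variable is an assumption made for this example; the README's launch script may take it a different way.

```bash
# Illustrative values only; neither is taken from the README.
AC=full                   # one of the two modes named in the note ("offload" or "full")
export MP_AC_LAYERS=2     # apply checkpointing/offload to 2 TransformerLayers instead of the default 1

# Print the effective settings, falling back to the documented default (1) if unset.
echo "AC=${AC}, MP_AC_LAYERS=${MP_AC_LAYERS:-1}"
```

Leaving `MP_AC_LAYERS` unset keeps the documented default of `1`.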

```bash
@@ -254,4 +254,4 @@ accelerate launch --main_process_port 29051 -m lm_eval \
--model_args pretrained=/mnt/qwen-ckpts/Qwen3-30B-A3B-mcore-te-to-hf,trust_remote_code=True \
--tasks cmmlu,ceval-valid \
--batch_size 16
```