- Provide the training scheme actually used for Aquila2-70B-Expr, including the parallel strategies, optimizations, and hyper-parameter settings.
- Support heterogeneous training on chips of different generations with the same architecture or compatible architectures, including NVIDIA GPUs and Iluvatar CoreX chips.
- Support training on Chinese domestic hardware, including Iluvatar CoreX and Baidu KUNLUN chips.