This repository has been archived by the owner on Dec 16, 2022. It is now read-only.
Train T5 on MATH #5054
Labels: Contributions welcome, medium (tasks of medium difficulty), Models (issues related to the allennlp-models repo)
We should try to train T5 on the MATH dataset (https://arxiv.org/abs/2103.03874). Expect poor performance: GPT-2 gets under 7% accuracy, and the dataset was specifically constructed to thwart transformer models. This model will serve as a baseline for later attempts.
More details on the MATH dataset are here: https://github.com/hendrycks/math/
A pre-training dataset is also available. We should run some experiments to see whether it boosts performance.
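For a baseline, the data preparation and scoring might look roughly like the sketch below. It assumes each MATH problem is a JSON object with `problem` and `solution` fields and that the final answer appears inside the last `\boxed{...}` of the solution. The `solve: ` prefix and the helper names (`extract_boxed`, `to_example`, `exact_match_accuracy`) are hypothetical, chosen for illustration, and the string-equality scoring is a simplification that does not normalize equivalent answers.

```python
from typing import Dict, List, Optional, Tuple


def extract_boxed(solution: str) -> Optional[str]:
    r"""Return the contents of the last \boxed{...} in a solution, or None."""
    marker = "\\boxed{"
    start = solution.rfind(marker)
    if start == -1:
        return None
    depth = 1  # we are already inside the opening brace of \boxed{
    chars: List[str] = []
    for ch in solution[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
    return None  # braces never balanced


def to_example(problem: Dict[str, str]) -> Tuple[str, str]:
    """Turn one problem into a (source, target) text pair for a seq2seq model.

    Assumption: the "solve: " prefix is made up for this sketch.
    """
    answer = extract_boxed(problem["solution"])
    if answer is None:
        raise ValueError("solution has no \\boxed{...} answer")
    return "solve: " + problem["problem"], answer


def exact_match_accuracy(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions equal to the reference after stripping whitespace."""
    if not references or len(predictions) != len(references):
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)
```

The `(source, target)` pairs could then be fed to whichever T5 training setup is used, with the model's decoded outputs passed to `exact_match_accuracy` against the extracted answers.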