When I train with only 10,000 samples, the test loss of every network keeps rising instead of falling, which points to very serious overfitting. Training accuracy is 100%, but test accuracy is only 10%. How can I solve this problem?