In practice, I would say the default of k=10 in k-fold CV is typically a good choice. However, if we are working with a small training set, I would increase the number of folds so that more training data is used in each iteration; this reduces the pessimistic bias of the generalization-error estimate, at the cost of a longer run time. Conversely, if we are training deep neural nets on large datasets and want to tune hyperparameters, I would think carefully about the size of *k*: when the dataset is large, it is typically fine to choose a smaller *k*, since we will still get good average performance estimates. And for our final estimate, we still have our independent test set anyway.
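To make the trade-off concrete, here is a minimal pure-Python sketch of k-fold CV (the helper names `kfold_indices` and `cross_val_score` and the toy mean-predictor "model" are my own illustrations, not from any particular library). With larger *k*, each training split covers more of the data, but we fit the model *k* times:

```python
from statistics import mean

def kfold_indices(n, k):
    """Split range(n) into k contiguous folds of (nearly) equal size."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_score(X, y, fit, predict, k=10):
    """Return k per-fold error estimates (here: mean absolute error)."""
    folds = kfold_indices(len(X), k)
    scores = []
    for test_idx in folds:
        test_set = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in test_set]
        # Fit on k-1 folds; larger k means more training data per fit
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = [predict(model, X[i]) for i in test_idx]
        scores.append(mean(abs(p - y[i]) for p, i in zip(preds, test_idx)))
    return scores

# Toy "model": always predicts the mean of the training targets
fit = lambda X, y: mean(y)
predict = lambda model, x: model

X = list(range(20))
y = [2 * x for x in X]
scores = cross_val_score(X, y, fit, predict, k=10)  # 10 per-fold estimates
print(mean(scores))
```

Averaging the per-fold scores gives the CV estimate; the run-time cost of a larger *k* is simply that the `fit` call runs once per fold. (In practice one would of course shuffle and/or stratify the indices rather than use contiguous folds.)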