How much memory is required for the training phase of this model? What hardware did you use for the experiment, and how long did training take?