Description
Hello,
I was trying to use a pre-trained model for the embedding, but there are two bugs.
The first one is in sample.py, where the parameters are initialized. I think we shouldn't re-initialize the embedding:
```python
for param in seq2seq.parameters():
    param.data.uniform_(-0.08, 0.08)
```
I recommend changing it to:
```python
for param in [p for p in seq2seq.parameters() if p.requires_grad]:
    param.data.uniform_(-0.08, 0.08)
```
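For context, here is a minimal sketch (my own illustration, not code from the repo) of how a frozen pre-trained embedding ends up with `requires_grad == False`, which is what the filter above relies on; `pretrained_weights` is a placeholder for whatever vectors you actually load:

```python
import torch
import torch.nn as nn

# Placeholder for loaded vectors, e.g. GloVe (vocab_size x embedding_dim)
pretrained_weights = torch.randn(10000, 300)

# from_pretrained with freeze=True sets weight.requires_grad = False,
# so the filtered init loop above leaves these vectors untouched.
embedding = nn.Embedding.from_pretrained(pretrained_weights, freeze=True)
assert not embedding.weight.requires_grad
```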
The second bug is with the optimizer in supervised_trainer.py: it will throw an error when it tries to optimize the frozen embedding.
```python
optimizer = Optimizer(optim.Adam(model.parameters()), max_grad_norm=5)
```
I recommend changing it to:
```python
optimizer = Optimizer(optim.Adam(filter(lambda p: p.requires_grad, model.parameters())), max_grad_norm=5)
```
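To illustrate the failure mode, here is a toy reproduction (again my own sketch, not code from the repo) with a frozen embedding inside a small model; on the PyTorch version I'm using, passing the frozen parameter raises `ValueError: optimizing a parameter that doesn't require gradients`, and the filter avoids it:

```python
import torch.nn as nn
import torch.optim as optim

# Toy model with a frozen (pre-trained-style) embedding
model = nn.Sequential(nn.Embedding(100, 16), nn.Linear(16, 100))
model[0].weight.requires_grad = False

# optim.Adam(model.parameters()) errors here because the frozen
# embedding weight doesn't require gradients; filtering fixes it.
optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()))
```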
I have a question: do you think it is important to use the pre-trained embedding in the decoder as well, or is using it only in the encoder enough?