Nice work!

Your implementation can reach a 22+ BLEU score, but mine only reaches 0.16+ BLEU on the test dataset. Comparing with your work, I found that changing the concatenation `torch.cat((Y_t, o_pre), dim=1)` to `torch.cat((o_pre, Y_t), dim=1)` drops the score to 0.16+ BLEU.

BTW, changing the concatenation order of `dec_hidden` and `a_t` in the `step` function also results in a bad BLEU score on the test dataset.

Would you mind sharing your thoughts on why `Y_t` and `o_pre` are concatenated in this particular order?

Thank you!
Sorry for the late reply. It's a very interesting question and exploration.

My intuition is that the concatenation itself works the same either way; the only difference is the order of `Y_t` and `o_pre`. In other words, swapping the order just swaps the positions of the corresponding weights (they still live in the same weight matrix), so in principle the model should be able to learn either ordering. I didn't think about it much when I wrote the network because I was just following the handout, so your case confused me and I am trying to work it out now.
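To make that intuition concrete, here is a minimal sketch (the dimensions `e` and `h` and the batch size are made up for illustration) showing that swapping the concatenation order is mathematically equivalent to permuting the columns of the learned weight matrix, so neither ordering is inherently better:

```python
import torch

torch.manual_seed(0)

e, h = 4, 6                      # hypothetical embedding / hidden sizes
Y_t = torch.randn(2, e)          # current target-word embeddings (batch of 2)
o_prev = torch.randn(2, h)       # previous combined-output vectors

# A linear projection acting on the concatenated input, as in the decoder step.
proj = torch.nn.Linear(e + h, h, bias=False)

out1 = proj(torch.cat((Y_t, o_prev), dim=1))

# Swapping the concatenation order corresponds to permuting the columns
# of the weight matrix; the result is identical.
W = proj.weight                                   # shape (h, e + h)
W_swapped = torch.cat((W[:, e:], W[:, :e]), dim=1)
out2 = torch.cat((o_prev, Y_t), dim=1) @ W_swapped.t()

print(torch.allclose(out1, out2, atol=1e-6))      # the two orderings agree
```

So if one ordering trains fine and the other collapses to 0.16 BLEU, the likely culprit is an inconsistency elsewhere (e.g. the ordering differs between training and decoding, or another tensor is sliced assuming the original layout), rather than the ordering itself.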