Description
The torch.nn.modules.transformer documentation says that the word_language_model example in this repo demonstrates its use. However, the example appears to implement its own transformer rather than using that module. Is this intentional? I would offer to help rewrite the example with torch.nn.modules.transformer, but I'm here to learn how to use it.