This repository contains the data and code for the following paper:
APA style reference.
Please cite the paper if you use the resources in the repository.
(The code is adapted from Vig, J. (2019). A multiscale visualization of attention in the transformer model. arXiv preprint arXiv:1906.05714.)
torch 1.13.1
transformers 4.27.2
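
A quick way to confirm that the local environment matches the versions listed above is sketched below. The helper names `installed_version` and `check_pins` are hypothetical and not part of this repository; the sketch assumes the packages were installed with pip or conda so that their metadata is visible to `importlib.metadata`.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins):
    """Map each package name to a (wanted, found, matches) tuple."""
    report = {}
    for name, wanted in pins.items():
        found = installed_version(name)
        # Ignore local build suffixes such as "+cu117" when comparing.
        base = found.split("+")[0] if found is not None else None
        report[name] = (wanted, found, base == wanted)
    return report


if __name__ == "__main__":
    # Assumption: extend this table with the torch version from the list above.
    pins = {"transformers": "4.27.2"}
    for name, (wanted, found, ok) in check_pins(pins).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: wanted {wanted}, found {found or 'not installed'} ({status})")
```

A mismatch does not necessarily mean the code will fail, but results are most likely to be reproducible with the listed versions.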