[Executorch][llm] Enable local global attention in export_llama script #10836
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10836. Note: links to docs will display an error until the docs builds have completed.
❗ 1 currently active SEV; if your PR is affected, please review it.
❌ 13 new failures as of commit 7ee018b with merge base ef30b25.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from ed32169 to 084bfe2.
Pull Request resolved: #10612. Added a new option, --local_global_attention, that takes a pattern of sizes to determine which layers use local sliding-window attention. For example, [0, 256, 256, 0, 256, 256] can be used for a 6-layer transformer; alternatively, you can pass [0, 256, 256] as a pattern to be repeated. ghstack-source-id: 283404674 @exported-using-ghexport Differential Revision: [D73891423](https://our.internmc.facebook.com/intern/diff/D73891423/)
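As a rough illustration of how such a pattern might be expanded over a model's layers (the helper name and exact semantics below are assumptions for this sketch, not the PR's actual code), where 0 denotes full global attention and a positive value denotes the sliding-window size:

```python
# Illustrative sketch only: expand a local/global attention pattern across
# n_layers. A value of 0 means full (global) attention for that layer; a
# positive value is the local sliding-window size in tokens.
def expand_attention_pattern(pattern: list[int], n_layers: int) -> list[int]:
    """Repeat `pattern` until it covers all n_layers layers."""
    if not pattern or n_layers % len(pattern) != 0:
        raise ValueError("pattern length must evenly divide the number of layers")
    return pattern * (n_layers // len(pattern))

# For a 6-layer model, [0, 256, 256] expands to [0, 256, 256, 0, 256, 256]:
# layers 0 and 3 use global attention, the rest a 256-token sliding window.
print(expand_attention_pattern([0, 256, 256], 6))
```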
Force-pushed from 7fddb24 to 7ee018b.
This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #10612 by @kimishpatel
Please use the ghstack PR above as the source of truth for the PR details, comments, and reviews.
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/kimishpatel/189/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/kimishpatel/189/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/kimishpatel/188/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/kimishpatel/189/orig
@diff-train-skip-merge