improve vit int8 mha opr #4096

Closed · wants to merge 18 commits

Conversation

@tpoisonooo (Contributor) commented Jul 28, 2022

This PR implements a simple x86 optimization of the MHA (multi-head attention) operator:

  • with the avx512f instruction set, ViT goes from 670 ms to 645 ms
  • with the avx2 instruction set, ViT int8 goes from 1662 ms to 1240 ms

The slowest part of the whole network is now the GEMM (see the illustrative sketch after this comment).

The codebase is PR 3940 rebased onto master. When will it get reviewed? The diff just keeps getting bigger...
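For readers unfamiliar with what such an optimization looks like, here is a minimal sketch of an AVX2 int8 dot-product accumulation of the kind an int8 MHA/GEMM kernel is typically built on. It is illustrative only, not code from this PR; the function name dot_u8s8_avx2 and the unsigned-activation/signed-weight layout are assumptions.

```cpp
#include <immintrin.h>
#include <stdint.h>

// Dot product of n int8 values (n a multiple of 32), with unsigned 8-bit
// activations in `a` and signed 8-bit weights in `b`, accumulated in int32.
// Note: _mm256_maddubs_epi16 saturates its int16 intermediate sums, so real
// kernels constrain operand ranges or split the accumulation accordingly.
static int32_t dot_u8s8_avx2(const uint8_t* a, const int8_t* b, int n)
{
    __m256i acc = _mm256_setzero_si256();
    const __m256i ones = _mm256_set1_epi16(1);
    for (int i = 0; i < n; i += 32)
    {
        __m256i va = _mm256_loadu_si256((const __m256i*)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i*)(b + i));
        __m256i p16 = _mm256_maddubs_epi16(va, vb); // u8*s8 -> paired s16 sums
        __m256i p32 = _mm256_madd_epi16(p16, ones); // s16 pairs -> s32
        acc = _mm256_add_epi32(acc, p32);
    }
    // horizontal reduction of the eight 32-bit lanes
    __m128i lo = _mm256_castsi256_si128(acc);
    __m128i hi = _mm256_extracti128_si256(acc, 1);
    __m128i s = _mm_add_epi32(lo, hi);
    s = _mm_hadd_epi32(s, s);
    s = _mm_hadd_epi32(s, s);
    return _mm_cvtsi128_si32(s);
}
```

Compile with -mavx2 (or /arch:AVX2 with MSVC).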

@tpoisonooo changed the title from "WIP: improve vit" to "WIP: improve vit int8" on Jul 28, 2022

lgtm-com bot commented Jul 28, 2022

This pull request introduces 4 alerts when merging e0a0ca6 into b4ba207 - view on LGTM.com

new alerts:

  • 4 for Short global name


codecov-commenter commented Aug 1, 2022

Codecov Report

Merging #4096 (9454c51) into master (b4ba207) will decrease coverage by 2.51%.
The diff coverage is 99.21%.

❗ Current head 9454c51 differs from pull request most recent head 42ad426. Consider uploading reports for the commit 42ad426 to get more accurate results

@@            Coverage Diff             @@
##           master    #4096      +/-   ##
==========================================
- Coverage   94.42%   91.91%   -2.52%     
==========================================
  Files         747      557     -190     
  Lines      178769   116257   -62512     
==========================================
- Hits       168811   106860   -61951     
+ Misses       9958     9397     -561     
Impacted Files                                        | Coverage Δ
src/layer/x86/softmax_x86.cpp                         | 98.13% <ø> (-0.08%) ⬇️
src/layer/x86/multiheadattention_x86.cpp              | 98.89% <98.89%> (ø)
src/layer/multiheadattention.cpp                      | 98.91% <99.40%> (+5.68%) ⬆️
src/layer/x86/x86_usability.h                         | 86.86% <100.00%> (-13.14%) ⬇️
src/layer/arm/convolution_winograd_transform.h        | 0.00% <0.00%> (-100.00%) ⬇️
...c/layer/arm/convolution_winograd_transform_bf16s.h | 0.00% <0.00%> (-98.31%) ⬇️
src/layer/arm/flatten_arm.cpp                         | 35.74% <0.00%> (-63.46%) ⬇️
src/layer/x86/convolution_3x3_pack1to4.h              | 49.13% <0.00%> (-50.87%) ⬇️
src/layer/arm/packing_arm.cpp                         | 64.53% <0.00%> (-31.76%) ⬇️
src/layer/x86/quantize_x86.cpp                        | 68.67% <0.00%> (-28.26%) ⬇️
... and 474 more



lgtm-com bot commented Aug 2, 2022

This pull request introduces 1 alert when merging 49cbb14 into 00c08d7 - view on LGTM.com

new alerts:

  • 1 for FIXME comment


lgtm-com bot commented Aug 3, 2022

This pull request introduces 1 alert when merging 9c1c2c9 into 00c08d7 - view on LGTM.com

new alerts:

  • 1 for FIXME comment


lgtm-com bot commented Aug 4, 2022

This pull request introduces 1 alert when merging 42ad426 into 00c08d7 - view on LGTM.com

new alerts:

  • 1 for FIXME comment

@tpoisonooo changed the title from "WIP: improve vit int8" to "improve vit int8 mha opr" on Aug 5, 2022
@tpoisonooo closed this on Aug 15, 2022