vulkan: Use spec constants for conv2d s/d/p and kernel W/H #16978
base: master
What's with that auroralabs-loci bot repeatedly mirroring all our PRs?

That whole account seems to be managed by bots, so I guess it's malfunctioning? Here are the performance numbers on my AMD GPUs:
Also add some additional unroll hints, which seems to help.
8267cc2 to ca455a3
Changed the outer loop to
I was curious how this affects compilation and run time, so I compared some vision models (RTX 4070, CM2):

Before:

After:

The compile time probably fluctuates quite a bit, and caching works well anyway. The mixed Transformer+Conv2D models didn't really improve (for various reasons, I suspect). ESRGAN is pure Conv2D, and it shows. My takeaway is that spec-const-all-the-things is probably good, or at least not bad :)
Also add some additional unrolling, which seems to help.
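
For anyone wondering why fixing s/d/p and kernel W/H at pipeline-creation time can help, here is a rough CPU-side analogy in C++. It is not the shader code from this PR: the function names `conv2d_runtime` and `conv2d_specialized` are made up for illustration, and template parameters stand in for Vulkan specialization constants. The point is only that once the kernel loop bounds are compile-time constants, the compiler is free to fully unroll the inner loops and fold the index arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Runtime variant: s/d/p and kernel W/H are ordinary variables, so the
// inner loop bounds are unknown when the code is compiled.
std::vector<float> conv2d_runtime(const std::vector<float>& src, int w, int h,
                                  const std::vector<float>& knl,
                                  int kw, int kh, int s, int d, int p) {
    assert(src.size() == static_cast<std::size_t>(w) * h);
    assert(knl.size() == static_cast<std::size_t>(kw) * kh);
    const int ow = (w + 2 * p - d * (kw - 1) - 1) / s + 1;
    const int oh = (h + 2 * p - d * (kh - 1) - 1) / s + 1;
    std::vector<float> dst(static_cast<std::size_t>(ow) * oh, 0.0f);
    for (int oy = 0; oy < oh; ++oy) {
        for (int ox = 0; ox < ow; ++ox) {
            float acc = 0.0f;
            for (int ky = 0; ky < kh; ++ky) {
                for (int kx = 0; kx < kw; ++kx) {
                    const int iy = oy * s - p + ky * d;
                    const int ix = ox * s - p + kx * d;
                    if (iy >= 0 && iy < h && ix >= 0 && ix < w) {
                        acc += src[iy * w + ix] * knl[ky * kw + kx];
                    }
                }
            }
            dst[oy * ow + ox] = acc;
        }
    }
    return dst;
}

// Specialized variant (hypothetical stand-in for spec constants): the same
// loops, but KW/KH/S/D/P are compile-time constants, so the two inner loops
// have fixed trip counts and can be fully unrolled by the compiler.
template <int KW, int KH, int S, int D, int P>
std::vector<float> conv2d_specialized(const std::vector<float>& src, int w, int h,
                                      const std::vector<float>& knl) {
    assert(src.size() == static_cast<std::size_t>(w) * h);
    assert(knl.size() == static_cast<std::size_t>(KW) * KH);
    const int ow = (w + 2 * P - D * (KW - 1) - 1) / S + 1;
    const int oh = (h + 2 * P - D * (KH - 1) - 1) / S + 1;
    std::vector<float> dst(static_cast<std::size_t>(ow) * oh, 0.0f);
    for (int oy = 0; oy < oh; ++oy) {
        for (int ox = 0; ox < ow; ++ox) {
            float acc = 0.0f;
            for (int ky = 0; ky < KH; ++ky) {
                for (int kx = 0; kx < KW; ++kx) {
                    const int iy = oy * S - P + ky * D;
                    const int ix = ox * S - P + kx * D;
                    if (iy >= 0 && iy < h && ix >= 0 && ix < w) {
                        acc += src[iy * w + ix] * knl[ky * KW + kx];
                    }
                }
            }
            dst[oy * ow + ox] = acc;
        }
    }
    return dst;
}
```

Both variants compute the same result; the trade-off is that each distinct parameter combination needs its own compiled instance, which is the compile-time cost discussed above.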