Fix loading on unsupported platforms #459
Metal Benchmarks
| Benchmark suite | Current: a3ac9d1 | Previous: ddca5c4 | Ratio |
|---|---|---|---|
| private array/construct | 26878.5 ns | 26722.25 ns | 1.01 |
| private array/broadcast | 463166 ns | 467500 ns | 0.99 |
| private array/random/randn/Float32 | 770375 ns | 751250 ns | 1.03 |
| private array/random/randn!/Float32 | 698833 ns | 636125 ns | 1.10 |
| private array/random/rand!/Int64 | 568458 ns | 578875 ns | 0.98 |
| private array/random/rand!/Float32 | 602000 ns | 589417 ns | 1.02 |
| private array/random/rand/Int64 | 771708 ns | 813250 ns | 0.95 |
| private array/random/rand/Float32 | 621375 ns | 542500 ns | 1.15 |
| private array/copyto!/gpu_to_gpu | 658895.5 ns | 515812.5 ns | 1.28 |
| private array/copyto!/cpu_to_gpu | 822875.5 ns | 727666 ns | 1.13 |
| private array/copyto!/gpu_to_cpu | 693916 ns | 554625 ns | 1.25 |
| private array/accumulate/1d | 1330854.5 ns | 1444334 ns | 0.92 |
| private array/accumulate/2d | 1408458 ns | 1469333 ns | 0.96 |
| private array/iteration/findall/int | 2077833 ns | 2243208.5 ns | 0.93 |
| private array/iteration/findall/bool | 1856416 ns | 2115125 ns | 0.88 |
| private array/iteration/findfirst/int | 1662875 ns | 1690562.5 ns | 0.98 |
| private array/iteration/findfirst/bool | 1658666 ns | 1674250 ns | 0.99 |
| private array/iteration/scalar | 3934708 ns | 2412750 ns | 1.63 |
| private array/iteration/logical | 3231791.5 ns | 3443541 ns | 0.94 |
| private array/iteration/findmin/1d | 1569958 ns | 1771104 ns | 0.89 |
| private array/iteration/findmin/2d | 1322791.5 ns | 1362584 ns | 0.97 |
| private array/reductions/reduce/1d | 1051104 ns | 793250 ns | 1.33 |
| private array/reductions/reduce/2d | 692917 ns | 713896 ns | 0.97 |
| private array/reductions/mapreduce/1d | 1064875 ns | 864750 ns | 1.23 |
| private array/reductions/mapreduce/2d | 692125 ns | 712500 ns | 0.97 |
| private array/permutedims/4d | 841916 ns | 912916 ns | 0.92 |
| private array/permutedims/2d | 933333 ns | 931604.5 ns | 1.00 |
| private array/permutedims/3d | 914749.5 ns | 1000937.5 ns | 0.91 |
| private array/copy | 512583 ns | 509083 ns | 1.01 |
| latency/precompile | 4389223125 ns | 4397344834 ns | 1.00 |
| latency/ttfp | 6674100542 ns | 6908319354 ns | 0.97 |
| latency/import | 720130042 ns | 719607250 ns | 1.00 |
| integration/metaldevrt | 739167 ns | 756958 ns | 0.98 |
| integration/byval/slices=1 | 1598250 ns | 1632792 ns | 0.98 |
| integration/byval/slices=3 | 8447604 ns | 8919542 ns | 0.95 |
| integration/byval/reference | 1628791 ns | 1659354.5 ns | 0.98 |
| integration/byval/slices=2 | 2604958.5 ns | 2684500 ns | 0.97 |
| kernel/indexing | 453145.5 ns | 478708 ns | 0.95 |
| kernel/indexing_checked | 460354.5 ns | 476750 ns | 0.97 |
| kernel/launch | 8500 ns | 12084 ns | 0.70 |
| metal/synchronization/stream | 14750 ns | 19916 ns | 0.74 |
| metal/synchronization/context | 14916 ns | 20209 ns | 0.74 |
| shared array/construct | 25145.833333333332 ns | 24899.25 ns | 1.01 |
| shared array/broadcast | 464229.5 ns | 470042 ns | 0.99 |
| shared array/random/randn/Float32 | 866125 ns | 754416 ns | 1.15 |
| shared array/random/randn!/Float32 | 668562.5 ns | 650708 ns | 1.03 |
| shared array/random/rand!/Int64 | 567458 ns | 586083.5 ns | 0.97 |
| shared array/random/rand!/Float32 | 594250 ns | 595750 ns | 1.00 |
| shared array/random/rand/Int64 | 798458 ns | 809708.5 ns | 0.99 |
| shared array/random/rand/Float32 | 616562.5 ns | 589562.5 ns | 1.05 |
| shared array/copyto!/gpu_to_gpu | 87625 ns | 93459 ns | 0.94 |
| shared array/copyto!/cpu_to_gpu | 87125 ns | 95291 ns | 0.91 |
| shared array/copyto!/gpu_to_cpu | 83625 ns | 84042 ns | 1.00 |
| shared array/accumulate/1d | 1351500 ns | 1440417 ns | 0.94 |
| shared array/accumulate/2d | 1403583 ns | 1482250 ns | 0.95 |
| shared array/iteration/findall/int | 1838084 ns | 1940666.5 ns | 0.95 |
| shared array/iteration/findall/bool | 1611895.5 ns | 1770833 ns | 0.91 |
| shared array/iteration/findfirst/int | 1377875 ns | 1395458 ns | 0.99 |
| shared array/iteration/findfirst/bool | 1362541 ns | 1371084 ns | 0.99 |
| shared array/iteration/scalar | 156000 ns | 195250 ns | 0.80 |
| shared array/iteration/logical | 3048604 ns | 3219395.5 ns | 0.95 |
| shared array/iteration/findmin/1d | 1258541 ns | 1497458 ns | 0.84 |
| shared array/iteration/findmin/2d | 1326292 ns | 1371541.5 ns | 0.97 |
| shared array/reductions/reduce/1d | 737291.5 ns | 638333.5 ns | 1.16 |
| shared array/reductions/reduce/2d | 689833 ns | 718750 ns | 0.96 |
| shared array/reductions/mapreduce/1d | 751312.5 ns | 708500 ns | 1.06 |
| shared array/reductions/mapreduce/2d | 700083 ns | 707208.5 ns | 0.99 |
| shared array/permutedims/4d | 848958 ns | 952146 ns | 0.89 |
| shared array/permutedims/2d | 854625 ns | 929292 ns | 0.92 |
| shared array/permutedims/3d | 912542 ns | 1039791 ns | 0.88 |
| shared array/copy | 249375 ns | 288396 ns | 0.86 |
This comment was automatically generated by workflow using github-action-benchmark.
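
The Ratio column above is consistent with Current divided by Previous, rounded to two decimals, so values above 1 mean the current commit was slower. A minimal Python sketch of that arithmetic, using three rows copied from the table; the `ALERT_THRESHOLD` value and the helper names are assumptions for illustration, not settings taken from this workflow:

```python
# Sketch: recompute the Ratio column and flag possible regressions.
# ALERT_THRESHOLD is a hypothetical cut-off, not the workflow's real setting.
ALERT_THRESHOLD = 1.25

# (current_ns, previous_ns) pairs copied from the benchmark table.
rows = {
    "private array/construct": (26878.5, 26722.25),
    "private array/iteration/scalar": (3934708, 2412750),
    "kernel/launch": (8500, 12084),
}


def ratio(current_ns: float, previous_ns: float) -> float:
    """Return current/previous rounded to two decimals (>1 means slower)."""
    return round(current_ns / previous_ns, 2)


def flag_regressions(table: dict, threshold: float = ALERT_THRESHOLD) -> list:
    """Return the names of benchmarks whose ratio exceeds the threshold."""
    return [name for name, (cur, prev) in table.items()
            if ratio(cur, prev) > threshold]


if __name__ == "__main__":
    for name, (cur, prev) in rows.items():
        print(f"{name}: {ratio(cur, prev):.2f}")
    print("flagged:", flag_regressions(rows))
```

Single-run timings on shared CI hardware are noisy, so a flagged row is a prompt to rerun the benchmark rather than proof of a regression.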
All the following runs should pass once the two issues #457 and #458 are resolved, merged, and released: I will surely test with your branch (once you have finished your discussions with @maleadt on the approach to take)
@omlins You can go ahead and try out this branch. Once you confirm that it works, I'll merge this and backport to 1.4.
I'm sorry, I just realized that you would need me to test it on Windows. :) Unfortunately, I don't have a Windows machine. Given that it works for you, I assume you can be confident enough to release it. Then I can confirm to you that it works in CI before you do the backport.
* Gate getpagesize call behind function
* Make `functional` require a fully-supported GPU
@omlins You can test directly on the |
@christiangnrd: the CI has now passed, as you have probably seen. So I can also confirm from my side that the fix worked.
Closes #457
Closes #458
Backport to 1.4?