Even more CUSPARSE tests #2682
Force-pushed from cd56e43 to da59439.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #2682 +/- ##
==========================================
+ Coverage 82.08% 82.59% +0.50%
==========================================
Files 154 153 -1
Lines 13661 13606 -55
==========================================
+ Hits 11214 11238 +24
+ Misses 2447 2368 -79
CUDA.jl Benchmarks
| Benchmark suite | Current: 49edc86 | Previous: 2540087 | Ratio |
|---|---|---|---|
| latency/precompile | 46191914324 ns | 46159355585.5 ns | 1.00 |
| latency/ttfp | 7048535341 ns | 6957794099 ns | 1.01 |
| latency/import | 3680835625 ns | 3631949691 ns | 1.01 |
| integration/volumerhs | 9622492 ns | 9611328 ns | 1.00 |
| integration/byval/slices=1 | 147029 ns | 146783 ns | 1.00 |
| integration/byval/slices=3 | 425204 ns | 425400 ns | 1.00 |
| integration/byval/reference | 144986 ns | 145027 ns | 1.00 |
| integration/byval/slices=2 | 286029 ns | 286289 ns | 1.00 |
| integration/cudadevrt | 103432 ns | 103478 ns | 1.00 |
| kernel/indexing | 14059 ns | 14116.5 ns | 1.00 |
| kernel/indexing_checked | 14809 ns | 14717 ns | 1.01 |
| kernel/occupancy | 635.5411764705882 ns | 631.1345029239766 ns | 1.01 |
| kernel/launch | 2024.1 ns | 2048.6 ns | 0.99 |
| kernel/rand | 17910 ns | 15306 ns | 1.17 |
| array/reverse/1d | 19466 ns | 19766 ns | 0.98 |
| array/reverse/2d | 23038 ns | 23320 ns | 0.99 |
| array/reverse/1d_inplace | 10000 ns | 10115 ns | 0.99 |
| array/reverse/2d_inplace | 11642 ns | 11769 ns | 0.99 |
| array/copy | 20904 ns | 21157 ns | 0.99 |
| array/iteration/findall/int | 157174 ns | 159462 ns | 0.99 |
| array/iteration/findall/bool | 138272.5 ns | 139925 ns | 0.99 |
| array/iteration/findfirst/int | 152707.5 ns | 154378.5 ns | 0.99 |
| array/iteration/findfirst/bool | 154040 ns | 155307 ns | 0.99 |
| array/iteration/scalar | 72483 ns | 74069 ns | 0.98 |
| array/iteration/logical | 212005 ns | 217157 ns | 0.98 |
| array/iteration/findmin/1d | 40701 ns | 41948 ns | 0.97 |
| array/iteration/findmin/2d | 93323 ns | 94228.5 ns | 0.99 |
| array/reductions/reduce/1d | 35696 ns | 36315 ns | 0.98 |
| array/reductions/reduce/2d | 40565 ns | 40828 ns | 0.99 |
| array/reductions/mapreduce/1d | 33024 ns | 33656 ns | 0.98 |
| array/reductions/mapreduce/2d | 40709 ns | 51092 ns | 0.80 |
| array/broadcast | 20641 ns | 21055 ns | 0.98 |
| array/copyto!/gpu_to_gpu | 13692 ns | 11958 ns | 1.15 |
| array/copyto!/cpu_to_gpu | 208318 ns | 211255 ns | 0.99 |
| array/copyto!/gpu_to_cpu | 243872 ns | 244927 ns | 1.00 |
| array/accumulate/1d | 108093 ns | 109111 ns | 0.99 |
| array/accumulate/2d | 79684 ns | 80441 ns | 0.99 |
| array/construct | 1256.6 ns | 1271.2 ns | 0.99 |
| array/random/randn/Float32 | 42847.5 ns | 47305 ns | 0.91 |
| array/random/randn!/Float32 | 26167 ns | 26683 ns | 0.98 |
| array/random/rand!/Int64 | 27091 ns | 27072 ns | 1.00 |
| array/random/rand!/Float32 | 8865.666666666666 ns | 8803.333333333334 ns | 1.01 |
| array/random/rand/Int64 | 29636 ns | 29816 ns | 0.99 |
| array/random/rand/Float32 | 12962 ns | 13176 ns | 0.98 |
| array/permutedims/4d | 61115 ns | 61379.5 ns | 1.00 |
| array/permutedims/2d | 55102.5 ns | 55887 ns | 0.99 |
| array/permutedims/3d | 55745.5 ns | 56426.5 ns | 0.99 |
| array/sorting/1d | 2775958 ns | 2766596.5 ns | 1.00 |
| array/sorting/by | 3366877.5 ns | 3370290 ns | 1.00 |
| array/sorting/2d | 1084115 ns | 1085614.5 ns | 1.00 |
| cuda/synchronization/stream/auto | 1024 ns | 1027.9 ns | 1.00 |
| cuda/synchronization/stream/nonblocking | 6480.8 ns | 6419.6 ns | 1.01 |
| cuda/synchronization/stream/blocking | 802.2083333333334 ns | 800.59 ns | 1.00 |
| cuda/synchronization/context/auto | 1146.2 ns | 1152.4 ns | 0.99 |
| cuda/synchronization/context/nonblocking | 6601 ns | 6599.2 ns | 1.00 |
| cuda/synchronization/context/blocking | 893.3846153846154 ns | 899.7446808510638 ns | 0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
Your PR requires formatting changes to meet the project's style guidelines. Suggested changes:
diff --git a/lib/cusparse/array.jl b/lib/cusparse/array.jl
index 2f411eebd..3c598b66b 100644
--- a/lib/cusparse/array.jl
+++ b/lib/cusparse/array.jl
@@ -490,7 +490,8 @@ CuSparseVector{T}(Mat::SparseMatrixCSC) where {T} =
throw(ArgumentError("The input argument must have a single column"))
CuSparseMatrixCSC{T}(Vec::SparseVector) where {T} =
CuSparseMatrixCSC{T}(CuVector{Cint}([1]), CuVector{Cint}(Vec.nzind),
- CuVector{T}(Vec.nzval), (length(Vec), 1))
+ CuVector{T}(Vec.nzval), (length(Vec), 1)
+)
CuSparseMatrixCSC{T}(Mat::SparseMatrixCSC) where {T} =
CuSparseMatrixCSC{T}(CuVector{Cint}(Mat.colptr), CuVector{Cint}(Mat.rowval),
CuVector{T}(Mat.nzval), size(Mat))
diff --git a/test/libraries/cusparse.jl b/test/libraries/cusparse.jl
index 843d49980..c0c4e878c 100644
--- a/test/libraries/cusparse.jl
+++ b/test/libraries/cusparse.jl
@@ -22,7 +22,7 @@ blockdim = 5
@test ndims(d_x) == 1
dense_d_x = CuVector(x)
CUDA.@allowscalar begin
- @test sprint(show, d_x) == replace(sprint(show, x), "SparseVector{Float64, Int64}"=>"CUDA.CUSPARSE.CuSparseVector{Float64, Int32}", "sparsevec(["=>"sparsevec(Int32[")
+ @test sprint(show, d_x) == replace(sprint(show, x), "SparseVector{Float64, Int64}" => "CUDA.CUSPARSE.CuSparseVector{Float64, Int32}", "sparsevec([" => "sparsevec(Int32[")
@test Array(d_x[:]) == x[:]
@test d_x[firstindex(d_x)] == x[firstindex(x)]
@test d_x[div(end, 2)] == x[div(end, 2)]
@@ -39,7 +39,7 @@ blockdim = 5
@test nnz(d_x) == length(nonzeros(d_x))
d_y = copy(d_x)
CUDA.unsafe_free!(d_y)
- x = sprand(m,0.2)
+ x = sprand(m, 0.2)
d_x = CuSparseMatrixCSC{Float64}(x)
@test size(d_x) == (m, 1)
x = sprand(m,n,0.2)
CI failure looks related.
Force-pushed from 94a654e to 49edc86.
We should turn off coverage for the device-side file. Added some simple tests for `unsafe_free!`; happy to improve them if there's a better idea.