Riscv64 c906 d1 #3177

yaobyPerfxlab · 2021-08-16T08:28:03Z

No description provided.

RVV_SPEC_0_7 update

replace vfmacc_vf with vfmacc_vv

codecov-commenter · 2021-08-16T08:50:33Z

Codecov Report

Merging #3177 (a926e61) into master (169614f) will increase coverage by 0.43%.
The diff coverage is 100.00%.

❗ Current head a926e61 differs from pull request most recent head 7cdbce0. Consider uploading reports for the commit 7cdbce0 to get more accurate results.

@@            Coverage Diff             @@
##           master    #3177      +/-   ##
==========================================
+ Coverage   90.34%   90.77%   +0.43%     
==========================================
  Files         510      465      -45     
  Lines      136094   108462   -27632     
==========================================
- Hits       122950    98458   -24492     
+ Misses      13144    10004    -3140

Impacted Files	Coverage Δ
src/layer/riscv/convolution_sgemm_packn_fp16s.h	`98.65% <100.00%> (+0.04%)`	⬆️
src/layer/riscv/flatten_riscv.cpp	`89.91% <0.00%> (-4.83%)`	⬇️
src/layer/crop.cpp	`79.17% <0.00%> (-3.79%)`	⬇️
src/layer/x86/pooling_x86.cpp	`92.15% <0.00%> (-3.30%)`	⬇️
src/layer/x86/relu_x86.cpp	`74.19% <0.00%> (-3.17%)`	⬇️
src/allocator.cpp	`73.98% <0.00%> (-2.78%)`	⬇️
src/layer/x86/convolution_3x3.h	`23.94% <0.00%> (-2.77%)`	⬇️
src/layer/x86/reshape_x86.cpp	`92.43% <0.00%> (-2.66%)`	⬇️
src/layer/riscv/deconvolution_packnto1.h	`97.95% <0.00%> (-2.05%)`	⬇️
src/layer/x86/flatten_x86.cpp	`94.76% <0.00%> (-1.80%)`	⬇️
... and 239 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 224040e...7cdbce0.

replace vfmacc_vf with vfmacc_vv for better performance

replace vfmacc_vf with vfmacc_vv for better performance

nihui · 2021-08-17T05:32:36Z

src/layer/riscv/convolution_sgemm_packn_fp16s.h

+#if RVV_SPEC_0_7
+                vfloat16m1_t _v0 = vle16_v_f16m1(tmpptr, vl);
+                vfloat16m1_t _val0 = vrgathervx_float16xm1(_v0, 0, vl);
+                vfloat16m1_t _val1 = vrgathervx_float16xm1(_v0, 1, vl);
+                vfloat16m1_t _val2 = vrgathervx_float16xm1(_v0, 2, vl);
+                vfloat16m1_t _val3 = vrgathervx_float16xm1(_v0, 3, vl);
+                vfloat16m1_t _val4 = vrgathervx_float16xm1(_v0, 4, vl);
+                vfloat16m1_t _val5 = vrgathervx_float16xm1(_v0, 5, vl);
+                vfloat16m1_t _val6 = vrgathervx_float16xm1(_v0, 6, vl);
+                vfloat16m1_t _val7 = vrgathervx_float16xm1(_v0, 7, vl);
+                tmpptr += 8;
+
+                vfloat16m1_t _w0 = vle16_v_f16m1(kptr0, vl);
+                _sum0 = vfmacc_vv_f16m1(_sum0, _val0, _w0, vl);
+                _sum1 = vfmacc_vv_f16m1(_sum1, _val1, _w0, vl);
+                _sum2 = vfmacc_vv_f16m1(_sum2, _val2, _w0, vl);
+                _sum3 = vfmacc_vv_f16m1(_sum3, _val3, _w0, vl);
+                _sum4 = vfmacc_vv_f16m1(_sum4, _val4, _w0, vl);
+                _sum5 = vfmacc_vv_f16m1(_sum5, _val5, _w0, vl);
+                _sum6 = vfmacc_vv_f16m1(_sum6, _val6, _w0, vl);
+                _sum7 = vfmacc_vv_f16m1(_sum7, _val7, _w0, vl);
+
+                kptr0 += packn;
+#else
+                vfloat16m1_t _v0 = vle16_v_f16m1(tmpptr, vl);
+                vfloat16m1_t _val0 = vrgather_vx_f16m1(_v0, 0, vl);
+                vfloat16m1_t _val1 = vrgather_vx_f16m1(_v0, 1, vl);
+                vfloat16m1_t _val2 = vrgather_vx_f16m1(_v0, 2, vl);
+                vfloat16m1_t _val3 = vrgather_vx_f16m1(_v0, 3, vl);
+                vfloat16m1_t _val4 = vrgather_vx_f16m1(_v0, 4, vl);
+                vfloat16m1_t _val5 = vrgather_vx_f16m1(_v0, 5, vl);
+                vfloat16m1_t _val6 = vrgather_vx_f16m1(_v0, 6, vl);
+                vfloat16m1_t _val7 = vrgather_vx_f16m1(_v0, 7, vl);
+                tmpptr += 8;
+
                vfloat16m1_t _w0 = vle16_v_f16m1(kptr0, vl);
-                _sum0 = vfmacc_vf_f16m1(_sum0, val0, _w0, vl);
-                _sum1 = vfmacc_vf_f16m1(_sum1, val1, _w0, vl);
-                _sum2 = vfmacc_vf_f16m1(_sum2, val2, _w0, vl);
-                _sum3 = vfmacc_vf_f16m1(_sum3, val3, _w0, vl);
-                _sum4 = vfmacc_vf_f16m1(_sum4, val4, _w0, vl);
-                _sum5 = vfmacc_vf_f16m1(_sum5, val5, _w0, vl);
-                _sum6 = vfmacc_vf_f16m1(_sum6, val6, _w0, vl);
-                _sum7 = vfmacc_vf_f16m1(_sum7, val7, _w0, vl);
+                _sum0 = vfmacc_vv_f16m1(_sum0, _val0, _w0, vl);
+                _sum1 = vfmacc_vv_f16m1(_sum1, _val1, _w0, vl);
+                _sum2 = vfmacc_vv_f16m1(_sum2, _val2, _w0, vl);
+                _sum3 = vfmacc_vv_f16m1(_sum3, _val3, _w0, vl);
+                _sum4 = vfmacc_vv_f16m1(_sum4, _val4, _w0, vl);
+                _sum5 = vfmacc_vv_f16m1(_sum5, _val5, _w0, vl);
+                _sum6 = vfmacc_vv_f16m1(_sum6, _val6, _w0, vl);
+                _sum7 = vfmacc_vv_f16m1(_sum7, _val7, _w0, vl);

                kptr0 += packn;
+#endif
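
The hunk above swaps a vector-scalar multiply-accumulate for a lane broadcast followed by a vector-vector multiply-accumulate. The sketch below is a host-side model of that swap, not the PR's code: `float` lanes stand in for the fp16 registers, `kLanes = 8` is an assumed lane count, and every `model_*` name is hypothetical. It only checks that the two forms produce the same numbers. It says nothing about the performance gain the commit messages mention.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Host-side model only: float lanes stand in for the fp16 vector registers
// in the diff, and kLanes = 8 is an assumed lane count.
constexpr std::size_t kLanes = 8;
using Vec = std::array<float, kLanes>;

// Models the removed vector-scalar form: acc[i] += scalar * w[i].
Vec model_vfmacc_vf(Vec acc, float scalar, const Vec& w, std::size_t vl)
{
    for (std::size_t i = 0; i < vl && i < kLanes; i++)
        acc[i] += scalar * w[i];
    return acc;
}

// Models a gather with a scalar index: every active lane receives src[index].
// An out-of-range index is modelled as 0.
Vec model_vrgather_vx(const Vec& src, std::size_t index, std::size_t vl)
{
    Vec out{};
    const float picked = index < kLanes ? src[index] : 0.0f;
    for (std::size_t i = 0; i < vl && i < kLanes; i++)
        out[i] = picked;
    return out;
}

// Models the added vector-vector form: acc[i] += a[i] * b[i].
Vec model_vfmacc_vv(Vec acc, const Vec& a, const Vec& b, std::size_t vl)
{
    for (std::size_t i = 0; i < vl && i < kLanes; i++)
        acc[i] += a[i] * b[i];
    return acc;
}

// Returns true when, for every input lane k, the scalar path and the
// broadcast-then-vv path give identical accumulators.
bool paths_agree(const Vec& in, const Vec& w, std::size_t vl)
{
    for (std::size_t k = 0; k < kLanes; k++)
    {
        Vec via_vf = model_vfmacc_vf(Vec{}, in[k], w, vl);
        Vec via_vv = model_vfmacc_vv(Vec{}, model_vrgather_vx(in, k, vl), w, vl);
        if (via_vf != via_vv)
            return false;
    }
    return true;
}

// Small self-check with exactly representable values.
void run_checks()
{
    const Vec in = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f};
    const Vec w = {0.5f, 1.0f, 1.5f, 2.0f, 2.5f, 3.0f, 3.5f, 4.0f};
    assert(paths_agree(in, w, kLanes));
}
```

Calling `run_checks()` from a test driver exercises all eight lanes. The values are chosen to be exact in binary floating point so the comparison does not depend on rounding.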


Stick to the rvv-1.0 spec for intrinsic code, and define compatibility aliases for rvv-0.7 in riscv_v_071_fix.h.
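
One way to read this request is as a thin alias layer: the kernel is written once against the 1.0-style name, and only a compatibility header knows about the 0.7-style name. The sketch below is a host-side illustration under stated assumptions. The contents of riscv_v_071_fix.h are not shown in this thread, so everything in `namespace demo` is a hypothetical stand-in that merely borrows the two function names from the diff, and `float` lanes replace the real fp16 vector type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Assumption for this demo: pretend the build targets the 0.7.1 toolchain.
#ifndef RVV_SPEC_0_7
#define RVV_SPEC_0_7 1
#endif

namespace demo {

// Stand-in for a vector register; NOT the real vfloat16m1_t.
using vec_t = std::array<float, 8>;

// Shared lane-broadcast logic used by both stand-ins below.
inline vec_t gather_impl(const vec_t& v, std::size_t idx, std::size_t vl)
{
    vec_t out{};
    const float picked = idx < v.size() ? v[idx] : 0.0f;
    for (std::size_t i = 0; i < vl && i < out.size(); i++)
        out[i] = picked;
    return out;
}

#if RVV_SPEC_0_7
// Stand-in for what a 0.7-style toolchain would provide.
inline vec_t vrgathervx_float16xm1(const vec_t& v, std::size_t idx, std::size_t vl)
{
    return gather_impl(v, idx, vl);
}

// The kind of alias a compatibility header could hold (hypothetical):
// the 1.0-style name forwards to the 0.7-style one.
inline vec_t vrgather_vx_f16m1(const vec_t& v, std::size_t idx, std::size_t vl)
{
    return vrgathervx_float16xm1(v, idx, vl);
}
#else
// Stand-in for what a 1.0-style toolchain would provide directly.
inline vec_t vrgather_vx_f16m1(const vec_t& v, std::size_t idx, std::size_t vl)
{
    return gather_impl(v, idx, vl);
}
#endif

// Kernel-side code: a single path that uses only the 1.0-style name,
// with no #if around the call site.
inline vec_t broadcast_lane(const vec_t& v, std::size_t lane, std::size_t vl)
{
    return vrgather_vx_f16m1(v, lane, vl);
}

// Small self-check: lane 3 of the input is copied to every output lane.
inline void run_checks()
{
    const vec_t v = {10.0f, 11.0f, 12.0f, 13.0f, 14.0f, 15.0f, 16.0f, 17.0f};
    const vec_t b = broadcast_lane(v, 3, v.size());
    assert(b[0] == 13.0f);
    assert(b[7] == 13.0f);
}

} // namespace demo
```

With this layout the duplicated `#if RVV_SPEC_0_7 ... #else ... #endif` body in the hunk would collapse to the `#else` branch alone, which appears to be the intent of the review comment.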

xianyi and others added 8 commits April 24, 2021 20:25

Use RVV spec 0.7.1 for C906.

a0c7725

Fix code style issue.

ad455d9

Merge branch 'master' into riscv64_c906_d1

15f4919

Merge branch 'master' into riscv64_c906_d1

e64543a

Update convolution_sgemm_packn_fp16s.h

7b41bc4

RVV_SPEC_0_7 update

apply code-format changes

7733914

Update convolution_sgemm_packn_fp16s.h

98dae14

replace vfmacc_vf with vfmacc_vv

apply code-format changes

2eb7221

yaobyPerfxlab and others added 6 commits August 16, 2021 17:53

Update convolution_sgemm_packn_fp16s.h

0aafd98

replace vfmacc_vf with vfmacc_vv for better performance

apply code-format changes

858cd8c

Create convolution_sgemm_packn_fp16s.h

447d0d2

replace vfmacc_vf with vfmacc_vv for better performance

apply code-format changes

63c1a63

Update convolution_sgemm_packn_fp16s.h

a926e61

replace vfmacc_vf with vfmacc_vv for better performance

apply code-format changes

7cdbce0

nihui requested changes Aug 17, 2021

nihui closed this Oct 11, 2023

nihui reopened this Oct 11, 2023

github-actions bot added the riscv label Oct 11, 2023
