Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for RISC-V Vector Extension #405

Open
wants to merge 2 commits into
base: libpng16
Choose a base branch
from

Conversation

mschlaegl
Copy link

(corresponds to https://sourceforge.net/p/png-mng/mailman/message/37402426/ on the png-mng-implement mailing list)

Hello,

The long awaited RISC-V Vector Extension (RVV) is finally in
ratification.1 This pull request includes two changes:

  • INSTALL: added missing contrib content
    • Contains a simple fix in INSTALL but is necessary for
      applying the main patch
  • libpng: Add support for RISC-V Vector Extension
    • Support for RISC-V Vector Extension

It supports RVV versions: 0.7.1, 0.8, 0.9, 0.10 and 1.0.
Development and performance evaluation of the filter type
implementations (up, sub, avg and paeth) was done on Allwinner D1 with
RVV 0.7.1 using RVVRadar. Details on that can be found in an article
published on github
2.

The integration in libpng was done very similarly as for other
architectures (ARM NEON, MIPS MSA, ...): There exists a configure
switch "riscv-vector" which can be set to no/off, check, api, yes/on.
And there is also run-time checking code (like for ARM NEON,
MIPS MSA, ...).

More important are the differences:

  1. We introduce a separate configure switch "riscv-vector-compat"
  2. We use inline assembler instead of compiler intrinsics

Both is done for the same reason: To support multiple RVV versions.

As mentioned above, RVV 1.0 is under ratification now. But there are
earlier draft versions which are sporadically already implemented in
real existing hardware. One example is the Allwinner D1 with RVV 0.7.1
which was used for performance evaluation and optimization of these
filter implementations.2 We therefore decided to support the
following RVV versions: 0.7.1, 0.8, 0.9, 0.10 and 1.0.

Unfortunately, there is no easy way to detect the RVV version
supported by the toolchain. It was therefore necessary to add an
additional configure switch "riscv-vector-compat" with following
behavior:

  • not-set/off(default): RVV release 1.0
  • set to "0.10": RVV draft 0.10
  • set to "0.9": RVV draft 0.9
  • set to "0.8": RVV draft 0.8
  • set to "0.7.1": RVV draft 0.7.1

Furthermore compiler intrinsics were not available for all supported
RVV versions. For this reason we had to use inline assembler instead
of compiler intrinsics.

The filter implementations for all supported RVV versions were tested
in the following setups:

RVV Version Branch of riscv-gnu-toolchain3 Test system
1.0 "basic-rvv" qemu_sifive 4
0.10 "rvv-intrinsic" qemu_sifive 4
0.9 "rvv-0.9.x" Allwinner D1 (binary compatible)
0.8 "rvv-0.8.x" Allwinner D1 (binary compatible)
0.7.1 "rvv-0.7.1" Allwinner D1 (RVV 0.7.1)

References:

Footnotes

  1. https://riscv.org/announcements/2021/12/riscv-ratifies-15-new-specifications/

  2. https://github.com/mschlaegl/libpng_rvv-doc/blob/main/README.md 2

  3. https://github.com/riscv-collab/riscv-gnu-toolchain

  4. https://github.com/sifive/qemu/tree/rvv-1.0-upstream-v10 2

The long awaited RISC-V Vector Extension (RVV) is finally in
ratification.[1] With this we add support for RVV optimized png filter
types to libpng.

Development and performance evaluation of the filter type
implementations (up, sub, avg and paeth) was done on Allwinner D1 with
RVV 0.7.1 using RVVRadar. Details on that can be found in an article
published on github.[2]

The integration in libpng was done very similarly as for other
architectures (ARM NEON, MIPS MSA, ...): There exists a configure
switch "riscv-vector" which can be set to no/off, check, api, yes/on.
And there is also run-time checking code (like for ARM NEON,
MIPS MSA, ...).

More important are the differences:
 1. We introduce a separate configure switch "riscv-vector-compat"
 2. We use inline assembler instead of compiler intrinsics

Both is done for the same reason: To support multiple RVV versions.

As mentioned above, RVV 1.0 is under ratification now. But there are
earlier draft versions which are sporadically already implemented in
real existing hardware. One example is the Allwinner D1 with RVV 0.7.1
which was used for performance evaluation and optimization of these
filter implementations.[2] We therefore decided to support the
following RVV versions: 0.7.1, 0.8, 0.9, 0.10 and 1.0.

Unfortunately, there is no easy way to detect the RVV version
supported by the toolchain. It was therefore necessary to add an
additional configure switch "riscv-vector-compat" with following
behavior:
 * not-set/off(default): RVV release 1.0
 * set to "0.10": RVV draft 0.10
 * set to "0.9": RVV draft 0.9
 * set to "0.8": RVV draft 0.8
 * set to "0.7.1": RVV draft 0.7.1

Furthermore compiler intrinsics were not available for all supported
RVV versions. For this reason we had to use inline assembler instead
of compiler intrinsics.

The filter implementations for all supported RVV versions were tested
in the following setups:

 RVV     | Branch of              | Test system
 Version | riscv-gnu-toolchain[3] |
---------------------------------------------------------------------
 1.0     | "basic-rvv"            | qemu_sifive [4]
 0.10    | "rvv-intrinsic"        | qemu_sifive [4]
 0.9     | "rvv-0.9.x"            | Allwinner D1 (binary compatible)
 0.8     | "rvv-0.8.x"            | Allwinner D1 (binary compatible)
 0.7.1   | "rvv-0.7.1"            | Allwinner D1 (RVV 0.7.1)

References:
-----------
[1] https://riscv.org/announcements/2021/12/riscv-ratifies-15-new-specifications/
[2] https://github.com/mschlaegl/libpng_rvv-doc/blob/main/README.md
[3] https://github.com/riscv-collab/riscv-gnu-toolchain
[4] https://github.com/sifive/qemu/tree/rvv-1.0-upstream-v10
@ioquatix
Copy link
Contributor

I would welcome your patch in https://github.com/libpng/libpng

@mschlaegl
Copy link
Author

mschlaegl commented Jul 27, 2022

Thank you!
A new pull request is now created: libpng/libpng#3

@jbowler
Copy link
Contributor

jbowler commented Dec 29, 2023

@ctruta; I put a review on libpng/libpng#3

Summary is that there is some work to be done but the code apparently only works with linux and probably doesn't work on Android (this used to be linux but with no access to /proc/cpuinfo). There's going to be a big maintenance overhead with this code, in particular the stuff in 'contrib'.

I'm of the opinion that a run-time test is a bad idea; the person building libpng should know what the target hardware is and since, apparently, RISC-V has "options" the test on __riscv_vector doesn't work reliably. Possibly compiler support is incomplete.

@ctruta
Copy link
Member

ctruta commented Feb 26, 2024

@mschlaegl @dragostis

On the topic of the long-awaited RISC-V vector extension, I would like to tell you that the long-awaited time to have it integrated into libpng is finally here. It took me a loooong time (apologies!) to finish up other tasks, and I had the LoongArch LSX integrated in the meantime (with them waiting for years also...)

Thank you for your contributions. (And also, thank you @jbowler for the review.) I would like to merge a combination of both of your contributions. For now, I will close Merge Request 9 on SourceForge, and focus on this one, but I will acknowledge both of you.

Moving on, I have a couple of observations to make.

Firstly, I can see that Manfred's contribution has support for RVV draft 0.7.1, 0.8, 0.9, and release 1.0, whereas the contribution made by Dragoș only works with release 1.0 (or, at least, that is my impression). My inclination is to pick up the implementation made by Dragoș, because (I would presume) release 1.0 is the one that's important, going forward. However, I would like to defer to you the decision and its rationale.

Secondly, I can see that Dragoș submitted an optimized implementation for palette images, which is a nice extra. It appears to be modelled after ARM's riffled palette optimization, which is good. Again, I'd like to pick that one; but please let me know Manfred if you agree.

Thirdly, I can see that Manfred's implementation uses inline assembly, which appears to work alright, whereas Dragoș used C intrinsics via the riscv_vector.h header. I cannot test what Dragoș did, unfortunately, because I cannot compile it at all on my test system (Linux Mint 21, based on Ubuntu 22.04, with the riscv64-linux-gnu-gcc cross-compiler and the RISC-V build of Ubuntu 22.04 running in QEMU.) I personally tend to be in favour of C implementations with intrinsics, rather than inline assembly; but Manfred's implementation appears to work outright, whereas I appear to have difficulties with the implementation made by Dragoș.

Am I missing anything?

Both of your submissions can no longer apply to the present-day tip-of-tree. They need to be reworked, if only a little bit. I would highly appreciate it if you could please resubmit, and also let me know about your testing setup that I could reproduce on my machines. Your contributions are next in line for integration, really, I pinky-promise :-)

--
Cosmin

@ctruta ctruta closed this Feb 26, 2024
@ctruta ctruta reopened this Feb 26, 2024
@ctruta
Copy link
Member

ctruta commented Feb 26, 2024

So I pressed the "Close PR" button. (Oops!) I just reopened it.

@dragostis
Copy link

@ctruta, thanks for taking a look!

I no longer work at Google, but I'm happy to tell you how I got it working:

  1. Since RISC-V Vector support was not yet merged into Linux, it required merging a patch by SiFive that added support. Merging it and compiling the kernel was quite straight-froward as was updating the kernel to the patched version. This might not be needed now if the patch has already been merged ofc.
  2. clang 17 and GCC trunk support the intrinsics out of the box. More details here: https://github.com/riscv-non-isa/rvv-intrinsic-doc

If there's anything else I can help with, please let me know.

@ctruta
Copy link
Member

ctruta commented Feb 27, 2024

Since RISC-V Vector support was not yet merged into Linux, it required merging a patch by SiFive that added support. Merging it and compiling the kernel was quite straight-froward as was updating the kernel to the patched version. This might not be needed now if the patch has already been merged ofc.

Good to know. Well, I didn't have too much success with Debian, and for some reason the RISC-V cross-compilers are not available in Fedora. Fortunately, the next Ubuntu LTS is almost here, and I expect to have an easier time to just run a conventional Ubuntu 24.04 on my laptop, with QEMU on top, with minimal effort. (I do some Linux hacking every now and then, but I'm not that much of a Linux hacking expert, so waiting for their April release should be a sound plan for me.)

clang 17 and GCC trunk support the intrinsics out of the box. More details here: https://github.com/riscv-non-isa/rvv-intrinsic-doc

Also good to know. Then my plan with Ubuntu 24.04 should work even better, because clang 17 is out, and gcc 14 is slated to be in this upcoming distro. (According to DistroWatch.)

If there's anything else I can help with, please let me know.

Thank you. For now, I'd like to see what @mschlaegl has to say also. If I will prepare a commit, later, or if somebody else contributes a commit, then I might actually ping you @dragostis if you'd like to take an extra look on top of our usual reviews.

I no longer work at Google, but I'm happy to tell you how I got it working:

I guess the thing to say here is "good luck in your new adventure" 😉

@jbowler
Copy link
Contributor

jbowler commented Sep 19, 2024

@ctruta: libpng18: we need to isolate target specific code from the core of libpng and guarantee ongoing maintenance from the contributors. This can be done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants