Complains about _clfftBakePlan not found when importing #52

Open
yves-surrel opened this issue Apr 21, 2021 · 8 comments

@yves-surrel

I am trying to install gpyfft on a new Mac Mini M1.

The sources built and installed without errors:

python setup.py build
python setup.py install

When importing in Python, I get:

In [1]: import gpyfft
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-3687f3cada1c> in <module>
----> 1 import gpyfft

~/Downloads/gpyfft-master/gpyfft/__init__.py in <module>
      4 
      5 from .version import __version__
----> 6 from .gpyfftlib import GpyFFT, GpyFFT_Error, Plan
      7 from .fft import *

ImportError: dlopen(/Users/wyselight/Downloads/gpyfft-master/gpyfft/gpyfftlib.cpython-38-darwin.so, 2): Symbol not found: _clfftBakePlan
  Referenced from: /Users/wyselight/Downloads/gpyfft-master/gpyfft/gpyfftlib.cpython-38-darwin.so
  Expected in: flat namespace
 in /Users/wyselight/Downloads/gpyfft-master/gpyfft/gpyfftlib.cpython-38-darwin.so

Any ideas?

@geggo
Owner

geggo commented Apr 21, 2021 via email

@yves-surrel
Author

yves-surrel commented Apr 21, 2021 via email

@geggo
Owner

geggo commented Apr 23, 2021

The error message "Symbol not found: _clfftBakePlan" indicates that the clFFT library could not be linked when importing the wrapper.

You could try 'otool -L gpyfftlib.cpython-38-darwin.so' to check which libraries the wrapper is linked against, and whether they are properly installed.

I don't know how the switch to the new architecture is handled. Try checking, e.g. via 'otool -hv yourlib.so', which architecture the gpyfft and clFFT libraries have been built for.
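
As a rough cross-check alongside otool, the architecture can also be read straight from the Mach-O header. The following is a minimal Python sketch, not part of gpyfft: the function name `macho_archs` is made up for illustration, and it only looks at the magic number and CPU type fields (constants as defined in Apple's `mach/machine.h` and `mach-o/fat.h`).

```python
import struct

# CPU type constants (CPU_TYPE_* with the 64-bit ABI flag where applicable).
CPU_TYPES = {
    0x00000007: "i386",
    0x01000007: "x86_64",
    0x0000000C: "arm",
    0x0100000C: "arm64",
}

MH_MAGIC = 0xFEEDFACE      # thin 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF   # thin 64-bit Mach-O
FAT_MAGIC = 0xCAFEBABE     # universal binary, header stored big-endian


def macho_archs(data):
    """Return the architecture names found in the leading bytes of a Mach-O file.

    Hypothetical helper: returns an empty list for anything it does not recognise.
    """
    if len(data) < 8:
        return []
    (magic_be,) = struct.unpack(">I", data[:4])
    if magic_be == FAT_MAGIC:
        # Universal binary: a count, then one 20-byte fat_arch entry per slice.
        (count,) = struct.unpack(">I", data[4:8])
        archs = []
        for i in range(count):
            start = 8 + 20 * i
            entry = data[start:start + 4]
            if len(entry) < 4:
                break
            (cpu,) = struct.unpack(">I", entry)
            archs.append(CPU_TYPES.get(cpu, hex(cpu)))
        return archs
    (magic_le,) = struct.unpack("<I", data[:4])
    if magic_le in (MH_MAGIC, MH_MAGIC_64):
        # Thin binary: the CPU type directly follows the magic number.
        (cpu,) = struct.unpack("<I", data[4:8])
        return [CPU_TYPES.get(cpu, hex(cpu))]
    return []
```

Calling it on the first few kilobytes of each file, e.g. `macho_archs(open(path, "rb").read(4096))` for both the gpyfft extension and the clFFT library, should show whether the two were built for the same architecture.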

Hope that helps,
Gregor

@yves-surrel
Author

You were right, the clFFT library was not linked.

I retried everything from the very beginning, and after more careful examination, I noticed the following when building gpyfft:

g++ -bundle -undefined dynamic_lookup -L/Users/wyselight/miniconda3/lib -arch x86_64 -L/Users/wyselight/miniconda3/lib -arch x86_64 -arch x86_64 build/temp.macosx-10.9-x86_64-3.8/gpyfft/gpyfftlib.o -L/Users/wyselight/clFFT-master/src/library -lclFFT -o build/lib.macosx-10.9-x86_64-3.8/gpyfft/gpyfftlib.cpython-38-darwin.so -stdlib=libc++
ld: warning: ignoring file /Users/wyselight/clFFT-master/src/library/libclFFT.dylib, building for macOS-x86_64 but attempting to link with file built for macOS-arm64

So the problem seems to come from the new hardware in the Mac Mini M1: gpyfft is being built for x86_64, while clFFT was built for arm64. What do you think?

@geggo
Owner

geggo commented Apr 27, 2021

Indeed, that explains the failure. It seems the Python you are using to build the gpyfft wrapper runs under x86_64 emulation. I read that a native Python (with NumPy!) is available (Anaconda with the conda-forge channel).
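
To see which architecture the interpreter itself runs as, something like the following sketch could help. The helper name `interpreter_arch_report` is hypothetical; `sysctl.proc_translated` is the macOS sysctl key that reports Rosetta translation, and the helper leaves that field as None when the key or the `sysctl` tool is not available.

```python
import platform
import subprocess
import sysconfig


def interpreter_arch_report():
    """Collect a few hints about the architecture this Python runs as."""
    report = {
        # 'x86_64' under emulation or on Intel, 'arm64' for a native M1 build.
        "machine": platform.machine(),
        # The platform tag used for build directories, e.g. 'macosx-10.9-x86_64'.
        "build_platform": sysconfig.get_platform(),
        # True/False if macOS could tell us about Rosetta, None otherwise.
        "rosetta": None,
    }
    try:
        out = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        report["rosetta"] = (out == "1")
    except (OSError, subprocess.CalledProcessError):
        # Not macOS, or the key does not exist on this system.
        pass
    return report


if __name__ == "__main__":
    for key, value in interpreter_arch_report().items():
        print(f"{key}: {value}")
```

An `x86_64` value for `machine` on an M1 would be consistent with the `macosx-10.9-x86_64` build directory seen in the compiler command earlier in this thread.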

If you succeed, I would be interested in the performance. Could you please post the result of running

python -m gpyfft.benchmark

@yves-surrel
Author

Good news! I succeeded in installing gpyfft, but I went the other way round: I forced clFFT to be compiled for the x86_64 architecture instead of installing a native Python (I was afraid of running into other problems with the numerous libraries I am using). After some googling around, I ran the following in the clFFT-master directory:

cd src
CMAKE_OSX_ARCHITECTURES=x86_64 cmake -G "Unix Makefiles"
make
sudo make install
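
After a rebuild like this, one quick sanity check is to ask the running interpreter whether it can load the library and resolve the symbol that was missing. This is only a sketch using ctypes: the helper name `exports_symbol` is invented, and the library name passed in is an assumption, since the actual install location depends on the clFFT build settings. Note that the dynamic loader reports the C function `clfftBakePlan` with a leading underscore, so the name is passed here without it.

```python
import ctypes


def exports_symbol(library, symbol):
    """Return True if `library` loads in this process and exports `symbol`.

    A library built for a different architecture fails to load, so this
    also returns False in the mismatch case described above.
    """
    try:
        lib = ctypes.CDLL(library)
    except OSError:
        return False
    # ctypes resolves the symbol lazily on attribute access.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    # Assumed library name; adjust to the real path of the installed clFFT.
    print(exports_symbol("libclFFT.dylib", "clfftBakePlan"))
```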

Here is the benchmark for OpenCL device 1 (the GPU). Using device 0 raises an INVALID_WORKGROUP_SIZE error, which I think is a well-known issue on Macs:

python -m gpyfft.benchmark
Choose platform:
[0] <pyopencl.Platform 'Apple' at 0x7fff0000>
Choice [0]:0
Choose device(s):
[0] <pyopencl.Device 'Apple M1' on 'Apple' at 0xffffffff>
[1] <pyopencl.Device 'Apple M1' on 'Apple' at 0x1027f00>
Choice, comma-separated [0]:1
Set the environment variable PYOPENCL_CTX='0:1' to avoid being asked again.
out of place transforms (1024, 1024) complex64
axes         in out
(-2, -1)     C   C  7.5e-04  0.93ms 112.29 Gflops
(-2, -1)     C   F  7.5e-04  0.77ms 135.68 Gflops
(-2, -1)     F   C  7.5e-04  0.63ms 167.29 Gflops
(-2, -1)     F   F  7.5e-04  0.87ms 120.15 Gflops
(-1, -2)     C   C  7.6e-04  0.86ms 121.51 Gflops
(-1, -2)     C   F  7.6e-04  0.63ms 165.40 Gflops
(-1, -2)     F   C  7.6e-04  0.74ms 142.16 Gflops
(-1, -2)     F   F  7.6e-04  0.72ms 144.66 Gflops
None         C   C  7.6e-04  0.87ms 119.87 Gflops
None         C   F  7.6e-04  0.66ms 159.64 Gflops
None         F   C  7.5e-04  0.61ms 170.71 Gflops
None         F   F  7.5e-04  0.87ms 120.64 Gflops
in place transforms (1024, 1024) complex64
(-2, -1)     C  0.73ms 144.39 Gflops
(-2, -1)     F  0.63ms 167.35 Gflops
(-1, -2)     C  0.60ms 175.42 Gflops
(-1, -2)     F  0.73ms 142.97 Gflops
None         C  0.72ms 144.75 Gflops
None         F  0.74ms 141.04 Gflops

For reference, here is the benchmark on my MBP 15" 2017, on the 'AMD Radeon Pro 555 Compute Engine' OpenCL device:

(base) MacBook-Pro-de-Yves:~ yves$ python -m gpyfft.benchmark
Choose platform:
[0] <pyopencl.Platform 'Apple' at 0x7fff0000>
Choice [0]:
Choose device(s):
[0] <pyopencl.Device 'Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz' on 'Apple' at 0xffffffff>
[1] <pyopencl.Device 'Intel(R) HD Graphics 630' on 'Apple' at 0x1024500>
[2] <pyopencl.Device 'AMD Radeon Pro 555 Compute Engine' on 'Apple' at 0x1021c00>
Choice, comma-separated [0]:2
Set the environment variable PYOPENCL_CTX=':2' to avoid being asked again.
out of place transforms (1024, 1024) complex64
axes         in out
(-2, -1)     C   C  7.4e-04 18.87ms   5.56 Gflops
(-2, -1)     C   F  7.4e-04 28.72ms   3.65 Gflops
(-2, -1)     F   C  7.4e-04 12.65ms   8.29 Gflops
(-2, -1)     F   F  7.4e-04  1.53ms  68.68 Gflops
(-1, -2)     C   C  7.8e-04  1.52ms  69.20 Gflops
(-1, -2)     C   F  7.8e-04 12.63ms   8.30 Gflops
(-1, -2)     F   C  7.8e-04 28.76ms   3.65 Gflops
(-1, -2)     F   F  7.8e-04 18.98ms   5.52 Gflops
None         C   C  7.8e-04  1.52ms  69.10 Gflops
None         C   F  7.8e-04 12.56ms   8.35 Gflops
None         F   C  7.4e-04 12.66ms   8.28 Gflops
None         F   F  7.4e-04  1.52ms  68.79 Gflops
in place transforms (1024, 1024) complex64
(-2, -1)     C 18.82ms   5.57 Gflops
(-2, -1)     F  1.52ms  69.15 Gflops
(-1, -2)     C  1.32ms  79.39 Gflops
(-1, -2)     F  8.85ms  11.85 Gflops
None         C  1.40ms  74.77 Gflops
None         F  1.32ms  79.35 Gflops

So it seems not too bad with this new Mac Mini M1 ;-)
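
The Gflops column in both runs appears consistent with the conventional 5 N log2(N) operation count for a complex FFT of N points, divided by the measured time. That formula is an assumption inferred from the numbers, not taken from the benchmark source. A small sketch to reproduce it (the function name `fft_gflops` is made up):

```python
import math


def fft_gflops(shape, seconds):
    """Estimate Gflops for a complex FFT over an array of the given shape.

    Assumes the conventional 5 * N * log2(N) flop count, N = total points.
    """
    n = math.prod(shape)
    return 5.0 * n * math.log2(n) / seconds / 1e9


if __name__ == "__main__":
    # Times taken from the (-2, -1) C/C out-of-place rows of the two runs.
    for label, ms in [("Apple M1", 0.93), ("Radeon Pro 555", 18.87)]:
        print(f"{label}: {fft_gflops((1024, 1024), ms * 1e-3):.2f} Gflops")
```

For the 18.87 ms entry this gives about 5.56 Gflops, in line with the table. Small deviations for the sub-millisecond entries are expected, because the printed times are rounded to two decimals.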

@geggo
Owner

geggo commented Apr 27, 2021

Excellent!
It seems you get decent performance, and your fix of switching the architecture when building clFFT is easy to apply. Thanks!

@psobolewskiPhD

FYI: you can get an arm64-native clFFT from Homebrew.
https://formulae.brew.sh/formula/clfft#default
Many native Python 3 packages are available via a miniforge3 conda environment and also via pip (including numpy, scipy, pyopencl, etc.).

I've been using the Python sub-variant of CLIJ2-clFFT by @bnorthan
https://github.com/clij/clij2-fft
with success for deconvolution.
