[BUG] LinkableCode not working with CtypesLinker #101

Open

gmarkall opened this issue Jan 2, 2025 · 0 comments
Labels
bug Something isn't working

gmarkall commented Jan 2, 2025

Describe the bug
Use of LinkableCode subclasses (e.g. PTXSource) only works with pynvjitlink; it fails with the CtypesLinker.

Steps/Code to reproduce bug

Executing the following:

from numba import cuda
import numpy as np


ptx = cuda.PTXSource("""
.version 6.5
.target sm_52
.address_size 64

.visible .func  (.param .b32 status) double_an_int32(
        .param .b64 retval,
        .param .b32 arg
)
{
        .reg .b32       %r<4>;
        .reg .b64       %retvalptr;

        ld.param.u64    %retvalptr, [retval];
        ld.param.u32    %r1, [arg];
        shl.b32         %r2, %r1, 1;
        st.u32          [%retvalptr], %r2;
        mov.u32         %r3, 0;
        st.param.b32    [status], %r3;
        ret;
}""".encode())

double_an_int32 = cuda.declare_device("double_an_int32",
                                      "int32(int32)")


@cuda.jit(link=[ptx])
def k(x):
    x[0] = double_an_int32(x[0])


x = np.ones(1, dtype=np.int32)
print(x[0])
k[1, 1](x)
print(x[0])

results in

$ python repro.py 
1
/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/dispatcher.py:579: NumbaPerformanceWarning: Grid size 1 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
Traceback (most recent call last):
  File "/home/gmarkall/numbadev/issues/9470/repro.py", line 38, in <module>
    k[1, 1](x)
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/dispatcher.py", line 582, in __call__
    return self.dispatcher.call(args, self.griddim, self.blockdim,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/dispatcher.py", line 724, in call
    kernel = _dispatcher.Dispatcher._cuda_call(self, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/dispatcher.py", line 732, in _compile_for_args
    return self.compile(tuple(argtypes))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/dispatcher.py", line 979, in compile
    kernel.bind()
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/dispatcher.py", line 238, in bind
    self._codelibrary.get_cufunc()
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/codegen.py", line 248, in get_cufunc
    cubin = self.get_cubin(cc=device.compute_capability)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/codegen.py", line 227, in get_cubin
    self._link_all(linker, cc, ignore_nonlto=False)
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/codegen.py", line 192, in _link_all
    linker.add_file_guess_ext(path, ignore_nonlto)
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/cudadrv/driver.py", line 2758, in add_file_guess_ext
    self.add_data(
    ^^^^^^^^^^^^^
AttributeError: 'CtypesLinker' object has no attribute 'add_data'. Did you mean: 'add_ptx'?
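
The AttributeError points at the gap: CtypesLinker has add_ptx but no add_data, which is what add_file_guess_ext now calls for in-memory LinkableCode objects. As a rough illustration of one possible direction (not a tested patch), an add_data method could forward PTX to the existing add_ptx; the (data, kind, name) signature and the representation of kind are assumptions here, not confirmed API:

# Sketch only: assumes add_data is called as add_data(data, kind, name) and
# that PTX is identified either by FILE_EXTENSION_MAP["ptx"] or the plain
# string "ptx"; both are guesses at the in-memory LinkableCode kind.
from numba.cuda.cudadrv import driver


def _ctypes_add_data(self, data, kind, name=None):
    if kind in (driver.FILE_EXTENSION_MAP["ptx"], "ptx"):
        # The ctypes linker can already consume PTX from memory.
        self.add_ptx(data, name or "<cudapy-ptx>")
    else:
        # Other kinds (cubin, fatbin, object, archive, ...) would need
        # cuLinkAddData support or a clear error message.
        raise NotImplementedError(f"cannot link in-memory data of kind {kind!r}")


driver.CtypesLinker.add_data = _ctypes_add_data  # monkey-patch for illustration only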

Expected behavior

The code should execute to completion, producing:

$ python repro.py 
1
/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/dispatcher.py:579: NumbaPerformanceWarning: Grid size 1 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/cudadrv/devicearray.py:890: NumbaPerformanceWarning: Host array used in CUDA kernel will incur copy overhead to/from device.
  warn(NumbaPerformanceWarning(msg))
2

which it only does when NUMBA_CUDA_ENABLE_PYNVJITLINK=1.
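
Until that is resolved, one possible workaround (an assumption based on the existing file-based linking path, not something verified in this report) is to write the PTX to a .ptx file and pass the path to link=, so the ctypes linker goes through its file handling instead of the missing add_data:

import tempfile

from numba import cuda
import numpy as np

ptx_text = """<the PTX body from the reproducer above>"""

# Write the PTX to disk so it can be linked by path rather than from memory.
with tempfile.NamedTemporaryFile("w", suffix=".ptx", delete=False) as f:
    f.write(ptx_text)
    ptx_path = f.name

double_an_int32 = cuda.declare_device("double_an_int32", "int32(int32)")


@cuda.jit(link=[ptx_path])  # link by file path instead of a PTXSource object
def k(x):
    x[0] = double_an_int32(x[0])


x = np.ones(1, dtype=np.int32)
k[1, 1](x)
print(x[0])  # expected: 2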

Environment details (please complete the following information):

numba-cuda 0.2.0, Linux.

Additional context

It's not clear whether all the LinkableCode classes should be usable without pynvjitlink, but the following should be done when resolving this issue:

  • Audit the LinkableCode classes to determine which can be made to work with the Ctypes linker.
  • Add tests of LinkableCode subclasses that run with both the Ctypes and pynvjitlink linkers (a rough test sketch follows this list).
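
A rough shape for such tests (the subprocess approach and the use of NUMBA_CUDA_ENABLE_PYNVJITLINK to select the linker are assumptions based on this report, not existing numba-cuda test helpers):

import os
import subprocess
import sys
import unittest

REPRO = "repro.py"  # the reproducer from this issue, saved next to the test


class TestLinkableCodeWithBothLinkers(unittest.TestCase):
    def _run_repro(self, pynvjitlink):
        # Run in a subprocess so the linker selection is picked up fresh.
        env = dict(os.environ)
        env["NUMBA_CUDA_ENABLE_PYNVJITLINK"] = "1" if pynvjitlink else "0"
        return subprocess.run([sys.executable, REPRO], env=env,
                              capture_output=True, text=True)

    def test_ptx_source_with_ctypes_linker(self):
        res = self._run_repro(pynvjitlink=False)
        self.assertEqual(res.returncode, 0, msg=res.stderr)

    def test_ptx_source_with_pynvjitlink(self):
        res = self._run_repro(pynvjitlink=True)
        self.assertEqual(res.returncode, 0, msg=res.stderr)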