opencl: fix couple crashes #12795

Merged: max-krasnyansky merged 2 commits into ggml-org:master from linehill:ocl-fix-crashes on May 21, 2025

@linehill
Contributor

@linehill linehill commented Apr 7, 2025

  • Fix kernel launch failures on devices that do not support non-uniform work-groups. When non-uniform work-groups are not supported, set `local_work_size` to NULL (= let the driver choose the work-group sizes). This patch does not cover everything, just the cases tested by test-backend-ops.

  • Fix sub-buffer creation failures caused by `cl_buffer_region::origin` not being aligned to `CL_DEVICE_MEM_BASE_ADDR_ALIGN`.
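
As an illustration of the second fix, here is a minimal sketch of rounding a sub-buffer origin up to the device's base address alignment. `CL_DEVICE_MEM_BASE_ADDR_ALIGN` is reported in bits, so it has to be converted to bytes first. The helper names `base_align_bytes` and `align_up` are hypothetical and not taken from the patch; the sketch avoids the OpenCL headers so it can be compiled on its own.

```c
#include <stddef.h>

/* Hypothetical helper: CL_DEVICE_MEM_BASE_ADDR_ALIGN is given in bits,
 * while cl_buffer_region::origin is a byte offset. */
static size_t base_align_bytes(unsigned int align_bits) {
    return (size_t)align_bits / 8;
}

/* Hypothetical helper: round `offset` up to the next multiple of `align`.
 * Assumes align > 0. */
static size_t align_up(size_t offset, size_t align) {
    size_t rem = offset % align;
    return rem == 0 ? offset : offset + (align - rem);
}
```

In a real backend the aligned value would be used as the `origin` of the `cl_buffer_region` passed to `clCreateSubBuffer`, with the parent buffer sized to account for the padding.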

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) label on Apr 7, 2025
@linehill
Contributor Author

linehill commented May 5, 2025

Gentle ping, @max-krasnyansky, @lhez. Is this PR good for landing?

@lhez
Contributor

lhez commented May 5, 2025

Thank you @linehill for the PR. I totally missed this. @max-krasnyansky seems out of office this week, so I think we will need to wait until he comes back.

Regarding the non-uniform workgroup size, what device did you use to observe the crash? Would like to try reproducing the crash.

CL_DEVICE_NON_UNIFORM_WORK_GROUP_SUPPORT was added in OpenCL 3.0 - I suppose querying it would cause an error (and hence a crash) on an OpenCL 2.0 device, correct?

@linehill
Contributor Author

linehill commented May 6, 2025

> Regarding the non-uniform workgroup size, what device did you use to observe the crash? Would like to try reproducing the crash.

The crash was encountered on Intel iGPU through PoCL's Level0 backend. PoCL-Level0 doesn't (yet) support non-uniform workgroup sizes so clEnqueueNDRangeKernel calls will return an error if the global sizes are not divisible by the local sizes.
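
The failure condition described here can be sketched as a small check that decides which `local_work_size` pointer to hand to `clEnqueueNDRangeKernel`. This is only an illustration: the function names are hypothetical, and keeping the requested sizes when they happen to divide the global sizes evenly is an assumption of this sketch, not necessarily what the patch does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when every global size is a multiple of the
 * matching local size, i.e. all work-groups have the same shape. */
static bool is_uniform(size_t dims, const size_t *global, const size_t *local) {
    for (size_t i = 0; i < dims; i++) {
        if (local[i] == 0 || global[i] % local[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Hypothetical helper: returns the pointer to pass as local_work_size.
 * NULL lets the OpenCL implementation choose the work-group sizes. */
static const size_t *pick_local_work_size(bool non_uniform_supported,
                                          size_t dims,
                                          const size_t *global,
                                          const size_t *local) {
    if (non_uniform_supported || is_uniform(dims, global, local)) {
        return local;
    }
    return NULL;
}
```

Passing NULL gives up control over the work-group shape, so a kernel that depends on a specific local size would need a different fallback, such as padding the global size.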

> CL_DEVICE_NON_UNIFORM_WORK_GROUP_SUPPORT was added in OpenCL 3.0 - I suppose querying it would cause an error (and hence a crash) on an OpenCL 2.0 device, correct?

I missed this one - it should be fixed now.
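
One way to gate the query on the OpenCL version, sketched under the assumption that the device version string has already been fetched with `clGetDeviceInfo(..., CL_DEVICE_VERSION, ...)`. The OpenCL specification defines that string as `OpenCL <major>.<minor> <vendor-specific information>`. The function names below are hypothetical, and the actual commit may detect the version differently.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: parse "OpenCL <major>.<minor> ..." as returned for
 * CL_DEVICE_VERSION. Returns false if the string does not match. */
static bool parse_cl_version(const char *s, int *major, int *minor) {
    return sscanf(s, "OpenCL %d.%d", major, minor) == 2;
}

/* Hypothetical helper: only OpenCL 3.0+ devices define
 * CL_DEVICE_NON_UNIFORM_WORK_GROUP_SUPPORT, so skip the query otherwise. */
static bool can_query_non_uniform(const char *device_version) {
    int major = 0;
    int minor = 0;
    if (!parse_cl_version(device_version, &major, &minor)) {
        return false;
    }
    return major >= 3;
}
```

When the query is skipped, the caller still has to pick a default for whether non-uniform work-groups are usable; which default is safe depends on the devices being targeted.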

@lhez
Contributor

lhez commented May 14, 2025

@max-krasnyansky ping - I think this PR should be good to merge.

@max-krasnyansky
Member

Sorry for the delay. Approving & merging now.

@max-krasnyansky max-krasnyansky merged commit edbf42e into ggml-org:master May 21, 2025
42 of 46 checks passed
infil00p pushed a commit to baseweight/llama.cpp that referenced this pull request May 22, 2025
* opencl: fix couple crashes

* fix kernel launches failed on devices which do not support
  non-uniform work-groups. When non-uniform work-groups are not
  supported, set `local_work_size` to NULL (= let driver choose the
  work-group sizes). This patch does not cover everything - just the
  cases tested by test-backend-ops.

* fix sub-buffer creation failed due to `cl_buffer_region::origin` not
  being aligned to `CL_DEVICE_MEM_BASE_ADDR_ALIGN`.

* OpenCL: query non-uniform WG sizes only on OpenCL 3.0+
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026