SYCL runtime: Severe host overhead in sycl::get_kernel_bundle #15824

For platform compatibility, we no longer launch kernels with the device's maximum work-group size. Instead, we query the kernel-specific maximum work-group size through the SYCL API. Our code looks like this:
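```cpp
  // Look up the kernel's identifier, then get an executable bundle that contains it.
  auto kid = ::sycl::get_kernel_id<KernelClass>();
  auto kbundle = ::sycl::get_kernel_bundle<::sycl::bundle_state::executable>(
      ctx, {dev}, {kid});
  ::sycl::kernel k = kbundle.get_kernel(kid);
  // Query the maximum work-group size for this kernel on this device (returns size_t).
  size_t max_work_group_size =
      k.get_info<::sycl::info::kernel_device_specific::work_group_size>(dev);
```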
We found that this usage adds significant host overhead in our application. We measured the host (CPU) time of each call for one kernel. Each API name in the table corresponds to a call in the example code above:

API | get_kernel_id | get_kernel_bundle | get_kernel | get_info
-- | -- | -- | -- | --
Time (us) | 0.434 | 42.481 | 4.241 | 1.125


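For anyone who wants to reproduce per-call numbers like those in the table, here is a minimal sketch of a host-side timing helper. It is an assumption about methodology, not necessarily how the numbers above were collected. The helper name `average_time_us` is hypothetical, and it uses only standard C++ so it can wrap any of the SYCL calls in a lambda.

```cpp
#include <chrono>

// Hypothetical helper: runs `fn` `iterations` times and returns the
// average wall-clock time per call in microseconds.
// Assumes iterations > 0.
template <typename Fn>
double average_time_us(Fn&& fn, int iterations) {
  using clock = std::chrono::steady_clock;
  const auto start = clock::now();
  for (int i = 0; i < iterations; ++i) {
    fn();
  }
  const auto stop = clock::now();
  const std::chrono::duration<double, std::micro> elapsed = stop - start;
  return elapsed.count() / iterations;
}
```

For example, `average_time_us([&] { ::sycl::get_kernel_id<KernelClass>(); }, 1000)` would give the average host time of that one call. Averaging over many iterations hides any first-call cost, so a single cold call may need to be timed separately.
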
We have also filed an internal Jira ticket to track this issue. Could you help evaluate this slow performance?
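
In the meantime, one workaround we could apply on our side is to cache the queried value, so that the `get_kernel_bundle` cost is paid once per kernel instead of on every launch. The sketch below is only an illustration under assumptions: `WorkGroupSizeCache`, `DemoKernelA`, and `DemoKernelB` are hypothetical names, it assumes a single device and context (the key is the kernel type only), and the `query` callable stands in for the four SYCL calls shown earlier.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Hypothetical kernel name types, standing in for real SYCL kernel classes.
struct DemoKernelA {};
struct DemoKernelB {};

// Hypothetical cache: remembers one work-group size per kernel type.
// Assumption: one device and one context, so the kernel type alone is the key.
class WorkGroupSizeCache {
 public:
  // `query` performs the expensive lookup; it runs only on a cache miss.
  template <typename KernelClass>
  std::size_t get(const std::function<std::size_t()>& query) {
    const std::type_index key(typeid(KernelClass));
    std::lock_guard<std::mutex> lock(mutex_);
    const auto it = cache_.find(key);
    if (it != cache_.end()) {
      return it->second;  // Hit: skip the expensive query.
    }
    const std::size_t value = query();
    cache_.emplace(key, value);
    return value;
  }

 private:
  std::mutex mutex_;
  std::unordered_map<std::type_index, std::size_t> cache_;
};
```

With several devices, the key would also need to include the device. This only hides the cost after the first query for each kernel, so the overhead of `get_kernel_bundle` itself still needs to be evaluated.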
