diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md new file mode 100644 index 0000000000000..572977f8bd4c9 --- /dev/null +++ b/sycl/doc/design/VirtualFunctions.md @@ -0,0 +1,539 @@ +# Implementation design for sycl_ext_oneapi_virtual_functions + +Corresponding language extension specification: +[sycl_ext_oneapi_virtual_functions][1] + +## Overview + +Main complexity of the feature comes from its co-existence with optional kernel +features ([SYCL 2020 spec][sycl-spec-optional-kernel-features], +[implementation design][optional-kernel-features-design]) mechanism. Consider +the following example: + +```c++ +using syclext = sycl::ext::oneapi::experimental; + +struct set_fp64; + +struct Base { + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + void foo() {} + + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable_in) + void bar() { + // this virtual function uses double + double d = 3.14; + } +}; + +class Constructor; +class Use; +class UseFP64; + +int main() { + // Selected device may not support 'fp64' aspect + sycl::queue Q; + + Base *Obj = sycl::malloc_device(1, Q); + int *Result = sycl::malloc_shared(2, Q); + + Q.single_task([=] { + // Only placement new can be used within device functions. + // When an object of a polymorphic class is created, its vtable is filled + // with pointer to virtual member functions. However, we don't always know + // features supported by a target device (in case of JIT) and therefore + // can't decide whether both 'foo' and 'bar' should be included in the + // resulting device image - the decision must be made at runtime when we + // know the target device. + new (Obj) Base; + }); + + // The same binary produced by a sycl compiler should correctly work on both + // devices with and without support for 'fp64' aspect. + Q.single_task(syclext::properties{syclext::assume_inddirect_calls}, [=]() { + Obj->foo(); + }); + + if (Q.get_device().has(sycl::aspect::fp64)) { + Q.single_task(syclext::properties{syclext::assum_indirect_calls_to}, + [=]() { + Obj->bar(); + }); + } + + return 0; +} +``` + +As comments in the snippet say the main issue is with vtables: at compile time +it may not be clear which exact functions can be safely included in there and +which are not in order to avoid speculative compilation and fulfill optional +kernel features requirements from the SYCL 2020 specification. + +To solve this, the following approach is used: all virtual functions marked with +`indirectly_callable_in` property are grouped by set they belong to and outlined +into separate device images (i.e. device images with kernels using them are left +with declarations only of those virtual functions). + +For each device image with virtual functions that use optional features we also +create a "dummy" version of it where bodies of all virtual functions are +emptied. + +Dependencies between device images are recorded in properties based on +`assume_indirect_calls_to` and `indirectly_callable_in` properties. They are +used later by runtime to link them together. Device images which depend on +optional kernel features are linked only if those features are supported by a +target device and dummy versions of those device images are used otherwise. + +This way we can emit single unified version of LLVM IR where vtables reference +all device virtual functions, but their definitions are outlined and linked +back dynamically based on device capabilities. + +For AOT flow, we don't do outlining and dynamic linking, but instead do direct +cleanup of virtual functions which are incompatible with a target device. + +## Design + +### Changes to the SYCL header files + +New compile-time properties `indirectly_callable_in` and +`assume_indirect_calls_to` should be implemented in accordance with the +corresponding [design document][2]: + +- `indirectly_callable_in` property should lead to emission of + `"indirectly-callable"="set"` function attribute, where "set" is a string + representation of the property template parameter. +- `assume_indirect_calls_to` property should lead to emission of + `"calls-indirectly"="set1,set2"`, where "set1" and "set2" are string + representations of the property template parameters. + +In order to convert a type to a string, [\__builtin_sycl_unique_stable_name][3] +could be used. + +The `assume_indirect_calls_to` compile-time property accepts a list of types +which identify virtual functions set. It can be handled using metaprogramming +magic to compile-time concatenate strings to produce a single value out of a set +of parameters. Similar approach is used to handle `reqd_work_group_size` and +other compile-time properties that accept integers: + +```c++ +// Helper to hide variadic list of arguments under a single type +template struct CharList {}; + +// Helper to concatenate several lists of characters into a single string. +// Lists are separated from each other with comma within the resulting string. +template struct ConcatenateCharsToStr; + +// Specialization for a single list +template struct ConcatenateCharsToStr> { + static constexpr char value[] = {Chars..., '\0'}; +}; + +// Specialization for two lists +template +struct ConcatenateCharsToStr, CharList> + : ConcatenateCharsToStr> {}; + +// Specialization for the case when there are more than two lists +template +struct ConcatenateCharsToStr, CharList, + Rest...> + : ConcatenateCharsToStr, + Rest...> {}; + +// Helper to convert type T to a list of characters representing the type (its +// mangled name). +template struct StableNameToCharsHelper { + using chars = CharList<__builtin_sycl_unique_stable_name(T)[Indices]...>; +}; + +// Wrapper helper for the struct above +template struct StableNameToChars; + +// Specialization of that wrapper helper which accepts sequence of integers +template +struct StableNameToChars> + : StableNameToCharsHelper {}; + +// Top-level helper, which should be used to convert list of typenames into a +// string that contains comma-separated list of their string representations +// (mangled names). +template struct PropertyValueHelper { + static constexpr const char *name = "my-fancy-attr"; + static constexpr const char *value = + ConcatenateCharsToStr>::chars...>::value; +}; + +// Example usage: +SYCL_EXTERNAL +[[__sycl_detail__::add_ir_attributes_function( + PropertyValueHelper::name, + PropertyValueHelper::value)]] void +foo() { + // Produced LLVM IR: + // define void @_Z3foov() #0 { ... } + // attributes #0 = { "my-fancy-attr"="_ZTSv,_ZTSi" ... } +} + +``` + +### Changes to the compiler front-end + +Most of the handling for virtual functions happens in middle-end and thanks to +compile-time properties, no extra work is required to propagate necessary +information down to passes from headers. + +However, we do need to filter out those virtual functions which are not +considered to be device as defined by the [extension specification][1], such +as: + +- virtual member functions annotated with `indirectly_callable_in` compile-time + property should be emitted into device code; +- virtual member function *not* annotated with `indirectly_callable_in` + compile-time property should *not* be emitted into device code; + +To achieve that, the front-end should implicitly add `sycl_device` attribute to +each function which is marked with the `indirectly_callable_in` attribute. This +can be done during handling of `[[__sycl_detail__::add_ir_attributes_function]]` +attribute by checking if one of string literals passed in there is an attribute +name argument name to "indirectly_callable". Later the `sycl_device` attribute +can be used to decide if a virtual function should be emitted into device code. + +When emitting virtual calls, front-end should emit an extra `virtual-call` LLVM +IR attribute at every call site. This attribute will be used by a middle-end +pass to check that there are no virtual function calls in kernels _not_ marked +with the `calls_indirectly` property and emit a diagnostic about that. + +### Changes to the compiler middle-end + +#### Aspects propagation + +Aspects propagation pass should be extended to not only gather aspects which are +used directly, but also aspects that are used indirectly, through virtual +functions. + +For that the pass should compile a list of aspects used by each set of +indirectly callable functions (as defined by `indirectly_callable_in` property +set by user) and then append those aspects to every kernel which use those sets +(as defined by `assume_indirect_calls_to` property set by user). + +**TODO**: should we consider outlining "indirectly used" aspects into a separate +metadata and device image property? This should allow for more precise and +user-friendly exceptions at runtime + +NOTE: if the aspects propagation pass is ever extended to track function +pointers, then aspects attached to virtual functions **should not** be attached +to kernels using this mechanism. For example, if a kernel uses a variable, +which is initialized with a function pointer to a virtual function which uses +an aspect, then such kernel **should not** be considered as using that aspect. +Properties-based mechanism which is described above should be used for aspects +propagation for virtual functions. + +To illustrate this, let's once again consider the example from Overview section +which is copied below for convenience: + +```c++ +using syclext = sycl::ext::oneapi::experimental; + +struct set_fp64; + +struct Base { + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable + void foo() {} + + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable_in) + void bar() { + // this virtual function uses double + double d = 3.14; + } +}; + +class Constructor; +class Use; +class UseFP64; + +int main() { + // Selected device may not support 'fp64' aspect + sycl::queue Q; + + Base *Obj = sycl::malloc_device(1, Q); + int *Result = sycl::malloc_shared(2, Q); + + Q.single_task([=]() { + // Even though at LLVM IR level this kernel does reference 'Base::foo' + // and 'Base::bar' through global variable containing `vtable` for `Base`, + // we do not consider the kernel to be using `fp64` optional feature. + new (Obj) Base; + }); + + Q.single_task(syclext::properties{syclext::assume_indirect_calls}, [=]() { + // This kernel is not considered to be using any optional features, because + // virtual functions in default set do not use any. + Obj->foo(); + }); + + if (Q.get_device().has(sycl::aspect::fp64)) { + Q.single_task(syclext::properties{syclext::assume_indirect_calls_to}, + [=]() { + // This kernel is considered to be using 'fp64' optional feature, because + // there is a virtual function in 'set_fp64' which uses double. + Obj->bar(); + }); + } + + return 0; +} +``` + +This way, "Constructor" kernel(s) won't pull optional features +requirements from virtual functions it may reference through vtable, making it +independent from those. This allows to launch such kernels on wider list of +devices, even though there could be virtual functions which require optional +features. + +"Use" kernel(s) do pull optional features requirements from virtual functions +they may call through `calls_indirectly` property and associated sets. This +enables necessary runtime diagnostics that a kernel is not submitted to a device +which doesn't support all required optional features. + +#### New compiler diagnostics + +A new pass should be added to analyze virtual calls and emit diagnostics if a +kernel without the `assume_indirect_calls_to` property performs a virtual call +and emit a diagnostic about that. `virtual-call` LLVM IR attribute we attach to +such call instructions should help us with detecting those calls. + +The pass should be launched somewhere at the beginning of the optimization +pipeline so that LLVM IR is as close to the input source file as possible for +better diagnostics. + +#### Device code split and device images + +The extension specification restricts implementation from raising a diagnostic +when a kernel that is not marked with `calls_indirectly` kernel property creates +an object of a polymorphic class where some virtual functions use optional +kernel features incompatible with a target device. + +Consider the following example: + +```c++ +using syclext = sycl::ext::oneapi::experimental; + +struct fp64_set; +struct regular_set; + +struct Foo { +virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) void foo() { + // uses double + double d = 3.14; +} + +virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) void bar() {} +}; + +sycl::queue q; + +auto *Storage = sycl::malloc_device(1, q); + +q.single_task([=] { + // The kernel is not submitted with 'assume_indirect_calls_to' property and therefore + // it is not considered to be using any of virtual member functions of 'Foo'. + // This means that the object of 'Foo' can be successfully created by this + // kernel, regardless of whether a target device supports 'fp64' aspect which + // is used by 'Foo::foo'. + // No exceptions are expected to be thrown. + new (Storage) Foo; +}); + +if (q.get_device().has(sycl::aspect::fp64)) { + auto props = syclext::properties{syclext::assume_indirect_calls_to}; + q.single_task(props, [=]() { + Storage->foo(); + }); +} else { + auto props = syclext::properties{syclext::assume_indirect_calls_to}; + q.single_task(props, [=]() { + Storage->bar(); + }); +} +``` + +This example should work regardless of whether target device supports 'fp64' +aspect or not. Implementation differs for JIT and AOT flows. + +##### JIT flow + +Regardless of device code split mode selected by a user, functions marked with +`indirectly_callable_in` property should be outlined into separate device images +by `sycl-post-link` tool based on the argument of the `indirectly_callable_in` +property, i.e. all functions from the same set should be bundled into a +dedicated device image. + +**TODO**: as an optimization, we can consider preserving virtual functions from +sets that do not use any optional kernel features. + +Virtual functions in the original device image should be turned into +declarations instead of definitions. + +Additionally, if any virtual function in such device image uses any optional +kernel features, then the whole image should be cloned with all function bodies +emptied. This cloned device image will be further referred to as "dummy virtual +functions device image". + +This dummy device image is needed to support the example showed above when a +kernel creates an object of a polymorphic class where some of virtual functions +use optional features. LLVM IR generated by front-end will contain a vtable, +which references all methods of the class. However, not all of them can be +directly included into kernel's device image to avoid speculative compilation. + +When such kernel is submitted to a device, runtime will check which optional +features are supported and link one or another device image with virtual +functions. + +##### AOT flow + +In AOT mode, there will be no dynamic linking, but at the same time we know the +list of supported optional features by a device thanks to +[device config file][device-config-file-design]. + +Therefore, `sycl-post-link` should read the device config file to determine list +of optional features supported by a target and based on that drop all virtual +functions from sets that use unsupported optional features. + +Note that we are making decisions not based on which aspects are used by each +individual virtual functions, but based on which aspects are used by a set of +virtual functions (as identified by the `indirectlly_callable` property +argument). The latter is computed as conjunction of aspects used by each +virtual function within a set. + +The behavior is defined this way to better match the extension specification +which defines virtual functions availability in terms of whole sets and not +individual functions. + +#### New device image properties + +To let runtime know which device images should be linked together to get virtual +functions working, new property set is introduced: "SYCL/virtual functions". + +NOTE: in AOT mode, every device image is already self-contained and contains +the right (supported by a device) set of virtual functions in it. Therefore, we +do not need to emit any of those properties when we are in AOT mode. + +For device images, which contain virtual functions (i.e. ones produced by +outlining `indirectly_callable_in` functions into a separate device image), the +following properties are set within the new property set: +- "virtual-functions-set" with a string value containing name of virtual + functions set contained within the image (value of the property argument); +- "dummy-image=1" if an image is a dummy virtual functions device image; + +For other device images (i.e. ones containing actual user-provided kernels): +- "uses-virtual-functions-set" with a string value containing comma-separated + list of names of virtual function sets used by kernels in the image. + +For the purposes of generating "uses-virtual-functions-set" device image +property value the fact that kernel uses a set of virtual functions is inferred +based on two things: +- kernel is set to explicitly use a set of virtual functions through + `assume_indirect_calls_to` property; +- kernel constructs an object of a polymorphic class and thus references vtable + global variable which in turn references functions that belong to some sets; + +### Changes to the runtime + +When a kernel submitted to a device comes from a device image with some +properties set in "SYCL/virtual functions" property set, then runtime does some +extra actions to link several device images together to ensure that the kernel +can be executed. + +Let's say that a submitted kernel is from device image that has property +"uses-virtual-functions-set=A,B,...,N" on it, then the following other device +images are linked together with it: +- all device images with "virtual-functions-set" property equal to "A", "B", + ..., "N" and *without* "dummy-image=1" property on it: + - if that device image is compatible with a device, it is taken to be linked + with the initial device image; + - otherwise, runtime looks for a device image with the same + "virtual-functions-set" property, but *with* "dummy-image=1" property on it + and takes that device image to be linked with the initial device image; +- all other device images with "uses-virtual-functions-set" property equal to + "A", "B", ..., "N" if they are compatible with a device. Note that this + triggers further recursive search for device images that should be linked + together, i.e. runtime should keep track of which device images have already + been looked at to avoid entering an infinite recursion; + +If for any used virtual functions set there is no device image that provides +virtual functions from it, the runtime should throw an exception, because that +is likely a user error (missing or misspelled `indirectly_callable_in` property +on a virtual function). + +Produced list of device images is then linked together and used to enqueue a +kernel. + +NOTE: when shared libraries are involved, they could also provide some +`indirectly_callable_in` functions in the same sets as application. This means +that there could be more than one image registered with the same value of +"virtual-functions-set" property. + +#### In-memory cache of kernels and programs + +It is very important that all kernels that use virtual functions from the same +set and operate (construct and perform calls) on the same objects are bundled +into the same program. If that program changes somewhere in between an object +construction and virtual call, it will lead to undefined behavior because of +invalidated vtable pointers. + +Therefore, in-memory cache eviction mechanism should be updated not to evict +kernels that use virtual functions, because otherwise it will lead to functional +issues. + +NOTE: in our experience we have only encountered a situation where in-memory +cache eviction was required with SYCL CTS test for specialization constants, +which is very heavy. Therefore, it is not expected that any changes to in-memory +cache eviction mechanism will be needed any time soon. + +## Design alternatives + +Discussions over this feature resulted in suggestion for an alternative +implementation that would lift some of the usage restrictions of virtual +functions, but they require more time for investigation and analysis than we +currently have and therefore information below is recorded as a potential +future changes to this design. + +### Do not record an absolute address of a vtable in an object + +One of the significant limitations of the design outlined above is that if a +device image got recompiled in-between object creation and virtual call, then +vtable pointer stored in an object is invalidated. Such re-compilation could +happen if specialization constant value was changed, for example. + +As a possible solution to lift that limitation, we could have recorded an index +of a vtable instead of its address into an object. We will need to change the +LLVM IR we emit for object construction and making virtual function call, but +it will allow to avoid invalidating of vtable pointer on device image +recompilation. + +To introduce an order to vtables, we could generate a couple of helper functions +to map between vtable and its index and vice-versa. + +Theoretically, this solution could be extended further to make sure that vtable +index is still accessible even if an object is passed between different device +images: if we make sure to include every vtable into every device image and +somehow maintain the stable order of those. + +There are many questions that need to be explored and answered and therefore +this implementation design is not being immediately proposed, but it sounds like +a promising direction to lift some of existing limitations and improve user +experience. + +[1]: <../extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc> +[2]: +[3]: https://clang.llvm.org/docs/LanguageExtensions.html#builtin-sycl-unique-stable-name +[sycl-spec-optional-kernel-features]: https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:optional-kernel-features +[optional-kernel-features-design]: +[device-config-file-design]: + diff --git a/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc new file mode 100644 index 0000000000000..09bc0d51cbba1 --- /dev/null +++ b/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc @@ -0,0 +1,714 @@ += sycl_ext_oneapi_virtual_functions + +:source-highlighter: coderay +:coderay-linenums-mode: table + +// This section needs to be after the document title. +:doctype: book +:toc2: +:toc: left +:encoding: utf-8 +:lang: en +:dpcpp: pass:[DPC++] + +// Set the default source code type in this document to C++, +// for syntax highlighting purposes. This is needed because +// docbook uses c++ and html5 uses cpp. +:language: {basebackend@docbook:c++:cpp} + + +== Notice + +[%hardbreaks] +Copyright (C) 2024-2024 Intel Corporation. All rights reserved. + +Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks +of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by +permission by Khronos. + + +== Contact + +To report problems with this extension, please open a new issue at: + +https://github.com/intel/llvm/issues + + +== Dependencies + +This extension is written against the SYCL 2020 revision 8 specification. All +references below to the "core SYCL specification" or to section numbers in the +SYCL specification refer to that revision. + +This extension also depends on the following other SYCL extensions: + +* link:../experimental/sycl_ext_oneapi_kernel_properties.asciidoc[ + sycl_ext_oneapi_kernel_properties] +* link:../experimental/sycl_ext_oneapi_properties.asciidoc[ + sycl_ext_oneapi_properties] +* link:../experimental/sycl_ext_oneapi_named_sub_group_sizes.asciidoc[ + sycl_ext_oneapi_named_sub_group_sizes] + +== Status + +This is a proposed extension specification, intended to gather community +feedback. Interfaces defined in this specification may not be implemented yet +or may be in a preliminary state. The specification itself may also change in +incompatible ways before it is finalized. *Shipping software products should +not rely on APIs defined in this specification.* + +== Backend support status + +The APIs in this extension may be used only on a device that has +`aspect::ext_oneapi_virtual_functions`. The application must check that the +device has this aspect before submitting a kernel using any of the APIs in this +extension. If the application fails to do this, the implementation throws +a synchronous exception with the `errc::kernel_not_supported` error code +when the kernel is submitted to the queue. + +== Overview + +The main purpose of this extension is to reduce amount of SYCL language +restrictions for device code by allowing to call virtual member functions +from device functions. + +NOTE: this extension **does not** cover (i.e. doesn't enable) things like +`dynamic_cast`, `typeid` or calls through function pointers. + +== Specification + +=== Feature test macro + +This extension provides a feature-test macro as described in the core SYCL +specification. An implementation supporting this extension must predefine the +macro `SYCL_EXT_ONEAPI_VIRTUAL_FUNCTIONS` to one of the values defined in the +table below. Applications can test for the existence of this macro to determine +if the implementation supports this feature, or applications can test the +macro's value to determine which of the extension's features the implementation +supports. + +[%header,cols="1,5"] +|=== +|Value +|Description + +|1 +|The APIs of this experimental extension are not versioned, so the + feature-test macro always has this value. +|=== + +=== New language restrictions for device functions + +The following restriction, listed in section 5.4 of the core SYCL specification +does not apply to kernels submitted with the `assume_indirect_calls_to` and +`assume_indirect_calls` properties: + +> The odr-use of polymorphic classes and classes with virtual inheritance is +> allowed. *However, no virtual member functions are allowed to be called in a +> device function.* + +However, there are still some limitations of how virtual member functions can +be used: + +- if an object is constructed in host code, calling a virtual member function + for that object in device code has undefined behavior; +- if an object is constructed in device code on a device `A`, calling a virtual + member function for that object in host code, or on another device `B` has + undefined behavior; + +=== New properties + +Under the hood virtual functions are essentially function pointers which are +stored in a global variable and managed by compiler-generated code. Therefore, +each call to a virtual member function is an indirect call and compiler may not +be able to understand which exact virtual function is being called (i.e. which +class it belongs to). + +Without any knowledge about which virtual function can be called from which +kernels compiler will have to make all virtual functions available to all +kernels. That may not be desirable because some of those virtual functions could +use features that are prohibited in device code. + +In order to help compiler to build a mapping between kernels and virtual +functions they may call, the extension introduces new compile-time-constant +properties. + +[source,dpcpp] +---- +namespace sycl::ext::oneapi::experimental { + + struct indirectly_callable_key { + template + using value_t = property_value; + }; + + struct calls_indirectly_key { + template + using value_t = property_value; + }; + + inline constexpr indirectly_callable_key::value_t indirectly_callable; + + template + inline constexpr indirectly_callable_key::value_t + indirectly_callable_in; + + inline constexpr calls_indirectly_key::value_t assume_indirect_calls; + + template + inline constexpr calls_indirectly_key::value_t + assume_indirect_calls_to; + + template <> + struct is_property_key : std::true_type {}; + template <> struct is_property_key : std::true_type {}; +} +---- + +Before describing those properties in more detail, a couple of new terms are +introduced to simplify the extension specification: + +Set of virtual member functions:: a group of virtual member functions which are +defined with the `indirectly_callable` property and with the same value of the +property parameter `SetId`. For simplicity, this will also be further referred +to as a _set_, or as a _set of virtual functions_. + +Kernel declares a use of a set of virtual member functions:: a kernel is +considered to be declaring a use of a set of virtual member functions `SetIdA` +when it is submitted with `calls_indirectly` property with `SetIdA` included +into the property parameter `SetIds`. If `SetIdA` is not included into the +property parameter `SetIds`, or if a kernel is submitted without the property, +then it is *not* considered to be declaring a use of the set of virtual member +functions. + +|=== +|Property|Description +|`indirectly_callable` +|This is an alias to `indirectly_callable_in`, please read the description +of the `indirectly_callable_in` property for full documentation. + +This property is expected to be used in situations where application is not that +huge and/or complex and therefore doesn't care about having more than one set +of virtual functions. + +Going forward, the document will only reference the `indirectly_callable_in` +property, but whatever is said about it also applies to the +`indirectly_callable` property because it is a simple alias. +|`indirectly_callable_in` +|The `indirectly_callable_in` property indicates that a virtual member function +is a device function, thus making it available to be called from SYCL kernel and +device functions. Should only be applied to virtual member functions and to do +so, function-style `SYCL_EXT_ONEAPI_FUNCTION_PROPERTY` macro should be used. + +NOTE: This property affect a particular function and does not impact any of its +overrides in derived classes. If the whole hierarchy of overrides is expected +to be callable from a device, then each and every override should be marked with +the property. + +Parameter `SetId` specifies a set of virtual member functions this function +belongs to and at the same time it defines a group of kernels, which can call +this function, it must be a C++ typename. + +Calling a virtual member function from a kernel which does not declare use of a +set the virtual member function belongs to is an undefined behavior. + +The property must appear on the first declaration of the function in the +translation unit. Redeclarations of the function may optionally be decorated +with the same property if the property argument is the same. The effect is the +same regardless of whether redeclarations are so decorated. + +If a function is decorated with one of these properties in one translation unit, +any other translation unit that declares the same function must also decorate +the function with the same property (with the same argument). Otherwise the +program is considered ill-formed, but no diagnostic is required. + +The programs that decorate the same function with multiple instances of the +property with different argument are ill formed. +|`assume_indirect_calls` +|This is an alias to `assume_indirect_calls_to`, please read the +description of the `assume_indirect_calls_to` property for full documentation. + +This property is expected to be used in situations where application is not that +huge and/or complex and therefore doesn't care about having more than one set +of virtual functions. + +Going forward, the document will only reference the `assume_indirect_calls_to` +property, but whatever is said about it also applies to the +`assume_indirect_calls` property because it is a simple alias. +|`assume_indirect_calls_to` +|The `assume_indirect_calls_to` property indicates that a SYCL kernel function +may perform calls through virtual member functions and declares use of one or +more sets of virtual member functions. + +Parameter `SetIds` specifies which sets of virtual member functions are +declared to be used by a kernel, it must be zero or more C\++ typenames. + +Calling a virtual member function, which does not belong to any of sets of +virtual member functions declared to be used is an undefined behavior. + +This property should be attached to a kernel if it contains a virtual member +function call in its call graph, even if the said function is never actually +called. If a kernel submitted without this property contains a virtual member +function call in its call graph, diagnostic should be emitted by an +implementation. +|=== + +If a kernel is submitted with the `assume_indirect_calls_to` property that +points to an empty set of virtual functions, a synchronous exception with the +`errc::invalid` error code should be thrown by an implementation. + +Applying the `indirectly_callable_in` property to a SYCL Kernel function is +illegal and an implementation should produce a diagnostic for that. + +Applying the `indirectly_callable_in` property to an arbitrary device function, +which is not a virtual member function has no effect. + +NOTE: This behavior may be changed in either future version of this extension or +in another extensions. + +Virtual member functions that are decorated with the `indirectly_callable_in` +property are considered to be device functions, i.e. they must obey the +restrictions listed in section 5.4 of the core SYCL specification "Language +restrictions for device functions". Virtual member functions that are not +decorated with this attribute do not need to obey these restrictions, even if +other definitions of that virtual member function in other classes in the +inheritance hierarchy are decorated with the attribute. + +[source,dpcpp] +---- +using syclext = sycl::ext::oneapi::experimental; + +struct set_A; +struct set_B; + +class Foo { +public: + // properties to functions should be applied using the macro: + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) void + foo() {} + + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) void + bar(); + + // first declaration must be annotated + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) void + baz(); +}; + +// redeclarations may be annotated as well +void SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable_in) +Foo::baz() {} + +// but it is not required +Foo::bar() {} + +int main() { + sycl::queue q; + // kernel calling virtual function should also be annotated: + q.single_task(syclext::properties{syclext::assume_indirect_calls_to}, + [=]() { + Foo *ptr = /* ... */; + ptr->bar() + + // Note: this kernel can only call 'Foo::foo' and 'Foo::bar' but not + // 'Foo::baz', because the latter is declared within a different set. + }); +} +---- + +The main reason for virtual functions to be split into different sets is use of +optional kernel features in those virtual functions. It is explained in more +details in the next section. However, for simplicity purposes both properties +have aliases which allow to omit the set, thus using the default set: + +[source,dpcpp] +---- +using syclext = sycl::ext::oneapi::experimental; + +struct set_A; + +class Foo { +public: + // This virtual member function belongs to the default set of virtual + // functions. + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + void foo() {} + + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) + void bar() {} +}; + +int main() { + sycl::queue q; + // This kernel declares a use of default set of virtual functions + q.single_task(syclext::properties{syclext::assume_indirect_calls}, [=]() { + Foo *ptr = /* ... */; + ptr->bar() + + // Note: this kernel can only call 'Foo::foo' but not 'Foo::bar', because + // the latter belongs to a different (non-default) set of virtual functions. + }); +} +---- + +NOTE: By definition of the `indirectly_callable` and `assume_indirect_calls` +properties above, the type `void` is used to denote the default set of +virtual functions. Applications may also explicitly use the type `void` to +denote this default set of virtual functions when using `indirectly_callable_in` +and `assume_indirect_calls_to` properties. + +=== Optional kernel features handling + +The core SYCL specification (5.8 Attributes for device code) says the following +in the description of `device_has` attribute for SYCL kernels and non-kernel +device functions. + +When the attribute is applied to a kernel: + +> \... it causes the compiler to issue a diagnostic if the kernel (or any of the +> functions it calls) uses an optional feature that is associated with an aspect +> that is not listed in the attribute. + +When the attribute is applied to a function: + +> \... it causes the compiler to issue a diagnostic if the device function (or +> any of the functions it calls) uses an optional feature that is associated +> with an aspect that is not listed in the attribute. + +Due to dynamic nature of virtual member functions, compiler in general case is +not able to perform static analysis of a call graph in order to understand which +exact virtual functions are called from which kernels. + +Instead, information from the new properties is used by an implementation to +issue such diagnostic. When determining a set of aspects which are used by a +SYCL kernel function, an implementation must take into account all aspects which +are used by all virtual member functions included into all sets of virtual +member functions declared to be used by a kernel. + +Therefore, if only default set of virtual functions is used by an application, +it means that every kernel which is submitted with the +`assume_indirect_calls_to` property is assumed to use _all_ virtual functions +marked with the `indirectly_callable_in` property. If some of those virtual +functions use optional kernel features and there are kernels which are supposed +to work on devices without support for those optional kernel features, then +virtual functions using them should be outlined into a separate set. + +[source,dpcpp] +---- +using syclext = sycl::ext::oneapi::experimental; + +struct set_fp64; +struct set_fp16; + +struct Foo { + // This function uses 'fp64' aspect + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) + void f64() { + double d = 3.14; + } + + // This function uses 'fp16' aspect + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) + void f16() { + sycl::half h = 2.71f; + } +}; + +sycl::queue q; + +q.single_task(syclext::properties{syclext::assume_indirect_calls_to}, + [=]() [[sycl::device_has(sycl::aspect::fp64)]] { + // Diagnostic is required for this kernel, because it is declared as only + // using 'fp64' aspect, but it also uses virtual member functions from + // "set_fp16", which includes 'Foo::f16' that uses 'fp16' aspect. +}); + +q.single_task(syclext::properties{syclext::assume_indirect_calls_to}, + [=]() [[sycl::device_has()]] { + // Diagnostic is required for this kernel, because it is declared as not + // using any optional features, but it also uses virtual member functions from + // "set_fp64", which includes 'Foo::f64' that uses 'fp64' aspect. +}); + +q.single_task(syclext::properties{syclext::assume_indirect_calls_to}, + [=]() [[sycl::device_has(sycl::aspect::fp64)]] { + // No diagnostic is required for this kernel, because list of declared aspects + // matches list of used aspects. That includes virtual member functions from + // "set_fp64", which includes 'Foo::f64' that uses 'fp64' aspect +}); +---- + +Submitting a kernel with `assume_indirect_calls_to` property, which includes +virtual member functions that use optional kernel features to a device that +doesn't support them, should result in an exception at runtime, similar to how +it is defined by the core SYCL specification. + +[source,dpcpp] +---- +using syclext = sycl::ext::oneapi::experimental; + +struct set_A; +struct set_B; + +struct Foo { + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) + void foo() { + double d = 3.14; + } + + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) + void bar() {} +}; + +int main() { + sycl::queue q(/* device selector returns a device *without* fp64 support */); + assert(!q.get_device().has(sycl::aspect::fp64)); + + q.single_task(syclext::properties{syclext::assume_indirect_calls_to}, + [=]() { + // Exception is expected to be thrown, because target device doesn't support + // fp64 aspect and it is used by 'Foo::foo' which is included into 'set_A' + }); + + q.single_task(syclext::properties{syclext::assume_indirect_calls_to}, + [=]() { + // No exceptions are expected, because 'set_B' doesn't bring any + // requirements for optional kernel features. + }); +} +---- + +An implementation may not raise a compile time diagnostic or a run time +exception merely due to speculative compilation of a virtual member function for +a device when the application does not specify a use of virtual member functions +through the corresponding properties. + +[source,dpcpp] +---- +using syclext = sycl::ext::oneapi::experimental; + +struct Foo { + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + void foo() { + double d = 3.14; + } + + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + void bar() {} +}; + +int main() { + sycl::queue q(/* device selector choosing a device *without* fp64 support */); + assert(!q.get_device().has(sycl::aspect::fp64)); + + auto *Storage = sycl::malloc_device(1, q); + + q.single_task([=]() { + // The kernel is not submitted with 'calls_indirectly' property and + // therefore it is not considered to be using any of virtual member + // functions of 'Foo'. This means that the object of 'Foo' can be + // successfully created by this kernel, regardless of whether a target + // device supports 'fp64' aspect which is used by 'Foo::foo'. No exceptions + // are expected to be thrown. + new (Storage) Foo; + }); +} +---- + +==== Interaction with `reqd_sub_group_size` attribute + +The `reqd_sub_group_size` attribute is a bit of a special case comparing to +other optional kernel features, because it requires to compile a kernel in a +certain way, which may require special handling for all functions which are +called from it. + +When the same function is called from two or more kernels with different +`reqd_sub_group_size` attribute, it may be required for the implementation to +duplicate that function to create different versions of it tailored to different +sub-group sizes. It can be done in a straightforward manner when operating on a +static call graph. + +Virtual member functions are essentially called indirectly and pointers to them +are initialized just once when an object of a polymorphic class is being +created. Therefore, to support calling such virtual member function from two or +more kernels with different `reqd_sub_group_size`, each kernel may need to +receive a different pointer to a different version of a virtual member function. + +To avoid possibly posing such multi-versioning requirements on implementations, +virtual member functions can only be called from kernels with _primary_ +sub-group-size as defined by +link:../proposed/sycl_ext_oneapi_named_sub_group_sizes.asciidoc[ +sycl_ext_oneapi_named_sub_group_sizes] extension, or otherwise behavior is +undefined. + +NOTE: for implementations that don't support +`sycl_ext_oneapi_named_sub_group_sizes` extension, virtual member functions can +only be called from kernels which *don't* have `reqd_sub_group_size` attribute +set on them explicitly, or otherwise behavior is undefined. + +=== Kernel bundles and device images + +When an object of a polymorphic class is constructed, it stores a pointer to +virtual table, which points to its virtual member functions. Addresses of those +functions are accessible and valid only within a kernel bundle containing a +kernel which used to construct an object. + +Performing calls to virtual member functions of an object constructed in a +kernel from a different kernel bundle is an undefined behavior. + +[source,dpcpp] +---- +using syclext = sycl::ext::oneapi::experimental; + +struct Base { + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + void foo() {} +}; + +class Constructor; +class Use; + +int main() { + sycl::queue Q; + + Base *Obj = sycl::malloc_device(1, Q); + int *Result = sycl::malloc_shared(2, Q); + + auto bundleA + = sycl::get_kernel_bundle(Q.get_context(), + {sycl::get_kernel_id()}); + auto bundleB + = sycl::get_kernel_bundle(Q.get_context(), + {sycl::get_kernel_id()}); + + + Q.submit([&](sycl::handler &CGH) { + CGH.use_kernel_bundle(bundleA); + CGH.single_task([=]() { + // Only placement new can be used within device functions. + new (Obj) Base; + }); + }); + + Q.submit([&](sycl::handler &CGH) { + CGH.use_kernel_bundle(bundleB); + CGH.single_task(syclext::properties{syclext::assume_indirect_calls}, + [=]() { + // Call to 'Base::foo' is an undefined behavior here, because 'Obj' was + // constructed within kernel bundle `bundleA` + Obj->foo(); + }); + }); + + return 0; +} +---- + +If no explicit kernel bundle operations are performed by a program, it is +responsibility of a SYCL implementation to ensure that all kernels that use +virtual functions from the same set are implicitly put together into the same +kernel bundle to ensure that everything works correctly. + +Note, however, that there are APIs which may require SYCL implementation to +re-compile a kernel bundle. For example, if a specialization constant value is +changed, SYCL implementation may need to re-compile a kernel bundle to embed +new value of a specialization constant into a device program. Such +re-compilation will invalidate all addresses of virtual functions which may +have been previously recorded in a constructed object making behavior of +virtual function calls through that object undefined. + +Correct manipulation with specialization constants in kernels that also use +virtual functions requires advanced knowledge of implementation details and +therefore it is not recommended to use specialization constants together with +virtual functions. + +== Example usage + +[source,dpcpp] +---- +#include + +using syclext = sycl::ext::oneapi::experimental; + +class Base { +public: + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + int get_random_number() { + return 4; // Chosen by fair dice roll. Guaranteed to be random + } + + // Not considered to be a device function, can use full set of C++ features + virtual int get_host_random_number() { + throw std::runtime_error("Not Implemented"); + } +}; + +class Derived : public Base { +public: + SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + int get_random_number() override { + return 221; + } +}; + +int main() { + sycl::queue Q; + + Base *Obj = sycl::malloc_device(1, Q); + int *Result = sycl::malloc_shared(1, Q); + + Q.single_task([=]() { + // Only placement new can be used within device functions. + new (Obj) Derived; + }); + + auto props = syclext::properties{syclext::assume_indirect_calls}; + Q.single_task(props, [=]() { + Base B; + Result[0] = B.get_random_number(); + }).wait(); + assert(Result[0] == 4); + + Q.single_task(props, [=]() { + Result[0] = Obj->get_random_number(); + }).wait(); + assert(Result[0] == 221); + + return 0; +} +---- + +== Issues + +=== Handling of `reqd_sub_group_size` attribute + +The extension allows virtual calls to be performed only from kernels with +_primary_ sub-group size, which is quite limiting and doesn't allow you to rely +on a particular sub-group size you want within a virtual function. + +This is more of an implementation limitation, rather than a language problem, +because at both SPIR-V and SYCL levels we don't have a mechanism of assigning +`reqd_sub_group_size` attribute to on-kernel SYCL functions and considering +indirect nature of virtual functions, compiler may not be able to figure out +which kernels use which exact virtual functions. + +By implementing some extra interfaces at SPIR-V and SYCL level we should be able +to improve the situation and lift some of the limitations around +`reqd_sub_group_size` attribute use together with virtual functions, but this +won't be a part of the initial language specification and implementation. + +=== Interaction with specialization constants + +Implementation of specialization constants may involve re-compilation and +therefore can easily break virtual functions functionality. Current extension +spec wording is to _discourage_ use of specialization constants together with +virtual functions, but not to completely prohibit. Should we be more clear here +maybe with the wording and make it stricter or more precise/formal? diff --git a/sycl/doc/index.rst b/sycl/doc/index.rst index 8de72c40d53e7..9fe3ef693a494 100644 --- a/sycl/doc/index.rst +++ b/sycl/doc/index.rst @@ -53,6 +53,7 @@ Design Documents for the oneAPI DPC++ Compiler design/DeviceConfigFile design/PropagateCompilerFlagsToRuntime design/SYCLNativeCPU + design/VirtualFunctions design/CommandGraph design/OffloadDesign design/PrivateAlloca diff --git a/sycl/test-e2e/VirtualFunctions/test-plan.asciidoc b/sycl/test-e2e/VirtualFunctions/test-plan.asciidoc new file mode 100644 index 0000000000000..c4a05d53cfad8 --- /dev/null +++ b/sycl/test-e2e/VirtualFunctions/test-plan.asciidoc @@ -0,0 +1,302 @@ +:sectnums: + += Test plan for virtual functions support in SYCL + +This is a test plan for virtual functions functionality described by the +`ext_sycl_oneapi_virtual_functions` extension. + +NOTE: This test plan does not cover unit tests, or negative tests related to +compiler diagnostics, it is focused on end-to-end examples to make sure that the +new functionality works as expected in different scenarios. + +== Testing scope + +=== Device coverage + +All of the tests described below are performed on a single device, which could +be any device: the feature is guarded by an aspect, so tests are expected to +exit early if a device doesn't support virtual functions. + +=== Data types coverage + +There is no need to repeat each and every test using different data types, +because it won't bring any significant improvements to tests quality. However, +for some test cases data types used in them matter more. Their description would +contain explicit requirements about data types which should be covered. + +=== Code paths coverage + +Test cases below often describe an example where behavior of a virtual member +function in a base class is overridden by derived classes. In those scenarios +test case should be repeated several times, each time taking a code path to a +different _actual_ class used under the hood. + +For example, for the following scenario: +[source,c++] +---- +class Base { +public: + virtual void foo() { /* ... */ } +}; +class Derived1 : public Base { +public: + void foo() override { /* ... */ } +}; +class Derived2 : public Base { +public: + void foo() override { /* ... */ } +}; +---- + +Test cases should be repeated to invoke both +Derived1::foo+ and ++Derived2::foo+. + +== Tests + +NOTE: Compiler will attempt to de-virtualize the program as much as possible. +Therefore, it is important that it is not statically known which exact method +of which exact class is being called in all test cases. + +=== The simplest case: create and call + +Key feature of this group of test cases is that an object of a polymorphic class +is created and used (virtual member functions of it are called) within the same +kernel. + +Tests in this category should only use default set of virtual functions. + +NOTE: Tests in this category are specifically simplified to use limited set of +available functionality in each case. The intent here is to have a sub-suite of +basic acceptance tests, which are closer to unit tests in context of being +focused on a single aspect of feature, but still being E2E tests. + +==== Virtual functions with no access to object data + +For each test in this sub-category, classes with virtual functions should not +have any data members. Virtual functions should simply return some values, +possibly based on input arguments. + +===== Simple hierarchy + +Test checks that a very basic usage model of virtual functions works. + +There is a base class with a virtual member function, which is being overridden +in several derived classes. + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of either one of derived classes, or of a base class. Address of that +object is stored in a variable of type "pointer to a base class" and used within +the same kernel to call a virtual member function. Result is stored in a buffer +and verified on host. + +===== More complex hierarchy + +Test checks that derived classes can be derived further and that pointers to +objects of polymorphic classes can be passed to functions and that virtual +functions continue to work correctly. + +There is a base class with a virtual member function, which is being overridden +in a derived class (further referred as "level 1 class"). That sub-class defines +another virtual member function, which uses first virtual member function from +the base class. That second virtual member function is overridden in several +more sub-derived classes (further referred as "level 2 classes"). + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of one of level 2 classes and passes it to a function accepting a +pointer to a level 1 class object. The function calls the second virtual member +function, result is stored to a buffer and verified on host. + +===== Missing overrides + +Test checks that the right functions are being called, trying different +combinations of which classes in hierarchy override virtual member function +from a base class. + +There is a base class with a few virtual member functions. There is a set of +derived classes which may themselves be parents to some other classes, building +an hierarchy of 5-7 different classes. Not all virtual member functions from +base class are overridden in every sub-class. Example of cases which are +expected to be tested: + +- `Base` defines `foo`; `Derived1` inherits `Base`; `Derived2` inherits + `Derived1`, overrides `foo` +- `Base` defines `bar`; `Derived3` inherits `Base`, overrides `bar`; + `Derived4` inherits `Derived3`; `Derived5` inherits `Derived4`, overrides + `bar` + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of one of classes in the hierarchy and passes it to a function +accepting a pointer to a base object. The function calls virtual member +functions, results are stored in a buffer and verified on host. + +==== Virtual functions with access to class/object data + +Tests in this sub-category intended to check access to both static and +non-static class data members. + +===== Static data members access in a simple hierarchy + +Test checks that a static data member can be accessed through virtual member +functions from different overrides in a classes hierarchy. + +There is a base class with static data members and a virtual member function, +which is being overridden in few derived classes. All overrides of the virtual +member function access static data members of the base class. + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of either one of derived classes, or of a base class. Address of that +object is stored in a variable of type "pointer to a base class" and used within +the same kernel to call a virtual member function. Result is stored in a buffer +and verified on host. + +===== Static data members access in a more complex hierarchy + +Test checks that a static data member can be accessed through virtual member +functions from different overrides in a classes hierarchy. + +There is a base class with static data members and a virtual member function, +which is being overridden in few derived classes. Some of those classes have +extra static data members and in turn may have derived classes as well. All +overrides of the virtual member function access static data member of their base +classes. + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of one of derived classes and passes it to a function accepting a +pointer to a base class object. The function the virtual member function, result +is stored to a buffer and verified on host. + +===== Non-static data members access to read data in a simple hierarchy + +Test checks that virtual member functions can access non-static data members +of the current and base classes to read their values. + +There is a base class with non-static data members and a virtual member +function, which is being overridden in a few derived classes. All overrides of +the virtual member function access non-static data members described in the +base class to only read their values. + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of either one of derived classes, or of a base class. Address of that +object is stored in a variable of type "pointer to a base class" and used within +the same kernel to call a virtual member function. Result is stored in a buffer +and verified on host. + +NOTE: This test case can have a variation where virtual member functions are +additionally marked as `const`. + +===== Non-static data members access to read data in a more complex hierarchy + +Test checks that non-static data members can be accessed through virtual member +functions from different overrides in a classes hierarchy. + +There is a base class with non-static data members and a virtual member +function, which is being overridden in few derived classes. Some of those +classes have extra non-static data members and in turn may have derived classes +as well. All overrides of the virtual member function access non-static data +members of their base classes. + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of one of derived classes and passes it to a function accepting a +pointer to a base class object. The function the virtual member function, result +is stored to a buffer and verified on host. + +===== Handling of non-device virtual functions + +The test checks that presence of virtual member functions which were not marked +to be callable from device code can be handled correctly. + +In a simple hierarchy of classes some of virtual functions and their overrides +should be marked as callable from device, but other should not. Those virtual +functions should perform access to non-static data members. + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of one of derived classes and passes it to a function accepting a +pointer to a base class object. The function the virtual member function, result +is stored to a buffer and verified on host. In device code we only check virtual +functions which were marked as callable on device. Host part of the program +also does calls to host-only virtual functions to verify their correctness. + +=== Passing objects of polymorphic classes between kernels + +Contrary to the previous section, an object of a polymorphic class is +constructed in one kernel, but used in another, which is closer to a real +examples where initialization is a separate phase of an application. + +This category also makes use of non-default sets of virtual functions, i.e. it +tests template arguments that you can pass into the new compile-time properties. + +Both USM and SYCL buffers should be used by tests as a mean of storing data and +transferring it between kernels. + +==== Single construct, single use + +Test submits two kernels: one constructs an object of a polymorphic class and +another performs virtual function calls using that object. The test should +check both default and non-default sets of virtual functions, as well as +access to object's data members. + +==== Single construct, multiple use + +In this test, different virtual functions should be put into different sets, but +there should still be a single kernel that constructs an object of a polymorphic +class. + +Then there should be a few kernels that each perform a virtual function call of +a method from a different set on that single object. + +==== Multiple construct, single use + +In this test, there should be several kernels each constructing an object of a +different derived class. It should be followed by a single kernel that calls +virtual functions using all those constructed objects. + +==== Multiple construct, multiple use + +In this test, there should be several kernels each constructing an object of a +different derived class. Those objects should have several virtual functions +each in a different set. The "construct" kernel should be followed by a few +"use" kernels each performing a virtual call of a different virtual method of +that created object. + +=== Separate translation units + +Test cases in this section aimed to cover different scenarios where definitions +of virtual functions, kernels that construct objects and kernels which perform +virtual calls are all distributed among several translation units in different +combinations. + +Test cases in this section could be a copy of test cases from the sections above +with only difference that they are split into several source files. + +==== Virtual functions defined in a separate translation unit + +For this test case, definition of virtual functions which are called from device +should be outlined into a separate translation unit, but kernels which construct +objects and perform virtual calls should all be in the same translation unit. + +==== Virtual functions defined in several translation unit + +This is the same test case as one above, except that every virtual function +definition should be placed in its individual translation unit. + +==== Kernels that use virtual functions are defined in different translation units + +For this test cases, both virtual functions and kernels that use them (including +kernels that construct objects) should be outlined into separate translation +units, i.e. there should be at least 3 translation units: + +- virtual functions definitions +- kernels that construct objects +- kernels that perform virtual calls + +=== Optional kernel features + +TBD. + +=== Misc TODOs + +Test where each work-item in a sub-group calls a different virtual function +Test that experimental::printf works within virtual functions +Test that work-group built-ins work within virtual functions. Barriers? +Test that class can have non-device virtual functions +