Releases: tugrul512bit/Cekirdekler

v1.2.3

12 May 00:46

Multiple kernel names in a "device to device pipeline stage" can now be grouped with the "@" separator instead of the " ", "," or ";" separators. Grouped kernels read their inputs from the host only once, before the first kernel, and write their outputs to the host only once, after the last kernel. Without "@", every kernel reads its inputs and writes its outputs separately, which makes a multi-kernel stage slower.

"@"-separated kernels run as a single kernel, so they all share a single global range and local range value.

"a@b@c"  global range N,   local range 256:      1 read,  1 write

"a b@c"  global ranges N,M, local ranges 256,128: 2 reads, 2 writes
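
The read and write counts above follow from how the kernel string is grouped. The sketch below is a hypothetical model of that counting rule only. `StageTransferModel` and its methods are names invented for illustration and are not part of Cekirdekler, and the library's own parsing may differ.

```csharp
using System;
using System.Linq;

// Hypothetical model (not Cekirdekler code): counts host transfers
// implied by a stage's kernel name string.
public static class StageTransferModel
{
    // Split on the ordinary separators (" ", ",", ";").
    // Each resulting group may hold several kernel names joined by "@".
    public static string[] Groups(string kernelNames) =>
        kernelNames.Split(new[] { ' ', ',', ';' },
                          StringSplitOptions.RemoveEmptyEntries);

    // Assumption taken from the release note: every group reads from the
    // host once and writes to the host once, so reads == writes == groups.
    public static int HostReads(string kernelNames) =>
        Groups(kernelNames).Length;

    public static int HostWrites(string kernelNames) =>
        Groups(kernelNames).Length;

    // Total number of kernels named in the string, across all groups.
    public static int KernelCount(string kernelNames) =>
        Groups(kernelNames).Sum(g => g.Split('@').Length);
}
```

Under this model, "a@b@c" and "a b@c" both name three kernels, but the first forms one group (one read, one write) and the second forms two groups (two reads, two writes), which is why each group also needs its own global and local range entry.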

v1.2.2

11 May 13:07

Device to device pipeline stages can now be initialized in the buildPipeline() method, using the kernel name and ranges registered through the following method:

stage.initializerKernel("kernelName", new int[] { N }, new int[] { 256 });

v1.2.1

10 May 18:42

Decreased command queue consumption per "device to device pipeline stage" to leave room for more stages.

v1.2.0_hotfix

09 May 21:54

Fixed: the "result ready" boolean value returned by pipeline.pushData() became true at the wrong iteration.

v1.2.0

09 May 19:21

Compatible only with CekirdeklerCPP v1.2.0+

  • Removed the "Copy Memory" dependency.

  • Added the device to device pipelining feature.

  • Added the helper methods normalizedGlobalRangesOfDevices(int id) and normalizedComputePowersOfDevices() to the number cruncher class so that client code can query them.

v1.1.9

16 Apr 20:42

An array of user-defined structs (for example, Unity's Vector3) can now be wrapped by a ClArray of type byte through its static method wrapArrayOfStructs.
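
To show why a byte-typed array can stand in for an array of structs, here is a minimal sketch of the underlying idea: a struct made only of sequential primitive fields has a fixed byte layout, so an array of them can be viewed as raw bytes. `Vec3` is a hypothetical stand-in for a type such as Unity's Vector3, and `StructBytesSketch.ToBytes` is an illustrative helper that copies the memory. It does not call wrapArrayOfStructs and is not a description of how the library wraps the array internally.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for a Vector3-like type: three sequential floats.
[StructLayout(LayoutKind.Sequential)]
public struct Vec3
{
    public float X, Y, Z;
    public Vec3(float x, float y, float z) { X = x; Y = y; Z = z; }
}

public static class StructBytesSketch
{
    // Copies the raw memory of a blittable struct array into a byte array.
    public static byte[] ToBytes(Vec3[] items)
    {
        int size = Marshal.SizeOf(typeof(Vec3)) * items.Length;
        byte[] bytes = new byte[size];

        // Pin the array so the garbage collector cannot move it while copying.
        GCHandle handle = GCHandle.Alloc(items, GCHandleType.Pinned);
        try
        {
            Marshal.Copy(handle.AddrOfPinnedObject(), bytes, 0, size);
        }
        finally
        {
            handle.Free();
        }
        return bytes;
    }
}
```

With this layout each element occupies 12 bytes, so a kernel that receives the byte buffer can address element i at byte offset 12 * i, provided the struct definition on the kernel side uses the same field order and sizes.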

v1.1.8

15 Apr 11:17

Removed unnecessary array bounds checking for read-only buffers. They do not need a minimum length because they are copied as a whole in a single transfer.

Added minor documentation.

Added the base of a built-in auxiliary kernel function example (not working yet).

v1.1.7

13 Apr 17:38
  • Added n-body benchmark based device reordering, to enable picking the device with the fastest n-body performance:

         Cekirdekler.Hardware.ClPlatforms platforms = Cekirdekler.Hardware.ClPlatforms.all();
         var selectedDevices = platforms.platformsAmd().devicesAmd().devicesWithHighestDirectNbodyPerformance();
    
  • Fixed a minor bug in the load balancer that was triggered when the number of devices changed.

  • Added minor documentation.

v1.1.6_hotfix

12 Apr 22:05

Removed the console-resizing code from the platform / device info output methods. It did not work in non-console apps.

v1.1.6

11 Apr 14:29

English translation of the cluster computing related classes is complete.

Fixed some class file names.