C++11 header library for CPU/GPU parallel computing uses native compiler and OpenCL.
Example:
git clone --recursive https://github.com/fcf-framework/fcfParallel.git
The simple example of overlaying blur in a BMP file. The example is available in the repository https://github.com/fcf-framework/fcfParallelExamples.git
impl.cpp file
Since the library is implemented in the form of header files, first you need to announce its implementation.
To do this, you need to connect the header file fcfParallel/parallel.hpp
with the declared macro FCF_PARALLEL_IMPLEMENTATION
(only for fcfParallel library) or FCF_IMPLEMENTATION
(for all libraries).
It is advisable to do this in a separate file so as not to reassemble each time.
#define FCF_IMPLEMENTATION
#include <fcfImage/image.hpp>
#include <fcfParallel/parallel.hpp>
main.cpp file
We now proceed to the main programm.
Declaring a handler function
The parallel computation subroutine is declared the macro FCF_PARALLEL_UNIT. The first parameter is the name of the unit, the second is an options (JS/JSON object) and the third will be the code for the implementation of the action.
The subroutine should contain a main function called FCF_PARALLEL_MAIN. The first argument of this function should be a pointer to the FCFParallelTask
structure, which contains progress information. The rest of the argument is set by the developer.
When transferring arguments by pointer from the main program to FCF_PARALLEL_MAIN
, it is necessary to use the macro FCF_PARALLEL_GLOBAL
when declaring them, which is analogous to the specificator __global__
of the OpenCL compiler.
Launching parallel computing
To run parallel calculations, the fcf:::Parallel::Executor
object is used.
To perform the action, call the operator() method.
The first argument is a reference to the object fcf:::Parallel::Call
, which contains action parameters. The remaining arguments correspond to the 1-N arguments of the FCF_PARALLEL_MAIN
function.
Transfer of arguments to operator() method:
-
The transfer of the argument by value is done simply by transferring the value.
-
If the argument of the function
FCF_PARALLEL_MAIN
is a pointer and does not require a calculation result, then you need to use the functionfcf:::Parallel:::refArg
to transfer data. The source data can be either an pointer or an objectstd::vector
. -
If the argument is a pointer in the memory of which the result of the calculation is recorded, then the function
fcf::::Parallel:::refArg
should contain the following parameters when transmitting the argument:fcf::Parallel::refArg( outputRGB, // Indicates that after performing calculations, // you need to unload the result fcf::Parallel::ArgUpload(true), // When enabling upload, you must specify // the split parameter. // This parameter indicates that the data during unloading is divided // between devices, and their size corresponds to the number of iterations. fcf::Parallel::ArgSplit(fcf::Parallel::PS_FULL), // Number of elements per iteration fcf::Parallel::ArgSplitSize(3) )
The example of the program itself is below.
#include <iostream>
#include <fcfImage/image.hpp>
#include <fcfParallel/parallel.hpp>
//
// Pixel processing function performed on CPU/GPU
//
FCF_PARALLEL_UNIT(
blur,
{},
void FCF_PARALLEL_MAIN(const FCFParallelTask* a_task,
int a_blur,
int a_width,
int a_height,
FCF_PARALLEL_GLOBAL const char* a_source,
FCF_PARALLEL_GLOBAL char* a_result) {
int offset = a_task->lowIndex * 3;
int y = a_task->lowIndex / a_width;
int x = a_task->lowIndex % a_width;
int begby = max(y - a_blur, 0);
int endby = min(y + a_blur + 1, a_height);
int begbx = max(x - a_blur, 0);
int endbx = min(x + a_blur + 1, a_width);
int c = (endby - begby) * (endbx - begbx);
for(int channel = 0; channel < 3; ++channel) {
int value = 0;
for(int by = begby; by < endby; ++by) {
for(int bx = begbx; bx < endbx; ++bx) {
int bRawIndex = (by * a_width + bx) * 3;
value += (int)(unsigned char)a_source[bRawIndex + channel];
}
}
a_result[offset + channel] = (char)(value / c);
}
}
)
void printHelp(){
std::cout << "An example application illustrating the use of fcfParallel" << std::endl;
std::cout << " Application launch format: parallel-example-001-blur SOURCE_BMP_FILE OUTPUT_BMP_FILE" << std::endl;
std::cout << " Options:" << std::endl;
std::cout << " SOURCE_BMP_FILE - Source BMP file." << std::endl;
std::cout << " OUTPUT_BMP_FILE - Resulting BMP file with the blur effect applied." << std::endl;
}
int main(int a_argc, char* a_argv[]){
//
// Processing command line arguments
std::string sourceFilePath;
std::string outputFilePath;
for(int i = 1; i < a_argc; ++i) {
if (!std::strcmp(a_argv[i], "-h") || !std::strcmp(a_argv[i], "--help")) {
printHelp();
return 0;
} else if (sourceFilePath.empty()) {
sourceFilePath = a_argv[i];
} else if (outputFilePath.empty()) {
outputFilePath = a_argv[i];
}
}
if (sourceFilePath.empty() || outputFilePath.empty()) {
std::cout << "Incorrent command line parameters. Use --help option for got help." << std::endl;
return 1;
}
//
// Loading BMP image from file
std::vector<char> sourceRGB;
size_t sourceRGBWidth;
size_t sourceRGBHeight;
try {
fcf::Image::loadRGBFromBmpFile(sourceFilePath, sourceRGB, sourceRGBWidth, sourceRGBHeight);
} catch(std::exception& e){
std::cerr << "Invalid load BMP file: " << e.what() << std::endl;
return 1;
}
// Result image
std::vector<char> outputRGB(sourceRGB.size());
//The object that will contain debugging information
fcf::Union state;
//
// Performing parallel calculations
try {
// Object initialization
fcf::Parallel::Executor executor;
executor.initialize();
fcf::Parallel::Call call;
// Unit name to execute declared macro FCF_PARALLEL_UNIT
call.name = "blur";
// Number of iterations
call.size = sourceRGBWidth * sourceRGBHeight;
// Specify the object in which you will need to record debugging information
call.state = &state;
// Running parallel computing
executor(call,
(unsigned int)5,
(unsigned int)sourceRGBWidth,
(unsigned int)sourceRGBHeight,
fcf::Parallel::refArg(sourceRGB),
fcf::Parallel::refArg(outputRGB,
fcf::Parallel::ArgSplit(fcf::Parallel::PS_FULL),
fcf::Parallel::ArgUpload(true),
fcf::Parallel::ArgSplitSize(3)
)
);
} catch(std::exception& e) {
std::cerr << "Error in performing parallel calculations: " << e.what() << std::endl;
return 1;
}
//
// Record the result in a BMP file
try {
fcf::Image::writeRGBToBmpFile(outputFilePath, outputRGB, sourceRGBWidth, sourceRGBHeight);
} catch(std::exception& e){
std::cerr << "Invalid write BMP file: " << e.what() << std::endl;
return 1;
}
//
// Displaying debugging information
std::cout << "Time spent on implementation: " << ((double)state["duration"]/(1000*1000*1000)) << " sec" << std::endl;
std::cout << "Actions performed on the following devices: " << std::endl;
for(fcf::Union& dev : state["devices"]) {
std::cout << " Engine: "<< dev["engine"] << "; Device: " << dev["device"] << std::endl;
}
return 0;
}
CMakeLists.txt file (Build)
The build parameters are presented on the basis of CMake.
In order to build this example, you will need to link and include OpenCL. (https://github.com/KhronosGroup/OpenCL-SDK).
find_package(OpenCL REQUIRED)
include_directories(${OpenCL_INCLUDE_DIR})
include_directories(${CMAKE_SOURCE_DIR}/libraries)
add_executable("blur" impl.cpp main.cpp)
target_link_libraries("blur" ${OpenCL_LIBRARY})