Add FastCV DSP Initialization, QcAllocator and FastCV DSP Extension APIs #3931

Open. Wants to merge 6 commits into base: 4.x

@quic-apreetam quic-apreetam commented Apr 29, 2025

Detailed Description

This PR introduces FastCV DSP Extension APIs within the 'cv::fastcv::dsp' namespace.
The following APIs have been added:

  1. fcvdspinit: Initializes the FastCV DSP environment.
  2. fcvdspdeinit: Deinitializes the FastCV DSP environment.
  3. sumOfAbsoluteDiffs: Computes the sum of absolute differences of an image against an 8x8 template.
  4. thresholdOtsu: Binarizes a grayscale image using Otsu's method.
  5. FFT: Computes the 1D or 2D Fast Fourier Transform of a real-valued matrix.
  6. IFFT: Computes the 1D or 2D Inverse Fast Fourier Transform of a complex-valued matrix.
  7. canny: Applies the Canny edge detector to an 8-bit grayscale image.
  8. filter2D: Applies a generic 2D filter to an image.

The QcAllocator has been added to manage memory allocations on Qualcomm chipsets. It ensures that matrices are allocated with the Qualcomm hardware memory allocator, which enables efficient DSP operations.
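For readers unfamiliar with item 3 in the list above, the following is a minimal scalar sketch of a sum of absolute differences (SAD) against an 8x8 template. It is not the FastCV DSP implementation: the names `sad8x8` and `sadMap`, the top-left window anchoring, and the output layout are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical scalar reference, NOT the FastCV DSP code.
// Returns the SAD between an 8x8 template and the 8x8 image window whose
// top-left corner is (x, y). `stride` is the image row pitch in bytes.
inline std::uint32_t sad8x8(const std::uint8_t* image, int stride,
                            int x, int y, const std::uint8_t* tmpl)
{
    std::uint32_t sum = 0;
    for (int r = 0; r < 8; ++r)
    {
        for (int c = 0; c < 8; ++c)
        {
            const int a = image[(y + r) * stride + (x + c)];
            const int b = tmpl[r * 8 + c];
            sum += static_cast<std::uint32_t>(std::abs(a - b));
        }
    }
    return sum;
}

// SAD map over every position where the 8x8 window fits inside the image.
// Assumed output layout: row-major, (width - 7) x (height - 7) entries.
inline std::vector<std::uint32_t> sadMap(const std::vector<std::uint8_t>& image,
                                         int width, int height,
                                         const std::uint8_t* tmpl)
{
    assert(width >= 8 && height >= 8);
    assert(image.size() == static_cast<std::size_t>(width) * height);
    const int outW = width - 7;
    const int outH = height - 7;
    std::vector<std::uint32_t> out(static_cast<std::size_t>(outW) * outH);
    for (int y = 0; y < outH; ++y)
        for (int x = 0; x < outW; ++x)
            out[static_cast<std::size_t>(y) * outW + x] =
                sad8x8(image.data(), width, x, y, tmpl);
    return out;
}
```

A window that matches the template exactly yields 0, and larger values indicate a worse match. The DSP version may differ in border handling and output type.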

Requires binary from opencv/opencv_3rdparty#95

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

* @param apertureSize The Sobel kernel size for calculating gradient. Supported sizes are 3, 5 and 7.
* @param L2gradient L2 Gradient or L1 Gradient
*/
CV_EXPORTS_W void canny(InputArray _src, OutputArray _dst, int lowThreshold, int highThreshold, int apertureSize = 3, bool L2gradient = false);
Contributor

Canny is a proper name. Please capitalize it.

Author

Sure, will update

if(data0)
u->flags |= cv::UMatData::USER_ALLOCATED;

u->userdata = new std::string("QCOM");
Contributor

This solution leads to a memory leak. Something like u->userdata = "QCOM"; should be enough.

Author

I will compare the Mat::allocator field with the DSP allocator address instead, so I will remove this.

if (u->userdata)
{
delete static_cast<std::string*>(u->userdata);
u->userdata = nullptr;
Contributor

String is not freed.

Author

I will compare the Mat::allocator field with the DSP allocator address instead, so I will remove this as it won't be needed.

Comment on lines +117 to +125
CV_Assert(CV_MAT_DEPTH(kernel.type()) == CV_8S);
parallel_for_(Range(0, src.rows), FcvFilter2DLoop_Invoker(src, dst, kernel), nStripes);
break;
}
case CV_32F:
{
CV_Assert(CV_MAT_DEPTH(kernel.type()) == CV_32F);
parallel_for_(Range(0, src.rows), FcvFilter2DLoop_Invoker(src, dst, kernel), nStripes);
break;
Contributor

The DSP is a shared resource. Are you sure that parallel_for_ makes sense here? Also, I would say that the parallel scheme should follow the number of DSP cores/units rather than the number of CPU cores.

We can remove the parallel_for_ for now; the DSP API already parallelizes internally.
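To make the reviewer's point about sizing work by DSP units concrete, here is a minimal sketch of splitting image rows into stripes. The function `splitRows` and the `units` parameter are hypothetical; nothing in this PR confirms how many DSP units are available or how the DSP API divides work internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Splits the row range [0, rows) into at most `units` contiguous stripes of
// near-equal size. `units` stands for a hypothetical count of DSP units,
// deliberately not the number of CPU cores.
inline std::vector<std::pair<int, int>> splitRows(int rows, int units)
{
    assert(rows >= 0 && units > 0);
    const int n = std::min(rows, units);  // never create empty stripes
    std::vector<std::pair<int, int>> stripes;
    stripes.reserve(static_cast<std::size_t>(n));
    int begin = 0;
    for (int i = 0; i < n; ++i)
    {
        // The first (rows % n) stripes take one extra row.
        const int len = rows / n + (i < rows % n ? 1 : 0);
        stripes.emplace_back(begin, begin + len);  // half-open [begin, end)
        begin += len;
    }
    return stripes;
}
```

With 10 rows and 3 units this yields the stripes [0, 4), [4, 7), and [7, 10). Whether any such splitting helps at all depends on the DSP runtime, which is why removing parallel_for_ is the simpler choice here.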

Comment on lines +81 to +84
#define IS_FASTCV_ALLOCATED(mat) \
((mat.u && mat.u->userdata && \
*static_cast<std::string*>(mat.u->userdata) == "QCOM") ? true : \
(CV_Error(cv::Error::StsBadArg, cv::format("Matrix '%s' not allocated with FastCV allocator. " \
Contributor

You can just compare the Mat::allocator field with the DSP allocator address. The user data is redundant.

Author

Yeah, thanks for the input.
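The suggestion in this thread, checking ownership by allocator address instead of a heap-allocated string tag, can be sketched as follows. The types `Allocator`, `DspAllocator`, and `Buffer` and the helpers `dspAllocator` and `isDspAllocated` are hypothetical stand-ins, not OpenCV or FastCV types; the real check would read the allocator pointer stored in the matrix.

```cpp
#include <cassert>

// Hypothetical stand-ins for an allocator interface and a buffer header.
// These are NOT OpenCV types; they only model the pointer comparison.
struct Allocator
{
    virtual ~Allocator() = default;
};

struct DspAllocator : Allocator
{
};

struct Buffer
{
    const Allocator* allocator = nullptr;  // who allocated this buffer
};

// One process-wide DSP allocator instance (function-local static).
inline DspAllocator& dspAllocator()
{
    static DspAllocator instance;
    return instance;
}

// Ownership test by pointer identity: there is no per-buffer tag to
// allocate, compare by value, or free, so nothing can leak.
inline bool isDspAllocated(const Buffer& b)
{
    return b.allocator == &dspAllocator();
}
```

Compared with tagging each buffer through `new std::string("QCOM")`, this removes both the allocation and the matching `delete`, and the check becomes a single pointer comparison.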

Comment on lines +38 to +39
// Check DSP initialization status and initialize if needed
FASTCV_CHECK_DSP_INIT();
Contributor

I would say that it should be at the very beginning of the function. If IS_FASTCV_ALLOCATED returned true, then the DSP is already initialized, right?

Author

If the user fails to allocate the buffer with QcAllocator, we do not need to check or initialize the DSP. Passing IS_FASTCV_ALLOCATED does not mean that the DSP is initialized; the DSP is initialized only when fcvdspinit is called or during FASTCV_CHECK_DSP_INIT().
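The ordering the author describes, validating the buffer first and initializing the DSP lazily only afterwards, might look like the sketch below. `DspRuntime`, `ensureInitialized`, and `runOnDsp` are hypothetical names that model the control flow only; they are not the macros or functions from this PR.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical model of lazy DSP initialization; names are illustrative.
struct DspRuntime
{
    std::mutex lock;
    bool initialized = false;
    int initCalls = 0;  // counts how often the expensive init actually ran

    // Initializes on first use and is a no-op on later calls.
    void ensureInitialized()
    {
        std::lock_guard<std::mutex> guard(lock);
        if (!initialized)
        {
            ++initCalls;  // real code would start the DSP here
            initialized = true;
        }
    }
};

// Rejects buffers from the wrong allocator before touching the DSP at all,
// so a failed argument check never triggers initialization.
inline void runOnDsp(DspRuntime& rt, bool bufferIsDspAllocated)
{
    if (!bufferIsDspAllocated)
        throw std::invalid_argument("buffer not allocated with the DSP allocator");
    rt.ensureInitialized();
    // ... DSP work would go here ...
}
```

In this arrangement a rejected call leaves the DSP untouched, and repeated valid calls pay the initialization cost only once.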

@asmorkalov
Contributor

The Java bindings generator does not handle namespaces well:

[255/3358] Generating opencv-4120.jar
FAILED: CMakeFiles/dephelper/opencv_java_jar bin/opencv-4120.jar 
cd /home/ci/build/modules/java/jar/opencv && /bin/ant -noinput -k jar && /usr/bin/cmake -E touch /home/ci/build/CMakeFiles/dephelper/opencv_java_jar
Buildfile: /home/ci/build/modules/java/jar/opencv/build.xml

jar:
    [javac] Compiling 299 source files to /home/ci/build/modules/java/jar/opencv/build/classes
    [javac] /home/ci/build/modules/java/jar/opencv/java/org/opencv/fastcv/Fastcv.java:715: error: method filter2D(Mat,Mat,int,Mat) is already defined in class Fastcv
    [javac]     public static void filter2D(Mat _src, Mat _dst, int ddepth, Mat _kernel) {
    [javac]                        ^
    [javac] /home/ci/build/modules/java/jar/opencv/java/org/opencv/fastcv/Fastcv.java:798: error: method FFT(Mat,Mat) is already defined in class Fastcv
    [javac]     public static void FFT(Mat src, Mat dst) {
    [javac]                        ^
    [javac] /home/ci/build/modules/java/jar/opencv/java/org/opencv/fastcv/Fastcv.java:818: error: method IFFT(Mat,Mat) is already defined in class Fastcv
    [javac]     public static void IFFT(Mat src, Mat dst) {
    [javac]                        ^
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] 3 errors
Target 'jar' failed with message 'Compile failed; see the compiler error output for details.'.

BUILD FAILED

If I understand correctly, the DSP part is not usable from Java code (there is no way to manage allocators in Java). I propose replacing CV_EXPORTS_W with CV_EXPORTS.

@quic-apreetam
Author

Sure @asmorkalov, I will replace CV_EXPORTS_W with CV_EXPORTS.

4 participants