Switching on CNTVCT for ThunderX2 platform and setting some optimization options. #150

Open
wants to merge 31 commits into
base: master

@aur-ml aur-ml commented Jul 31, 2018

Tested on ThunderX2 servers: performance on the 2^20 i-FFT problem (ib1048576) improves by about 20-25%.
The proposed changes do not affect other hardware architectures.
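
For readers unfamiliar with CNTVCT: it is the ARMv8 virtual counter, readable from user space as the `CNTVCT_EL0` register, and it can serve as a tick source for timing. The following is a minimal sketch only, loosely following the `ticks`/`getticks()` naming convention of a cycle-counter header; it is not the code from this PR. The `clock_gettime` fallback for non-AArch64 builds is an assumption added so the sketch compiles elsewhere.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime in strict C modes */
#endif
#include <stdint.h>
#include <time.h>

typedef uint64_t ticks;

/* Return a monotonically increasing tick count. */
static inline ticks getticks(void)
{
#if defined(__aarch64__) && defined(__GNUC__)
    uint64_t rt;
    /* Read the ARMv8 virtual counter register. */
    __asm__ volatile("mrs %0, CNTVCT_EL0" : "=r"(rt));
    return rt;
#else
    /* Assumed fallback for other architectures: nanoseconds since an
       arbitrary start point. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}

/* Tick difference as a double, convenient for comparing timings. */
static inline double elapsed(ticks t1, ticks t0)
{
    return (double)(t1 - t0);
}
```

A caller would bracket the code being timed with two `getticks()` calls and pass the results to `elapsed()`. Note that the virtual counter runs at a fixed frequency that is generally lower than the CPU clock, so the values are ticks rather than CPU cycles.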

@aur-ml aur-ml changed the title Switching on CNTVCT for ThunderX platform and setting some optimization options. Switching on CNTVCT for ThunderX2 platform and setting some optimization options. Aug 30, 2018
AX_CHECK_COMPILER_FLAGS(-fsched-critical-path-heuristic, CFLAGS="$CFLAGS -fsched-critical-path-heuristic")
AX_CHECK_COMPILER_FLAGS(-fsel-sched-pipelining, CFLAGS="$CFLAGS -fsel-sched-pipelining")
AX_CHECK_COMPILER_FLAGS(-fselective-scheduling, CFLAGS="$CFLAGS -fselective-scheduling")
AX_CHECK_COMPILER_FLAGS(-ftrapping-math, CFLAGS="$CFLAGS -ftrapping-math")
Contributor

Are all of these flags necessary for the performance improvements, or is it only one or two of them that really matters?

Author

> Are all of these flags necessary for the performance improvements, or is it only one or two of them that really matters?

Of about a hundred flags tested, only those with a statistically significant effect on performance on the target platform were kept.
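
The thread does not say which statistical test was used. As one possible illustration, a flag's effect could be judged by comparing repeated timings with and without it using Welch's t statistic. The sketch below is an assumption, not the author's method; the function names are hypothetical. It returns the squared statistic, which can be compared against the square of a chosen critical value.

```c
#include <stddef.h>

/* Arithmetic mean of n samples. */
static double mean(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s / (double)n;
}

/* Unbiased sample variance (divides by n - 1); requires n >= 2. */
static double sample_var(const double *x, size_t n, double m)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += (x[i] - m) * (x[i] - m);
    return s / (double)(n - 1);
}

/* Squared Welch t statistic for two independent timing samples.
   Hypothetical helper: a = timings without the flag, b = with it.
   Assumes at least one sample has nonzero variance. */
static double welch_t_squared(const double *a, size_t na,
                              const double *b, size_t nb)
{
    double ma = mean(a, na), mb = mean(b, nb);
    double va = sample_var(a, na, ma), vb = sample_var(b, nb, mb);
    double d = ma - mb;
    return (d * d) / (va / (double)na + vb / (double)nb);
}
```

A large value suggests the difference in mean run time is unlikely to be noise. The exact threshold depends on the chosen significance level and the degrees of freedom.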

Author

Here are charts for several benchFFT tests comparing the optimized and default versions:
https://github.com/aur-ml/fftw3/wiki

@ashwinyes

@stevengj, could you please review the code?

Please merge if the PR looks OK. Otherwise, please let @aur-ml know what changes need to be made.

Thank you

3 participants