Skip to content

zend_execute: Mark zend_get_executed_*() as __attribute__((pure)) #18998

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

TimWolla
Copy link
Member

@TimWolla TimWolla commented Jul 1, 2025

These functions do not modify the state of the program and depend on thread-safe global variables only.


This is a possible precursor to #18995.

@TimWolla
Copy link
Member Author

TimWolla commented Jul 1, 2025

CI Benchmark indicates a slight slowdown. Can anyone with a “reliable CPU” do another non-valgrind check to see if this actually resulted in regressions? My CPU is notoriously bad at benchmarking.

@TimWolla TimWolla requested review from iluuu1994 and nielsdos July 1, 2025 13:24
@iluuu1994
Copy link
Member

iluuu1994 commented Jul 1, 2025

❯ b 50 --mode=cycles 'sapi/cli/php-old -n benchmark/repos/symfony-demo-2.2.3/public/index.php' 'sapi/cli/php-new -n benchmark/repos/symfony-demo-2.2.3/public/index.php'
Old:       1.694 G ± 991.360 k (0.059%) 
New:       1.696 G ± 1.041 M (0.061%)
Diff:      2.264 M (0.134%)
T-test:    5.679
P-value:   0.001

First test looks to be slower.

Edit: Second run approx. confirms.

❯ b 50 --mode=cycles 'sapi/cli/php-old -n benchmark/repos/symfony-demo-2.2.3/public/index.php' 'sapi/cli/php-new -n benchmark/repos/symfony-demo-2.2.3/public/index.php'
Old:       1.697 G ± 1.114 M (0.066%)   
New:       1.698 G ± 957.368 k (0.056%)
Diff:      1.337 M (0.079%)
T-test:    3.282
P-value:   0.008

@TimWolla
Copy link
Member Author

TimWolla commented Jul 1, 2025

image

@iluuu1994
Copy link
Member

Can relate

@TimWolla TimWolla force-pushed the zend-get-executed-scope branch from b26e8a6 to 2fb856e Compare July 2, 2025 07:01
@nielsdos
Copy link
Member

nielsdos commented Jul 2, 2025

Coming from your other PR, I see now why you submitted this.
I don't know why this worsens performance, but indeed the benchmark seems to indicate now a deviation higher than noise.
It could be many different reasons, from alignment of functions/loops to bad compiler spill/rematerialization decisions...

@iluuu1994
Copy link
Member

Given that this PR seems to lead to a unclear performance degradation, I'd prefer not to merge it. We can investigate why it happens, and potentially create a bug report if we can identify a compiler-related culprit. It may also be alignment-related, as Niels mentioned. I have thought about a way to align the function offsets of two binaries when I encountered this issue myself, given that -falign-functions and friends didn't help. One way might be to build the binaries, inspect the elf-file and calculate the necessary nops to insert, and then use llvm-bolt's --pad-funcs-before flag. Not sure whether that will work. Suggestions are welcome.

@iluuu1994
Copy link
Member

Also, for completion's sake: I tried to apply pure and const with an automated script last year:
master...iluuu1994:php-src:script-fix-const-pure-attributes

The entire PR showed a big degradation, although I don't remember by how much nor how I measured it. It would ofc be great if this unlocked some optimization opportunities. In that case, it might be better to go the automated route, rather than applying it bit-by-bit by hand.

TimWolla added 2 commits July 9, 2025 12:11
These functions do not modify the state of the program and depend on
thread-safe global variables only.
@TimWolla TimWolla force-pushed the zend-get-executed-scope branch from edf10a4 to c78e0d4 Compare July 9, 2025 10:11
@TimWolla
Copy link
Member Author

TimWolla commented Jul 9, 2025

I've rebased this one a final time and now the Symfony Benchmark indicates an improvement 🤷

@iluuu1994
Copy link
Member

I will re-run my benchmarks later today.

@nielsdos
Copy link
Member

nielsdos commented Jul 9, 2025

Let's wait for Ilija's benchmark, but I'm in favour of doing this.

@iluuu1994
Copy link
Member

iluuu1994 commented Jul 9, 2025

Based on cycles counted by perf, I still see a significant regression on Symfony Demo:

// Two separate runs
Δ = 1.173 M (+0.223%, p < 0.001)
Δ = 1.040 M (+0.198%, p < 0.001)

And a significant improvement on Zend/bench.php:

// Two separate runs
Δ = -7.186 M (-0.649%, p < 0.001)
Δ = -8.979 M (-0.812%, p < 0.001)

And a huge regression on Zend/micro_bench.php:

// Two separate runs
Δ = 138.389 M (+2.754%, p < 0.001)
Δ = 120.192 M (+2.390%, p < 0.001)

It does seem like this is most likely layout related. It would be really great to invest some time into this idea to hopefully solve this once and for all. Otherwise we keep wasting hours with people benchmarking and following false leads, or potentially dropping good changes. Alternatively we might try to run LLVM bolt on both binaries, but it's hard to tell how fair this comparison will be. I have now posed this question on the LLVM #bolt Discord channel. If you have other ideas, please share.

@TimWolla TimWolla marked this pull request as ready for review July 10, 2025 06:46
@TimWolla TimWolla requested a review from dstogov as a code owner July 10, 2025 06:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants