Fix GH-19065: Long match statement can segfault compiler during recursive SSA renaming #19083

Open
wants to merge 1 commit into base: PHP-8.3

@nielsdos (Member) commented Jul 9, 2025

On some systems, such as Alpine, the default thread stack size is small. The last step of SSA construction renames variables recursively, and each recursive call also places its own copy of the renamed variables on the stack. Together, these cause a stack overflow during compilation on Alpine. This can be triggered, for example, by a very long match statement.
We have run into similar issues before and fixed them by converting the recursive algorithm into an iterative one, e.g. #14432.

A stop-gap solution would be to use heap-allocated arrays for the renamed variable list, but that would only delay the error: adding more match arms deepens the dominator tree, so the recursion would eventually hit the same stack limit.

Instead, this patch makes the algorithm iterative. Each entry on the worklist stack encodes one of two states: a positive number means the block still needs to undergo variable renaming, and a negative number means the block and the blocks it dominates have already been renamed. Because 0 is also a valid block number, we bias the block numbers by adding 1.
To restore the right variant when backtracking out of the "recursive" step, we index into an array of pointers to the different variable renaming variants.
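
The worklist encoding described above can be illustrated with a minimal, self-contained sketch. This is not the code from the patch: `dom_tree`, `walk_dom_tree`, `trace`, `trace_enter` and `trace_leave` are hypothetical names, the dominator tree is assumed to be stored as first-child/next-sibling arrays, and the actual renaming work is replaced by callbacks that only record the visit order.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical dominator tree layout (an assumption for this sketch):
 * the children of block b are first_child[b], then next_sibling[...]
 * until -1 is reached. */
typedef struct {
    int blocks_count;
    const int *first_child;
    const int *next_sibling;
} dom_tree;

typedef void (*visit_fn)(int block, void *ctx);

/* Walks the tree without recursion. Worklist entries are block numbers
 * biased by +1 so that block 0 can carry a sign:
 *   positive: the block still has to be processed ("enter"),
 *   negative: the block and everything it dominates is done ("leave").
 * Returns 0 on success, -1 if the worklist cannot be allocated. */
static int walk_dom_tree(const dom_tree *t, int root,
                         visit_fn enter, visit_fn leave, void *ctx)
{
    assert(root >= 0 && root < t->blocks_count);

    /* Each block is on the worklist at most once at any time. */
    int *worklist = malloc(sizeof(int) * (size_t)t->blocks_count);
    if (!worklist) {
        return -1;
    }

    int top = 0;
    worklist[top++] = root + 1;

    while (top > 0) {
        int entry = worklist[--top];
        if (entry < 0) {
            /* Backtracking step: the subtree of this block is finished. */
            leave(-entry - 1, ctx);
            continue;
        }

        int block = entry - 1;
        enter(block, ctx);

        /* Revisit this block after all of its children are done. */
        worklist[top++] = -(block + 1);
        for (int c = t->first_child[block]; c >= 0; c = t->next_sibling[c]) {
            worklist[top++] = c + 1;
        }
    }

    free(worklist);
    return 0;
}

/* Records biased block numbers: positive on enter, negative on leave. */
#define TRACE_MAX 64
typedef struct {
    int events[TRACE_MAX];
    int count;
} trace;

static void trace_enter(int block, void *ctx)
{
    trace *tr = ctx;
    assert(tr->count < TRACE_MAX);
    tr->events[tr->count++] = block + 1;
}

static void trace_leave(int block, void *ctx)
{
    trace *tr = ctx;
    assert(tr->count < TRACE_MAX);
    tr->events[tr->count++] = -(block + 1);
}
```

In this sketch the children of a block are visited in the reverse of their sibling order, because the last child pushed is the first one popped. The "leave" entry plays the role of returning from a recursive call, which is the point where a saved variable state would be restored.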

ALLOCA_FLAG(save_vars_use_heap);
unsigned int save_vars_top = 0;
unsigned int *save_positions = do_alloca(sizeof(unsigned int) * ssa->cfg.blocks_count, save_positions_use_heap);
int **save_vars = do_alloca(sizeof(int *) * (ssa->cfg.blocks_count + 1), save_vars_use_heap);

@nielsdos (Member, Author) commented Jul 9, 2025

Maybe it's possible to combine save_vars and save_positions somehow, but I haven't given it much thought yet. Probably via pointer tagging to indicate which arrays are reused and which are new allocations.
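
For readers unfamiliar with pointer tagging, here is a rough sketch of the general technique, not of anything in the patch. The names `tagged_vars`, `TAG_NEW_ALLOC`, `tag_vars`, `tagged_vars_ptr` and `tagged_vars_is_new` are hypothetical. The sketch assumes that `int` arrays are at least 2-byte aligned, so the lowest address bit is free, and that converting a pointer to `uintptr_t` and back yields the original pointer.

```c
#include <assert.h>
#include <stdint.h>

/* A pointer to a variable array with a flag stored in its lowest bit.
 * Kept as an integer so a tagged value is never dereferenced by accident. */
typedef uintptr_t tagged_vars;

#define TAG_NEW_ALLOC ((uintptr_t)1)

/* Packs the pointer and the "new allocation" flag into one word. */
static tagged_vars tag_vars(int *vars, int is_new_alloc)
{
    uintptr_t bits = (uintptr_t)vars;
    /* The low bit must be free, which holds for suitably aligned arrays. */
    assert((bits & TAG_NEW_ALLOC) == 0);
    return is_new_alloc ? (bits | TAG_NEW_ALLOC) : bits;
}

/* Strips the flag and returns the usable pointer. */
static int *tagged_vars_ptr(tagged_vars t)
{
    return (int *)(t & ~TAG_NEW_ALLOC);
}

/* Reports whether the entry was marked as a new allocation. */
static int tagged_vars_is_new(tagged_vars t)
{
    return (t & TAG_NEW_ALLOC) != 0;
}
```

The trade-off is one array instead of two, at the cost of masking on every access and a dependence on alignment guarantees.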


Maybe you could also put them together into a struct

struct {
    int *vars;
    int saved_at_top; // previous save_vars_top
}

That would save an allocation, though I'm not familiar enough with C to know whether it is actually faster.

@nielsdos nielsdos marked this pull request as ready for review July 9, 2025 20:56
@nielsdos nielsdos requested a review from dstogov as a code owner July 9, 2025 20:56
Development

Successfully merging this pull request may close these issues.

Long match statement can segfault compiler during recursive SSA renaming
2 participants