DeadArgumentElimination: Skip unprofitable single-call chains #8072
If a codebase has a long chain of single calls (for example, a calls b, which
calls c, which calls d, where each function is the only caller of the next),
then we can end up in a very slow and unprofitable situation: we remove params
from c's call to d, which then means c no longer uses some of its own params,
so we go back and process c, and so forth. Each step back requires a full scan
of the code. We could develop a more sophisticated IR to handle this, but it
would need to track that the local.get of the incoming param is only used in
an outgoing param, etc., which is not trivial (1), and these chains are
unprofitable in another way: single calls like this are inlined anyhow,
making this work redundant.
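As a minimal hypothetical illustration (not taken from this PR or its tests), here is C++ source that would compile to such a chain of single calls, assuming each function ends up as its own wasm function:

```cpp
// d only uses x, so the pass can drop y from d. That makes c's y unused
// (its only use was forwarding it to d), but that is only discovered on the
// next full scan, and likewise for b and then a: one full scan per link.
static int d(int x, int y) { return x * 2; }
static int c(int x, int y) { return d(x, y); }
static int b(int x, int y) { return c(x, y); }
static int a(int x, int y) { return b(x, y); }
```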
For simplicity, this PR detects the most obvious case of such a chain -
that an iteration of work found only a single call (to a single-caller
function) to remove params from - and stops removing params from that point.
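A rough sketch of that stopping condition follows; the struct and function names here are assumptions for illustration, not the actual Binaryen code:

```cpp
#include <vector>

// Hypothetical summary of a function whose calls had params removed in the
// current iteration (not the real data structure used by the pass).
struct RemovalInfo {
  int numCallers; // number of call sites that target this function
};

// If the iteration's only progress was removing params via a single call to
// a single-caller function, we are just crawling up a chain that inlining
// will collapse anyway, so stop iterating.
bool shouldKeepIterating(const std::vector<RemovalInfo>& removedThisIteration) {
  if (removedThisIteration.empty()) {
    return false; // fixed point: nothing changed this iteration
  }
  if (removedThisIteration.size() == 1 &&
      removedThisIteration[0].numCallers == 1) {
    return false; // the unprofitable single-call chain case
  }
  return true;
}
```

When that check fires, the remaining params are still cleaned up later, since single calls like these are inlined anyhow.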
This makes this pass 30% faster on a large Dart testcase, which makes -O3
9% faster overall. On other large wasm files I see much smaller benefits,
but it helps sometimes there too.
This is not truly NFC: while inlining will handle this case later anyhow,
we do alter the order of operations a bit, leading to slightly different
outputs sometimes. (The changes are not better or worse, just noise.)
Diff without whitespace is smaller.
(1) We would need to track such things through all the other operations
this pass does, like optimization. I tinkered with this in various ways,
but it ends up far more complex. I also tried other options here: inlining
is fast, but on large C++ cases the reverse is true, leading to regressions.
Another option is to act as soon as we find something, rather than wait for
a whole new iteration of the pass. This is complex, though, as we need to
update the IR as we go.