Skip to content

Commit 3f922c3

Browse files
authored
DeadArgumentElimination: Skip unprofitable single-call chains (#8072)
If a codebase has a long chain of single calls a -> b -> c -> d then we can end up in a very slow and unprofitable situation, removing params from c's call to d, which then means c does not use some params, and then we go back and so forth. Each step back requires a full scan of the code. We could develop a more sophisticated IR to handle this, but it would need to track that the local.get of the incoming param is only used in an outgoing param, etc., which is not trivial, and these chains are unprofitable in another way: single calls like this are inlined anyhow, making this work redundant. For simplicity, this PR detects the most obvious case of such a chain - that an iteratiion of work only found a single single-caller call to remove params from - and stops removing params from that point. This makes this pass 30% faster on a large Dart testcase, which makes -O3 9% faster. On other large wasm files I see much smaller benefits, but it helps sometimes there too. This is not truly NFC as while inlining will handle this case after, we do alter the order a bit, leading to slightly different outputs sometimes. (The changes are not better or worse, just noise.)
1 parent 5efa505 commit 3f922c3

File tree

1 file changed

+68
-26
lines changed

1 file changed

+68
-26
lines changed

src/passes/DeadArgumentElimination.cpp

Lines changed: 68 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -233,6 +233,23 @@ struct DAE : public Pass {
233233
// and then use that possibly-over-approximating data.
234234
std::vector<std::vector<Name>> callers;
235235

236+
// A count of how many iterations we saw unprofitable removals of parameters.
237+
// An unprofitable removal is one where we only manage to remove from a single
238+
// call, that is, from one call target and it has a single call going to it.
239+
// Such calls are not very interesting, as when there is a single call like
240+
// that then inlining will handle it anyhow, in most cases, and inlining
241+
// does so far more efficiently in situations of call chains:
242+
//
243+
// a -> b -> c -> d
244+
//
245+
// Imagine we remove a param from d, and so we remove it from the call in c.
246+
// If c received that as a parameter, and only ever used it to call d, then
247+
// now we can remove a param from c, and from the call in b, and so forth -
248+
// requiring a full iteration each time to find the small amount of progress.
249+
// (Inlining, otoh, will inline b into a, then c into a, and d into a,
250+
// efficiently.)
251+
Index unprofitableRemovalIters = 0;
252+
236253
bool iteration(Module* module, DAEFunctionInfoMap& infoMap) {
237254
allDroppedCalls.clear();
238255

@@ -381,33 +398,58 @@ struct DAE : public Pass {
381398
ReFinalize().run(getPassRunner(), module);
382399
}
383400
// We now know which parameters are unused, and can potentially remove them.
384-
for (Index index = 0; index < numFunctions; index++) {
385-
auto* func = module->functions[index].get();
386-
if (func->imported()) {
387-
continue;
388-
}
389-
if (hasUnseenCalls[index]) {
390-
continue;
391-
}
392-
auto numParams = func->getNumParams();
393-
if (numParams == 0) {
394-
continue;
395-
}
396-
auto& calls = allCalls[index];
397-
if (calls.empty()) {
398-
continue;
399-
}
400-
auto name = func->name;
401-
auto [removedIndexes, outcome] = ParamUtils::removeParameters(
402-
{func}, infoMap[name].unusedParams, calls, {}, module, getPassRunner());
403-
if (!removedIndexes.empty()) {
404-
// Success!
405-
worthOptimizing.insert(func);
406-
markStale(name);
407-
markCallersStale(index);
401+
// Only do so if we didn't run into unprofitable removals - if so, leave
402+
// any further removals for other invocations of this pass. (This avoids us
403+
// getting stuck in long unprofitable call chains as mentioned in the
404+
// comment earlier; note that we do process one unprofitable iteration
405+
// before giving up here, so we do make progress at least.)
406+
if (!unprofitableRemovalIters) {
407+
Index removals = 0;
408+
Index singleCallerRemovals = 0;
409+
for (Index index = 0; index < numFunctions; index++) {
410+
auto* func = module->functions[index].get();
411+
if (func->imported()) {
412+
continue;
413+
}
414+
if (hasUnseenCalls[index]) {
415+
continue;
416+
}
417+
auto numParams = func->getNumParams();
418+
if (numParams == 0) {
419+
continue;
420+
}
421+
auto& calls = allCalls[index];
422+
if (calls.empty()) {
423+
continue;
424+
}
425+
auto name = func->name;
426+
auto [removedIndexes, outcome] =
427+
ParamUtils::removeParameters({func},
428+
infoMap[name].unusedParams,
429+
calls,
430+
{},
431+
module,
432+
getPassRunner());
433+
if (!removedIndexes.empty()) {
434+
// Success!
435+
worthOptimizing.insert(func);
436+
markStale(name);
437+
markCallersStale(index);
438+
if (calls.size() == 1) {
439+
singleCallerRemovals++;
440+
}
441+
removals++;
442+
}
443+
if (outcome == ParamUtils::RemovalOutcome::Failure) {
444+
callTargetsToLocalize.insert(name);
445+
}
408446
}
409-
if (outcome == ParamUtils::RemovalOutcome::Failure) {
410-
callTargetsToLocalize.insert(name);
447+
if (removals == 1 && singleCallerRemovals == 1 &&
448+
callTargetsToLocalize.empty()) {
449+
// We only removed parameters from one function, and it had a single
450+
// caller, and we don't have other pending actions (call targets we
451+
// need to localize), so this was unprofitable as mentioned earlier.
452+
unprofitableRemovalIters++;
411453
}
412454
}
413455
// We can also tell which calls have all their return values dropped. Note

0 commit comments

Comments
 (0)