Conversation

@ChrisRackauckas
Member

This should reduce precompilation

@topolarity
Contributor

Shaves off 10-15s of pre-compilation time on my machine (from ~4m7s)

Master trace: https://topolarity.github.io/trace-viewer/?trace=https%3A%2F%2Fraw.githubusercontent.com%2Ftopolarity%2Ftracy-traces%2Fdump%2Fdump%2Fmaster-OrdinaryDiffEq-cc46ec8b.tracy&size=2713067
With this PR: https://topolarity.github.io/trace-viewer/?trace=https%3A%2F%2Fraw.githubusercontent.com%2Ftopolarity%2Ftracy-traces%2Fdump%2Fdump%2Finit_dt_split-OrdinaryDiffEq-1328a0c3.tracy&size=2683730

Of the four-ish minutes, the dominant components are:

  • module-level opt/codegen (1m30s)
  • per-function JIT (1m6s)
  • inference (44s)
  • lowering (22s)

Here are the heaviest per-function JIT hitters after this PR (the number in parentheses is the specialization count):
[image: table of per-function JIT times with specialization counts]

@topolarity
Contributor

Inference time is >80% dominated by 64 specializations of solve (which later gets split into all the JIT-ed functions above):
[image: inference time breakdown across the solve specializations]

That var"#lorenz#673" jumps out at me. It seems like we should not be specializing on the system closure function?
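If the goal is to avoid specializing on the user's system function, SciMLBase's specialization levels are the user-facing knob. A minimal sketch, assuming the documented SciMLBase API (the lorenz! definition here is illustrative, not this PR's precompile workload):

```julia
# Sketch: the second type parameter of ODEProblem controls how much the
# solver stack specializes on the user's right-hand-side function.
using OrdinaryDiffEq, SciMLBase

function lorenz!(du, u, p, t)
    du[1] = 10.0 * (u[2] - u[1])
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end

u0 = [1.0, 0.0, 0.0]
tspan = (0.0, 100.0)

# AutoSpecialize (the default) wraps the rhs so solver internals compile
# against a shared wrapper type; NoSpecialize avoids rhs specialization
# entirely, trading some runtime speed for less compilation.
prob = ODEProblem{true, SciMLBase.NoSpecialize}(lorenz!, u0, tspan)
sol = solve(prob, Tsit5())
```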

@ChrisRackauckas
Member Author

solve will specialize, but then internally it'll put a function wrapper on it and then it should stop specializing on it from that point?
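The wrapper idea can be illustrated with FunctionWrappers.jl, which underlies SciMLBase's AutoSpecialize path. This standalone sketch shows the type-erasure mechanism, not OrdinaryDiffEq's internal code:

```julia
# Sketch: two distinct closures erase to one concrete wrapper type, so
# downstream callees compile once instead of once per closure.
import FunctionWrappers: FunctionWrapper

f1 = x -> 2x
f2 = x -> x + 1

w1 = FunctionWrapper{Float64, Tuple{Float64}}(f1)
w2 = FunctionWrapper{Float64, Tuple{Float64}}(f2)

@assert typeof(w1) === typeof(w2)  # same concrete type despite different closures
@assert w1(3.0) == 6.0
@assert w2(3.0) == 4.0
```

Anything compiled against the `FunctionWrapper{Float64, Tuple{Float64}}` type is reused for every function stored in such a wrapper, which is why specialization should stop past the wrapping point.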

@topolarity
Contributor

> solve will specialize, but then internally it'll put a function wrapper on it and then it should stop specializing on it from that point?

I guess we'd expect future inference times for solve() with other systems to be much lower, then, since they'll re-use callee results on the wrapper type (even though the entrypoint will be unique for each system).

In that case, my first-glance takeaway from the JIT list is that no single function dominates the pre-compilation time (perform_step! is the heaviest hitter and accounts for only ~15% of the JIT time). That means we either need to reduce the specialization count N for a large number of these functions, or improve things upstream (e.g. by parallelizing this work) if we want to make a bigger dent.
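Per-method inference cost like the tables above can be collected with SnoopCompile's inference profiler. A hedged sketch following SnoopCompile's documented workflow (macro and function names may differ across versions; the toy scalar problem is just to keep it self-contained):

```julia
# Sketch: record inference timing while solving, then list the most
# expensive method instances.
using SnoopCompileCore
using OrdinaryDiffEq

prob = ODEProblem((u, p, t) -> 1.01u, 0.5, (0.0, 1.0))
tinf = @snoopi_deep solve(prob, Tsit5())

using SnoopCompile   # analysis tools; loaded after snooping so they aren't measured
fl = flatten(tinf)   # flat list of per-method-instance inference timings
fl[end-9:end]        # the ten most expensive entries
```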
