Performance analysis #176

@Ptival

llvm-disasm is noticeably slower than its LLVM counterpart. This issue tracks data about that performance gap and attempts to close it.

This is the current flame graph of running llvm-disasm on a ~30MB PX4 bitcode file generated by Clang 12:

image

The time is split almost evenly between parsing and pretty-printing.

Parsing

image

It seems like one third of the time is spent gobbling bytes from the input stream. Perhaps one could check whether there is something inherently inefficient in the way we consume input.
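As a baseline to compare against when checking this, here is a minimal sketch of a fixed-width bit reader that works directly on a strict ByteString, with no parser monad in between. It is not llvm-disasm's actual reader: `BitCursor` and `readFixed` are hypothetical names, and the only assumption about the format is that fields are read least-significant bit first, as in the LLVM bitstream.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import qualified Data.ByteString as BS
import Data.Word (Word64)

-- | Hypothetical cursor: the whole input plus an absolute bit offset.
data BitCursor = BitCursor
  { bcInput  :: !BS.ByteString
  , bcBitPos :: !Int
  }

-- | Read a fixed-width field of n bits (0 <= n <= 32), least-significant
-- bit first. Returns Nothing if n is out of range or the input is too short.
readFixed :: Int -> BitCursor -> Maybe (Word64, BitCursor)
readFixed n (BitCursor bs pos)
  | n < 0 || n > 32            = Nothing
  | pos + n > 8 * BS.length bs = Nothing
  | otherwise                  = Just (value, BitCursor bs (pos + n))
  where
    byteOff = pos `shiftR` 3              -- index of the first byte touched
    bitOff  = pos .&. 7                   -- bit offset inside that byte
    nBytes  = (bitOff + n + 7) `shiftR` 3 -- at most 5 bytes for n <= 32
    window  = BS.take nBytes (BS.drop byteOff bs)
    -- Assemble the window as a little-endian integer (first byte lowest).
    raw :: Word64
    raw = BS.foldr (\b acc -> (acc `shiftL` 8) .|. fromIntegral b) 0 window
    value = (raw `shiftR` bitOff) .&. ((1 `shiftL` n) - 1)
```

On the bitcode magic bytes `0x42 0x43 0xC0 0xDE`, reading 8 bits at offset 0 yields `0x42`, and reading 4 bits at bit offset 20 yields `0xC`. If a reader like this is much faster than the current one on the same file, the overhead is more likely in the parsing layers than in the raw byte access.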

The other two thirds seem dominated by parseModuleBlock:

image

From left to right, the columns correspond to Data.LLVM.BitCode.IR.Function.finalizeStmt, Data.Generics.Uniplate.Internal.Data.descendBiData, and Data.LLVM.BitCode.IR.Function.parseFunctionBlockEntry.

Pretty-printing

(NOTE: this graph was obtained from a branch where I replaced the pretty package with prettyprinter, to see if it made a difference. On its own, the switch does not change much.)

image

On the pretty-printing side, we only see details of two cost centres: Prettyprinter.Render.Text.renderIO, and Text.LLVM.PP.ppModule.

The former seems unavoidable; the latter breaks down as follows:

image

I haven't looked into this yet, but ppDebugLoc' and ppDebugInfo' clearly account for a large share of the time. I wonder if there's something to look into there.

Going back up a step, the pretty-printing flame graph also has a huge, unlabeled portion. My guess is that it corresponds to Prettyprinter.layoutPretty, since this is definitely called, doesn't appear elsewhere, and is likely computation-heavy. I'm not sure why it is not reported, though.
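One possible explanation is that automatic cost centres are only added to the modules of the package being profiled, so functions from dependencies show up only if those dependencies are built with their own cost centres. A way to test the guess is to wrap the call site in a manual SCC annotation, as in this minimal sketch (`layoutProfiled` and `renderProfiled` are hypothetical names, and `defaultLayoutOptions` stands in for whatever options the branch actually uses):

```haskell
import qualified Data.Text as T
import Prettyprinter
import Prettyprinter.Render.Text (renderStrict)

-- | Layout step with an explicit cost centre. GHC attributes the cost of
-- evaluating a thunk to the cost-centre stack in effect where the thunk was
-- created, so the lazily produced stream should still be charged here.
-- Without -prof the pragma is ignored.
layoutProfiled :: Doc ann -> SimpleDocStream ann
layoutProfiled doc =
  {-# SCC "layoutPretty" #-} (layoutPretty defaultLayoutOptions doc)

-- | Render to strict Text so the result is easy to inspect.
renderProfiled :: Doc ann -> T.Text
renderProfiled = renderStrict . layoutProfiled
```

If the unlabeled portion turns into a `layoutPretty` bar after this change, the guess is confirmed. Raising the profiling detail for the prettyprinter package in the cabal project configuration would be another route, at the cost of a noisier profile.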

As mentioned, I tried to replace pretty with prettyprinter, and it's not much faster, even with an unbounded-width layout. It definitely goes faster when using renderCompact, but that one has very ugly output. I wonder if there's something we can do in between, given there's barely any interesting pretty-printing going on in our output.
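For reference, the two ends of that spectrum can be sketched as follows. This assumes the fast variant is prettyprinter's `layoutCompact` (presumably what is called renderCompact above); `renderUnbounded`, `renderCompactLayout`, and `sampleDoc` are hypothetical names used only for illustration.

```haskell
import qualified Data.Text as T
import Prettyprinter
import Prettyprinter.Render.Text (renderStrict)

-- | Full layout algorithm, but with no page-width limit.
renderUnbounded :: Doc ann -> T.Text
renderUnbounded =
  renderStrict . layoutPretty (LayoutOptions { layoutPageWidth = Unbounded })

-- | Compact layout: no fitting decisions, and all indentation is dropped.
renderCompactLayout :: Doc ann -> T.Text
renderCompactLayout = renderStrict . layoutCompact

-- | A tiny stand-in for a function body, with one nested line.
sampleDoc :: Doc ann
sampleDoc =
  nest 2 (vsep [pretty "define void @f() {", pretty "ret void"])
    <> line
    <> pretty "}"
```

`renderUnbounded sampleDoc` keeps the two-space indent before `ret void`, while `renderCompactLayout sampleDoc` prints `ret void` flush left. Lost indentation is one visible part of the "ugly" output; anything wrapped in `group` is also laid out in its non-flattened form by the compact layouter.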
