Quantify the loss of functionality at each translation step #62
Comments
Notes from sync today:
Can answering the question of "do we lift correctly" ultimately be automated, maybe through differential testing between the resulting binaries? |
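A rough sketch of what the differential-testing idea could look like: run the original and the re-lifted program on the same inputs and flag any difference in stdout or exit status. The function names `run_one` and `diff_test` are hypothetical and not part of Patchestry. The sketch assumes both programs can be invoked on the host and read their input from stdin; for Cortex-M firmware that would need an emulator or hardware harness wrapped behind each command, which is outside this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical differential-testing helpers (illustrative names only).

# run_one PROGRAM INPUT_FILE
# Prints the exit status on the first line, followed by the program's stdout.
run_one() {
  local prog=$1 input=$2 out status=0
  out=$("$prog" < "$input") || status=$?
  printf '%s\n%s\n' "$status" "$out"
}

# diff_test ORIGINAL RELIFTED INPUT_FILE...
# Returns 0 if both programs behave identically on every input, 1 otherwise.
diff_test() {
  local original=$1 relifted=$2 input failures=0
  shift 2
  for input in "$@"; do
    if [ "$(run_one "$original" "$input")" != "$(run_one "$relifted" "$input")" ]; then
      echo "MISMATCH on $input" >&2
      failures=$((failures + 1))
    fi
  done
  [ "$failures" -eq 0 ]
}
```

Matching output on a handful of inputs only shows the two builds agree on those inputs; it says nothing about paths the inputs never exercise, so the input corpus matters as much as the harness.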
Looks like PulseOX builds but Bloodlight doesn't, on the |
Looks like Bloodlight didn't build by default, but it did once I locally changed its build command in the script to:
docker run --rm \
  -v "/home/kellykaoudis/patchestry/firmwares/repos/bloodlight-firmware":"/work/bloodlight-firmware" \
  -v "/home/kellykaoudis/patchestry/firmwares/output":"/output" \
  firmware-builder bash -c "git config --global --add safe.directory /work/bloodlight-firmware && \
  cd bloodlight-firmware && \
  make -C firmware/libopencm3 && \
  make -C firmware -j8 && \
  cp firmware/bloodlight-firmware.elf /output/bloodlight-firmware.elf"
I'm not a huge fan of using a build container with a single command like this since it's kinda messy: these are 3-4 commands that could be separate RUN commands in a Dockerfile instead of sitting "loose" in a string executed by a bash shell spawned from another bash script. I would prefer to have one base image and two child builds that inherit from it. Might clean that up, honestly. This is annoying since it's not oneshot and git yelled about ambiguous ownership, but it might also be that I'm the only one with this issue. |
In decompiling to pcode, I observed the following errors for pulseox: ./scripts/ghidra/decompile-headless.sh --input firmwares/output/pulseox-firmware.elf --output firmwares/output/pulseox.pcode yielded, among output that seemed correct,
|
I also observed errors for bloodlight, but much different: ./scripts/ghidra/decompile-headless.sh --input firmwares/output/bloodlight-firmware.elf --output firmwares/output/bloodlight.pcode
and ultimately
|
Lifting seems like it maybe worked for pulseox, the following command made a .cir and a .c file where directed for pulseox: builds/default/tools/pcode-lifter/Debug/pcode-lifter --input firmwares/output/pulseox.pcode --emit-cir --output firmwares/output/pulseox-lifted --print-tu |
Butttttt bloodlight has some problems: builds/default/tools/pcode-lifter/Debug/pcode-lifter --input firmwares/output/bloodlight.pcode --emit-cir --output firmwares/output/bloodlight-lifted --print-tu yields the following:
|
Making p-code is sort of the equivalent of making a whole-program statically compiled output I suppose, so I'm not really sure how to compile the C down again the way it originally was built. The Patchestry proposal was hella light on build-environment details. So I guess I'll try the dumbest way possible (all deps are there so maybe just
Moreover, what about a build made on a specialized build machine to which I don't have access? How would or could I go about ever reproducing that environment, or even reverse engineering what it would have been? |
For
|
Working on figuring out if there's anything "cheap" I can do about Bloodlight not extracting properly |
Also have been taking notes on this in Slack. So far I have three problems that feed into "is the pcode we make accurate, and how accurate":
For (1), as a short-term hack I used file and readelf in the headless entrypoint script to better identify the architecture in question (e.g. pulseox was being decompiled as firmware for an Armv8 device, while readelf says it's an Armv6-M device with EABI; bloodlight was misidentified too, and more divergently, which is why it doesn't fully extract for me). I think this impacts how accurate our pcode can be.
For (2) I'm slowly figuring out how to get some unit tests for PatchestryDecompileFunctions wedged in as an option to the decomp scripts and container, the goal being to at least be able to tell if we roughly are doing what we think we are doing and maybe even to add some property-based testing if I am very lucky. Maybe there's a way I can integrate this flow with LIT (which I do see runs tests of some sort right now, but apparently not on the decomp functionality), but I want to surface test-running in the same way / environment the code is run, rather than making the tests hard to find, at the very least.
For (3) I've been working with the doc https://ghidra.re/ghidra_docs/api/ and with Claude to try to get some reasonable background and context |
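For reference, a minimal sketch of the kind of readelf check described for (1), not the actual change on the branch. The helpers read readelf output on stdin so they can be exercised without a firmware image. All function names are hypothetical. The mapping to the Ghidra language ID `ARM:LE:32:Cortex` is an assumption (it also assumes a little-endian image) and should be checked against the language definitions shipped with the installed Ghidra before passing it to analyzeHeadless via `-processor`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for picking a Ghidra language from ELF metadata.

# elf_machine: reads `readelf -h` output on stdin, prints the Machine field.
elf_machine() {
  sed -n 's/^[[:space:]]*Machine:[[:space:]]*//p' | head -n 1
}

# arm_cpu_arch: reads `readelf -A` output on stdin, prints Tag_CPU_arch.
# The trailing colon in the pattern keeps Tag_CPU_arch_profile from matching.
arm_cpu_arch() {
  sed -n 's/^[[:space:]]*Tag_CPU_arch:[[:space:]]*//p' | head -n 1
}

# guess_language_id MACHINE CPU_ARCH
# Prints an assumed Ghidra language ID for M-profile Arm parts, or an empty
# line when unsure, in which case Ghidra's own detection is left alone.
guess_language_id() {
  local machine=$1 arch=$2
  case "$machine:$arch" in
    ARM:v6-M|ARM:v6S-M|ARM:v7-M|ARM:v7E-M) echo "ARM:LE:32:Cortex" ;;
    *) echo "" ;;
  esac
}

# Usage (requires binutils readelf):
#   machine=$(readelf -h firmwares/output/pulseox-firmware.elf | elf_machine)
#   arch=$(readelf -A firmwares/output/pulseox-firmware.elf | arm_cpu_arch)
#   guess_language_id "$machine" "$arch"
```

Returning an empty result for anything unrecognized is deliberate: a wrong forced language is the failure mode this is meant to avoid, so the fallback is to leave auto-detection in place rather than guess.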
Started a branch in service of this work: https://github.com/lifting-bits/patchestry/tree/kaoudis/pcode-gen-correctness-tests |
An issue discovered with p-code generation through this work is fixed in #86 |
We lose some information at each step of High Pcode -> Clang IR -> Tower of IRs -> LLVM -> Binary
There should be some way to quantify this loss, possibly by measuring re-translation on Clang's compilation test suite.