Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

how to compute my program hash? #58

Open
tdelabro opened this issue Dec 19, 2024 · 17 comments
Open

how to compute my program hash? #58

tdelabro opened this issue Dec 19, 2024 · 17 comments

Comments

@tdelabro
Copy link
Contributor

swiftness verify returns the program's hash.
Looking at the implementation:

        let program: Vec<&Felt> = memory
            .iter()
            .skip(initial_pc.to_bigint().try_into()?)
            .step_by(2)
            .take((program_end_pc - FELT_1).to_bigint().try_into()?)
            .collect();

        let hash = program.iter().fold(FELT_0, |acc, &e| pedersen_hash(&acc, e));
        let program_hash = pedersen_hash(&hash, &Felt::from(program.len()));

It looks like it is building the based on the program segment present in the public inputs.
When debugging, program.len() was set to 538.

When I look to my public-input.json generated by cairo1-run I find:

    "program": {
      "begin_addr": 1,
      "stop_ptr": 56
    },

So this doesn't match.

What part of the files generated during the proving of my program should I use to compute my program hash and check that it does match the one outputted by the verifier?

@tdelabro
Copy link
Contributor Author

tdelabro commented Dec 19, 2024

Okay, I get it now:
You start from program.begin_addr and go up to execution.begin_addr -2.

    "execution": {
      "begin_addr": 541,
      "stop_ptr": 731
    },

So 1 -> 541-2 (= 538, exclusive).

I guess I need someone to explain to me what is stop_ptr if not the end of the segment?

@tdelabro
Copy link
Contributor Author

If you don't mind I will factorize the logic to compute program hash and expose it as some utils method.
Do you know if another tool in the ecosystem provides the same functionality?

@Okm165
Copy link
Contributor

Okm165 commented Dec 20, 2024

The only relevant implementation in the ecosystem of pub mem validation for cairo0 python-vm
https://github.com/starkware-libs/cairo-lang/blob/master/src/starkware/cairo/cairo_verifier/layouts/all_cairo/cairo_verifier.cairo#L61-L143

@Okm165
Copy link
Contributor

Okm165 commented Dec 20, 2024

Yeah the program is held between program start_ptr and execution start_ptr ( - 2 cells due to function stack of cairo cpu )
I find it also confusing tbh ideally program is defined between program start_ptr and end_ptr.
If you get info about why it is defined this way let me know

@tdelabro
Copy link
Contributor Author

Yup I got this explanation from @LandauRaz
The values in public-input.json (start_ptr and end_ptr) are for the compiled program's "execution mode". However, when executing in "proof mode," more code is added, which is reflected by this longer span of the program segment.

@tdelabro
Copy link
Contributor Author

After asking internally, it seems that there is no lib nor tool in the ecosystem that allows to compute a program hash.
They also recommended not to overcharge already existing big repositories with this additional logic, therefore I will create a standalone rust lib/executable to do just that.
When ready and tested, you will be able to use it as a dependency too if you want

@Okm165
Copy link
Contributor

Okm165 commented Dec 20, 2024

Can u tell more about standalone rust lib/executable interface?
Is it something that will accept public memory and just calculate program hashes and/or output hash?
If so the swiftness can return raw public memory that can be passed to standalone rust lib/executable to process and/or extract relevant information.

@tdelabro
Copy link
Contributor Author

tdelabro commented Dec 20, 2024

3 actors: Bob, Alice and Charlie.
Bob creates a program.
Bob gives Charlie a program hash and expected output.
Bob gives Alice the executable or code of the program.
Alice executes the program with her own secret inputs. Generate the trace. Generate the proof.
Alice sends the proof to Charlie.
Charlie verifies the proof, and verifies the proof's program hash and output against the one given to him by Bob.
If all of this is correct, Charlie will allow some business logic to be executed. (eg. could be allowing Alice to use Bob's assets)

In this design, Bob should be able to generate the program hash without knowing Alice's private inputs.
So I have to have another set of valid public inputs that will allow him to execute the program to it's end in "proof mode" with cairo1-run, hence generating a valid air_public_input, allowing him to compute the program hash from the public-input.json file.

So in this design, the input for the standalone rust lib/executable is the air_public_input file outputted by cairo1-run

It would be better if Bob was able to compute the program hash without having to run it successfully, but as for now, it looks like it cannot get the full "proof mode" program hash, without executing it to the end.

Edit: actually he doesn't need valid inputs, as long as his program doesn't panic and just returns an error (Sierra's provable failure). Afaik the program hash will be the same, regardless of the execution outcome.

@tdelabro
Copy link
Contributor Author

tdelabro commented Dec 20, 2024

@Okm165 I think your implementation of the program hash computation is buggy.

Here is the content of my public-input.json:

{
  ...
  "memory_segments": {
   ...
    "program": {
      "begin_addr": 1,
      "stop_ptr": 56
    },
   ...
    "execution": {
      "begin_addr": 541,
      "stop_ptr": 731
    },
    ...
  },
  "public_memory": [
    {
      "address": 1,
      "value": "0x40780017fff7fff",
      "page": 0
    },
    {
      "address": 2,
      "value": "0x3",
      "page": 0
    },
    ... contiguous memories cell
    {
      "address": 534,
      "value": "0x208b7fff7fff7ffe",
      "page": 0
    },
    {
      "address": 535,
      "value": "0x208b7fff7fff7ffe",
      "page": 0
    },
    {
      "address": 731,
      "value": "0x0",
      "page": 0
    },
    {
      "address": 732,
      "value": "0x1",
      "page": 0
    },
    {
      "address": 733,
      "value": "0xbabecafe",
      "page": 0
    },
    ...
  ],
  ...
}

The start of the program is at address 1, and we take initial_fp - FELT_3 elements, which is 538 elements. That's the value was getting when debugging program.len().
But if you look at the content of the memory, there is no 538 contiguous memory cells for the program. There is 535 of those, and then the next public memory value has address 731.
addresses 731, 731, 733 are my public outputs
their values: 0, 1, 0xbabecafe
they are:
something_I_don't_know_about, n_public_outputs, the_single_value_my_program_is_returning

With the current state of the program, those 3 values are included in the computation of the program hash. But the output could be input dependent. So not at all part of the program. Coz with two different private inputs you could end up with two different program hash. For the same program. Therefore the bug.

I have no idea what is supposed to between 535 and 731 but it is not for us to look at.
I think the bug can be solved by replacing:
.take((program_end_pc - FELT_1).to_bigint().try_into()?)
by
.take_while(|(address, value)| address < program_end_pc )

@Okm165
Copy link
Contributor

Okm165 commented Dec 20, 2024

That is correct this is a bug for sure to be fixed
In Cairo0 output_ptr is behaving differently though.
The program hash is calculated correctly if Cairo0 python-vm is used.
Yeah I would love to have any spec or procedure to extract program and output in both cases (Cairo0 & Cairo1)
Ideally in one generic function so no flags from user side required

@tdelabro
Copy link
Contributor Author

After much internal discussion, it seems that changing the logic to use from program.begin_addr up to execution.begin_addr is the best solution.

There is also some ongoing discussion about the use or not of the bootloader.
From my understanding, your verifier code is only valid (especially the program hash computation) for programs proved with the bootloader

@tdelabro
Copy link
Contributor Author

Another issue is most likely there:

        ensure!(initial_pc == INITIAL_PC, PublicInputError::MaxSteps);

the main entry point is not guaranteed to be at address 0x1. It's program-dependent.
But your code does enforce it.
Is it always the case for the bootloader?

@Okm165
Copy link
Contributor

Okm165 commented Dec 24, 2024

Stark proof verification is valid for both cairo0 and cairo1 no matter bootloading.
The program hash computation is valid for cairo0 memory shape, it can come from proven cairo0 circuit or bootloader circuit (this is same coz bootloader is just cairo0 program), it can also come from proof of cairo1 bootloaded in cairo0.

The issue here is cairo1-run introduces slightly different shape of memory, so these checks characteristic for cairo0 memory are not applicable, so anything that comes straight from cairo1-run is gonna cause issue in this checks after stark proof validation.

@tdelabro
Copy link
Contributor Author

The way you compute the program hash only takes into account the "program" segment. I can use the same "program" in two different ways if I use a different entry point, so, in order to be truly unique, the program hash should take the main_entrypoint into account

@Okm165
Copy link
Contributor

Okm165 commented Dec 28, 2024

In the case of Cairo1 contract bootloader the entrypoint is hardcoded to be a main that corresponds to selector 0x00e2054f8a912367e38a22ce773328ff8aabf8082c4120bad9ef085e1dbf29a7. So there is no freedom of using arbitrary entry-point to the program in this case. Cairo0 bootloading also has deterministic entrypoint.

@tdelabro
Copy link
Contributor Author

tdelabro commented Jan 7, 2025

Yep, it works when using the bootloader. But, afaik, I am not required to use the bootloader.
The codebase seems to be built on the assumption that I'm using it, yet I wasn't and got no warning of any sort.

I could be fine with swiftness only supporting bootloaders-executed programs, but you have to specify it in the doc, and return an error if I'm not complying.

@Okm165
Copy link
Contributor

Okm165 commented Jan 21, 2025

Preparing update in docs that will mention how to prove Cairo1 with https://github.com/zksecurity/stone-cli and verify it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

When branches are created from issues, their pull requests are automatically linked.

2 participants