Wasmtime c-api wasi questions #10162

Open
mihaly-sisak opened this issue Jan 30, 2025 · 5 comments

Comments

@mihaly-sisak

I started to look around the WASI parts of the project and came to the realization that there is little customization of I/O when using the Rust WASI c-api.

So I started looking into how I could implement a small part of the proposed WASI interfaces.
I found this proposal: https://github.com/WebAssembly/wasi-cli/blob/main/command.md
I have a ton of questions.

  • wasi:io/error: Is this a special object created with wasmtime_error_new()?
  • wasi:io/poll: All defined resources (like pollable) are host-managed wasm_val_t.of.ref host pointers?
  • How can I detect that a variable got dropped out of scope? Does gc-drc collect inside the store? The docs say a store has no garbage collection. What does gc-drc collect?
  • wasi:io/poll: How to get a list<pollable> of objects? If I call the poll() func with a list of 3 pollable resources what do I get? Are they just a list of params? Will wasm_func_param_arity() return 3? Is it a table?
  • wasi:io/streams: How to return a result<>? Same question as the list<> argument but in reverse. How to return list<u8>, if the datatypes available are i32, u32, i64, u64? Is it packed?
  • wasi:io/streams: With write() what is the return _? What to return here?
  • wasi:io/streams: At the end of blocking-write-and-flush() do I return the result of check-write()?
  • wasi:cli/environment: How to return a list<(string, string)>? Is it a list<tuple<string, string>>? How are strings handled?
@pchickey
Contributor

pchickey commented Jan 30, 2025

Hi, that's correct, the C API only exports a very limited set of ways to configure a WASI 0.1 (aka Preview 1) implementation.

However, WASI has moved on to version 0.2, and with it WASI is no longer based on bare Wasm modules but on the Wasm Component Model proposal, which introduces a type system and an interface definition language called wit. The wasi-cli repo link you refer to is the wit rendered as markdown. The ABIs for WASI 0.1 and 0.2 are completely different, and programs that are built to run on WASI 0.2 will not run on an implementation that only provides WASI 0.1. (There is a way to transform modules written against WASI 0.1 into components that use WASI 0.2.) The wasmtime-wasi crate provides both a 0.1 and a 0.2 implementation, but only the 0.1 implementation is exposed when using the C API.

Unfortunately, the reason it's not obvious how to use any of those wit-based types with Wasmtime's C API is that the C API does not yet have support for the component model. This is a long-standing bug that remains open because it is a big undertaking, and none of the core contributors who build and maintain Wasmtime as part of their jobs have been able to prioritize it, since our production use cases all use Wasmtime through its native Rust API.

So, with that context, my suggestion is to take a look at using Wasmtime's Rust interfaces directly, where you have far more options if you need a custom WASI implementation. For example, you could fork and modify the wasmtime-wasi crate, or you could write a completely different implementation in Rust on top of the bindings created by wasmtime::component::bindgen!, such as in the example here. Alternatively, you can take on the C API issue yourself, which as warned above is also a pretty big undertaking.

@mihaly-sisak
Author

Okay, this seems more complicated than I anticipated. Sadly, I do not know Rust. I like the idea of lifetimes and the borrow checker, and after some initial misunderstandings I like cargo.

https://github.com/WebAssembly/WASI/blob/main/legacy/preview1/docs.md
So this is the official preview1 as far as I can tell. However, I feel like I can't even re-implement this, since there are Pointer<u8>s in there and I have no way of accessing them. Can I get strings from the wasm environment to the host and vice versa with the current c-api? Is the pointer just an offset into wasmtime_memory_data()? Are the u8s packed with a \0 at the end?

https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/core/iwasm/libraries/libc-builtin/libc_builtin_wrapper.c#L1044
I found this. Just to make sure: this is not an official way of doing things, it is just providing a stripped down libc to the wasm environment?

@alexcrichton
Member

@mihaly-sisak it looks like you might be diving in the deep end a bit by implementing WASIp1 from scratch. While certainly possible, you're probably not going to get too much support in doing so, as WASIp1 is largely "on the way out" in favor of WASIp2. WASIp1 represents a snapshot in the development of WASI, and like all snapshots of points in time it's not perfect, in the sense that it's not 100% documented with intros/tutorials/examples/etc. If you're going to implement WASIp1 from scratch you'll need to do a lot of research on your own.

Wasmtime does not support non-standard and ad-hoc extensions to wasm like WAMR does. Wasmtime does not support creating de facto standards that "just happen to work for now" and proliferating them. While you are well within your rights to build your own libc imports in your host so the guest doesn't have to have its own libc, that's something you'll be making as an embedder and not something we'll be building in.

@mihaly-sisak
Author

mihaly-sisak commented Jan 31, 2025

Thank you for taking the time to answer my questions and helping me tumble through Cargo, Rust and WASM.

I think implementing the whole proposal would be a little more than I could chew. Luckily I only need stdin/stdout, only in a very limited way. No files, no sockets. That little part I feel could be possible.

I feel like the only thing standing in my way is my ability or inability to pass strings from-to the wasm sandbox.
Looks like wasm->host is just taking the pointer, which will be an offset to the module memory.
What I am unsure about is the host->wasm direction, as I would need to allocate memory inside wasm and pass its offset. I am not sure about how to achieve that.

I would love to leverage Cranelift's excellent execution speed. I really like WASM, and the best thing is just how many input languages it supports already. I understand that could only work if there are standards and no ad-hoc additions.

Edit: #4372 It seems like stdin/stdout feels easy but is actually difficult to get right.

@alexcrichton
Member

Yes, what you're diving into here is sort of the deep end of trying to implement all of this. What you're struggling with are really low-level details of how wasm is implemented and how WASI works today. Basically the purpose of WASIp2 and the component model is that you don't have to deal with questions like this (you're not alone, everyone is confused by this).

The main thing to understand is that there's not really much magic involved here. For example, strings are "just" a wasm pointer (plus maybe a length), and a wasm pointer is an offset from the base of linear memory. That means to interpret a wasm pointer you'll need to use, for example, wasmtime_memory_data (after bounds-checks with wasmtime_memory_data_size). The wasm pointer is then an offset from that base into a contiguous chunk of bytes. I would very much caution you to take care doing this in C, as this is where vulnerabilities and segfaults are very easy to write. If you do this all yourself you're effectively going to be writing your own syscall layer, which is very error-prone, and mistakes are often exploitable.
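For illustration, the bounds check described above could be sketched roughly like this. It is only a sketch under assumptions: `guest_slice` and `guest_string_copy` are hypothetical helper names, `base` and `size` stand in for whatever wasmtime_memory_data and wasmtime_memory_data_size return for the instance's memory, and the guest is assumed to pass strings as a pointer plus an explicit length with no trailing \0.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: turn a guest (pointer, length) pair into a host pointer.
 * `base`/`size` are assumed to come from wasmtime_memory_data() and
 * wasmtime_memory_data_size(). Returns NULL if the range is not fully
 * inside linear memory. */
uint8_t *guest_slice(uint8_t *base, size_t size, uint32_t ptr, uint32_t len) {
    /* Compare against size - ptr instead of computing ptr + len,
     * so a huge guest value cannot wrap around and pass the check. */
    if ((size_t)ptr > size || (size_t)len > size - (size_t)ptr) {
        return NULL;
    }
    return base + ptr;
}

/* Hypothetical helper: copy a guest string into a NUL-terminated host buffer.
 * Assumes the guest string is (pointer, length) and NOT NUL-terminated.
 * Returns 0 on success, -1 if the range is out of bounds or `out` is too small. */
int guest_string_copy(uint8_t *base, size_t size, uint32_t ptr, uint32_t len,
                      char *out, size_t out_cap) {
    const uint8_t *src = guest_slice(base, size, ptr, len);
    if (src == NULL || out_cap == 0 || (size_t)len >= out_cap) {
        return -1;
    }
    memcpy(out, src, len);
    out[len] = '\0'; /* terminator is added on the host side only */
    return 0;
}
```

Because linear memory can move when it grows, the base pointer and size should be fetched again after any call that might run guest code, rather than cached across calls.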

The other thing is that there's no way to allocate memory in the guest from the host. That's just not something that's inherently part of wasm that Wasmtime could magically do for example. Instead what you'll need to do is have the module export a "malloc" function and then you call that to get a pointer.
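A rough sketch of that host-to-guest flow, under stated assumptions: the two callback types below are hypothetical stand-ins for "call the module's exported allocator" (in a real embedding, something built on wasmtime_func_call) and "re-read the memory base and size". The `toy_guest` bump allocator exists only so the sketch compiles and runs without Wasmtime, and treating a returned offset of 0 as failure is an assumption about the guest's allocator.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: calls the guest's exported "malloc"; returns a guest offset. */
typedef uint32_t (*guest_alloc_fn)(void *ctx, uint32_t len);

/* Hypothetical: current base pointer and byte size of the guest memory. */
typedef struct {
    uint8_t *base;
    size_t size;
} guest_memory_view;
typedef guest_memory_view (*memory_view_fn)(void *ctx);

/* Copy host string `s` (without its NUL) into guest-allocated memory.
 * On success returns 0 and reports the guest pointer and length. */
int copy_string_to_guest(void *ctx, guest_alloc_fn alloc, memory_view_fn view,
                         const char *s, uint32_t *out_ptr, uint32_t *out_len) {
    size_t n = strlen(s);
    if (n > UINT32_MAX) return -1;
    uint32_t ptr = alloc(ctx, (uint32_t)n);
    if (ptr == 0 && n != 0) return -1; /* assumption: 0 means allocation failed */
    /* The allocation may have grown (and moved) memory, so look it up now. */
    guest_memory_view m = view(ctx);
    /* The returned pointer is guest-controlled: bounds-check before writing. */
    if ((size_t)ptr > m.size || n > m.size - (size_t)ptr) return -1;
    memcpy(m.base + ptr, s, n);
    *out_ptr = ptr;
    *out_len = (uint32_t)n;
    return 0;
}

/* Toy guest used only to exercise the sketch: fixed memory, bump allocator. */
typedef struct {
    uint8_t bytes[64];
    uint32_t next; /* next free offset; start it above 0 so 0 can mean failure */
} toy_guest;

uint32_t toy_alloc(void *ctx, uint32_t len) {
    toy_guest *g = ctx;
    if (len > sizeof g->bytes - g->next) return 0;
    uint32_t p = g->next;
    g->next += len;
    return p;
}

guest_memory_view toy_view(void *ctx) {
    toy_guest *g = ctx;
    guest_memory_view v = { g->bytes, sizeof g->bytes };
    return v;
}
```

The resulting (pointer, length) pair is what the host would then pass to a guest function as two i32 arguments. Freeing the buffer would similarly need an exported "free" or an agreed ownership rule, which this sketch leaves out.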

Again, I'll really caution quite a lot about all of this. This is a custom syscall layer which deals with really low-level details of how wasm<->host communication works. This is very error-prone code and even on the best of days it's difficult to get right. At the same time, I'll definitely acknowledge that WASIp2/components aren't supported in the C API at this time, and that part definitely isn't helping, alas.
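To give a feel for what one entry in such a syscall layer involves, here is a hedged sketch of the core of a stdout-only preview1 fd_write. In preview1 the import takes four i32 values (fd, pointer to an array of ciovec, number of entries, pointer for the byte count) and returns an errno; each ciovec is 8 bytes, a little-endian buffer pointer followed by a length. `sketch_fd_write` is a hypothetical name, `base`/`size` again stand in for the memory pointer and size from the C API, and wiring this into an actual host function callback is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Preview 1 errno values used by this sketch. */
enum { ERRNO_SUCCESS = 0, ERRNO_BADF = 8, ERRNO_FAULT = 21,
       ERRNO_INVAL = 28, ERRNO_IO = 29 };

/* True if [ptr, ptr + len) lies inside a linear memory of `size` bytes. */
static int in_bounds(size_t size, uint32_t ptr, uint32_t len) {
    return (size_t)ptr <= size && (size_t)len <= size - (size_t)ptr;
}

/* Bounds-checked little-endian u32 load from guest memory. */
static int load_u32(const uint8_t *base, size_t size, uint32_t ptr, uint32_t *out) {
    if (!in_bounds(size, ptr, 4)) return -1;
    *out = (uint32_t)base[ptr] | ((uint32_t)base[ptr + 1] << 8) |
           ((uint32_t)base[ptr + 2] << 16) | ((uint32_t)base[ptr + 3] << 24);
    return 0;
}

/* Bounds-checked little-endian u32 store into guest memory. */
static int store_u32(uint8_t *base, size_t size, uint32_t ptr, uint32_t v) {
    if (!in_bounds(size, ptr, 4)) return -1;
    for (int i = 0; i < 4; i++) base[ptr + i] = (uint8_t)(v >> (8 * i));
    return 0;
}

/* Hypothetical core of fd_write: only fd 1 (stdout) is accepted, and bytes
 * go to `sink`. Returns a preview1 errno. */
uint32_t sketch_fd_write(uint8_t *base, size_t size, FILE *sink, uint32_t fd,
                         uint32_t iovs_ptr, uint32_t iovs_len,
                         uint32_t nwritten_ptr) {
    if (fd != 1) return ERRNO_BADF;
    uint32_t total = 0;
    for (uint32_t i = 0; i < iovs_len; i++) {
        /* Entry i of the ciovec array; 64-bit math so the offset cannot wrap. */
        uint64_t entry = (uint64_t)iovs_ptr + (uint64_t)i * 8u;
        if (entry > UINT32_MAX - 8u) return ERRNO_FAULT;
        uint32_t buf, len;
        if (load_u32(base, size, (uint32_t)entry, &buf) != 0 ||
            load_u32(base, size, (uint32_t)entry + 4u, &len) != 0) {
            return ERRNO_FAULT;
        }
        if (!in_bounds(size, buf, len)) return ERRNO_FAULT;
        if (len > UINT32_MAX - total) return ERRNO_INVAL;
        if (fwrite(base + buf, 1, len, sink) != len) return ERRNO_IO;
        total += len;
    }
    /* Report the number of bytes written back into guest memory. */
    if (store_u32(base, size, nwritten_ptr, total) != 0) return ERRNO_FAULT;
    return ERRNO_SUCCESS;
}
```

Even this small case needs four separate bounds checks and two overflow checks, which is the kind of detail the caution above is about.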


3 participants