calloop event loop question #324

Open
LunNova opened this issue Feb 26, 2024 · 4 comments

Comments

@LunNova

LunNova commented Feb 26, 2024

Not completely confident I've understood this correctly but:

It looks like cosmic-comp creates a single calloop EventLoop and handles all rendering within it. calloop isn't multithreaded, so a single OS thread handles all events and is responsible for rendering, even when multiple output devices are used.

APIs in use within cosmic-comp mostly shouldn't block for very long, but they will still take some time to run, so something slow happening while compositing for one output can delay rendering for an unrelated output. And if I understand correctly, some copies that happen inside the blit_frame_result call in kms/mod.rs implicitly sync, waiting for the source app to be ready, so if an app does something slow it could hang that thread.

Is that correct?

@Drakulix
Member

Drakulix commented Feb 26, 2024

> APIs in use within cosmic-comp mostly shouldn't block for very long, but they will still take some time to run, so something slow happening while compositing for one output can delay rendering for an unrelated output.

Yes, this is correct. Long term, rendering needs to be split into its own threads (per-output) to not be bottlenecked.
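To make the bottleneck concrete, here is a minimal std-only sketch (not cosmic-comp or calloop code; `render_frame`, `render_sequential` and `render_per_output` are hypothetical names, and a sleep stands in for compositing work). It contrasts one loop that renders every output in turn with one thread per output.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for compositing one frame on one output.
fn render_frame(cost: Duration) {
    thread::sleep(cost);
}

/// Single-threaded model: every output is rendered in turn on one thread,
/// so a slow output delays all outputs that come after it.
fn render_sequential(costs: &[Duration]) -> Duration {
    let start = Instant::now();
    for &cost in costs {
        render_frame(cost);
    }
    start.elapsed()
}

/// Per-output model: one thread per output.
/// Returns, for each output, how long it took until its frame was done.
fn render_per_output(costs: &[Duration]) -> Vec<Duration> {
    let (tx, rx) = mpsc::channel();
    let start = Instant::now();

    let handles: Vec<_> = costs
        .iter()
        .copied()
        .enumerate()
        .map(|(output, cost)| {
            let tx = tx.clone();
            thread::spawn(move || {
                render_frame(cost);
                tx.send((output, start.elapsed())).unwrap();
            })
        })
        .collect();

    // Drop the original sender so the receiver loop below terminates
    // once every worker has sent its result.
    drop(tx);
    for handle in handles {
        handle.join().unwrap();
    }

    let mut done = vec![Duration::ZERO; costs.len()];
    for (output, elapsed) in rx {
        done[output] = elapsed;
    }
    done
}
```

With costs of, say, 80 ms and 5 ms, the sequential version needs at least 85 ms before both outputs have a frame, while in the per-output version the fast output finishes without waiting for the slow one. A real compositor would also have to share scene state between threads, which this sketch ignores.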

> And if I understand correctly, some copies that happen inside the blit_frame_result call in kms/mod.rs implicitly sync, waiting for the source app to be ready, so if an app does something slow it could hang that thread.

Not necessarily.

Firstly, we do not use client buffers that are not ready. We poll the dmabufs and delay commits for window contents that are still being rendered (if the driver supports it; afaik polling dmabufs from the nvidia driver will always return ready, and there is nothing we can do about that). So we should never block on unfinished client work.
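The idea of delaying commits can be sketched roughly as follows. This is a simplified model, not the smithay or cosmic-comp implementation: `Surface`, `PendingCommit` and the `ready` flag are hypothetical, with `ready` standing in for "polling the dmabuf reports it as readable".

```rust
use std::collections::VecDeque;

/// Hypothetical model of a client commit whose buffer may still be rendering.
struct PendingCommit {
    /// Identifier of the buffer attached by this commit.
    buffer: u32,
    /// Stand-in for the result of polling the dmabuf for readiness.
    ready: bool,
}

/// Hypothetical per-surface state kept by the compositor.
struct Surface {
    /// Buffer the compositor currently renders from.
    current: Option<u32>,
    /// Commits received from the client, oldest first.
    pending: VecDeque<PendingCommit>,
}

impl Surface {
    /// Applies queued commits in order, stopping at the first commit whose
    /// buffer is not ready yet. Nothing blocks: until a buffer is ready, the
    /// compositor simply keeps rendering from `current`.
    fn apply_ready_commits(&mut self) -> usize {
        let mut applied = 0;
        while self.pending.front().map_or(false, |commit| commit.ready) {
            let commit = self.pending.pop_front().unwrap();
            self.current = Some(commit.buffer);
            applied += 1;
        }
        applied
    }
}
```

If a surface has one ready and one unfinished commit queued, only the first is applied and the second stays pending until a later poll marks it ready. With a driver that always reports ready, every commit is applied immediately, so this scheme gives no protection in that case.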

Secondly, the wait necessary to perform that copy can be done in the GPU context if the API supports it. A CPU-bound wait is only introduced if the necessary EGL extensions aren't supported (nvidia...). Additionally, the buffer is submitted to KMS before the copy for screen capture happens, so it should never block scan-out.
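A minimal sketch of that decision, assuming the choice is driven by the space-separated EGL extension string. The names `WaitStrategy` and `choose_wait_strategy` are hypothetical, and the exact set of extensions checked here is an assumption for illustration, not necessarily what smithay requires.

```rust
/// How the compositor waits for rendering to finish before the copy.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WaitStrategy {
    /// The wait is queued on the GPU; the CPU thread keeps running.
    GpuSide,
    /// The calling thread blocks until the fence has signalled.
    CpuBound,
}

/// Picks a strategy from a space-separated EGL extension string.
/// Assumption: GPU-side waiting needs all three extensions listed below.
fn choose_wait_strategy(extensions: &str) -> WaitStrategy {
    let has = |name: &str| extensions.split_whitespace().any(|ext| ext == name);

    let gpu_wait_supported = has("EGL_KHR_fence_sync")
        && has("EGL_KHR_wait_sync")
        && has("EGL_ANDROID_native_fence_sync");

    if gpu_wait_supported {
        WaitStrategy::GpuSide
    } else {
        WaitStrategy::CpuBound
    }
}
```

Matching whole tokens rather than using a substring search avoids false positives when one extension name is a prefix of another.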

(It's a young compositor, so it has its issues - see threading - but it's not a naive one, and lots of optimization work has already been done.)

@LunNova
Author

LunNova commented Feb 26, 2024

Glad to hear that improving it is planned already and that it mostly shouldn't cause problems as things are non-blocking where possible.

Specifically for that nvidia case, I believe there's a fix involving EGL_ANDROID_native_fence_sync, which kwin implements, that could help. This was added in nvidia 545.

It's really exciting that this compositor is in Rust. This codebase feels a lot more approachable than kwin's does, and I'm tempted to start hacking on it and see if I can get it to feel as responsive as compositorless X does.

@Drakulix
Member

Drakulix commented Feb 26, 2024

> Specifically for that nvidia case, I believe there's a fix involving EGL_ANDROID_native_fence_sync, which kwin implements, that could help. This was added in nvidia 545.

This is the extension we are querying. Unfortunately, so far that only works for cases where the fence is local to the nvidia GPU. Synchronization across GPUs involving the nvidia driver isn't working (the sync files fail to import), and their KMS API also doesn't support fences yet, so we can't rely on this for scan-out. But all the necessary support code is there, so once the driver advertises the necessary capabilities (or the imports succeed), that should just work, like it does for mesa drivers today.
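The "try the import, otherwise fall back" pattern described here might look roughly like the sketch below. It is a generic illustration: `Synced` and `sync_across_gpus` are hypothetical names, and the closures stand in for the real fence import and CPU wait, which are not modelled.

```rust
/// Which path was taken to synchronize a cross-GPU copy.
#[derive(Debug, PartialEq, Eq)]
enum Synced {
    /// The sync file was imported, so the wait can happen on the target GPU.
    ImportedFence,
    /// The import failed, so the CPU waited for the source GPU instead.
    CpuWait,
}

/// Tries the fence import first and only falls back to a CPU wait when the
/// import fails. Once a driver starts accepting the import, the fast path is
/// taken without any change to the caller.
fn sync_across_gpus(
    import_fence: impl FnOnce() -> Result<(), String>,
    cpu_wait: impl FnOnce(),
) -> Synced {
    match import_fence() {
        Ok(()) => Synced::ImportedFence,
        Err(_) => {
            cpu_wait();
            Synced::CpuWait
        }
    }
}
```

The fallback closure only runs when the import returns an error, so drivers that support the import never pay for the CPU wait.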

> It's really exciting that this compositor is in Rust. This codebase feels a lot more approachable than kwin's does, and I'm tempted to start hacking on it and see if I can get it to feel as responsive as compositorless X does.

Glad to hear that!
With all the feature development still going on, optimizations are of course still happening, but with a focus on good performance and low-hanging fruit rather than perfect performance. Any contributions on that subject are very welcome, including to the underlying framework smithay, where, for example, work to reduce allocations in the main rendering path is happening right now: Smithay/smithay#1346.

@ryanabx
Contributor

ryanabx commented Feb 28, 2024

Jumping into this discussion to say: join the smithay matrix server! https://matrix.to/#/#smithay:matrix.org


3 participants