Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Everything that is bad about signal passing #709

Open
KenGordon opened this issue Aug 3, 2020 · 6 comments
Open

Everything that is bad about signal passing #709

KenGordon opened this issue Aug 3, 2020 · 6 comments
Assignees
Labels
p0 Blocking priority

Comments

@KenGordon
Copy link
Collaborator

Catch all for fixes to hopefully solve a whole bunch of actual bugs.

@github-actions github-actions bot added the needs-triage Bug does not yet have a priority assigned label Aug 3, 2020
@KenGordon KenGordon self-assigned this Aug 3, 2020
@davidchisnall
Copy link
Contributor

We run signals here: https://github.com/lsds/lkl/blob/86819ac5446e1fa31ca67db68aa165bbf07b90a9/arch/lkl/kernel/syscalls.c#L234

This does a loop here, popping signals off the pending list and delivering them:

https://github.com/lsds/lkl/blob/86819ac5446e1fa31ca67db68aa165bbf07b90a9/arch/lkl/kernel/signal.c#L68

There are a few things wrong here:

  • Signals are not delivered to the right thread, they are delivered to whichever thread returns from a syscall.
  • When a syscall returns early as a result of a signal being delivered, the signal delivery code does not set the correct return value.
  • When a signal is delivered, it does not wake up pending tasks.

I think the correct fix is to:

  1. Iterate through pending signals.
  2. Associate them with the task structs for the correct Linux threads.
  3. Wake up any of the tasks that are currently in the sleeping state.
  4. Run the scheduler to ensure that they actually wake.
  5. Handle only the signals intended for the current thread on syscall return.

We probably need to do steps 1-4 in the idle task loop so that we wake up threads with pending signals even if no threads issue syscalls:

https://github.com/lsds/lkl/blob/86819ac5446e1fa31ca67db68aa165bbf07b90a9/arch/lkl/kernel/syscalls.c#L277

@davidchisnall davidchisnall added p0 Blocking priority and removed needs-triage Bug does not yet have a priority assigned labels Aug 3, 2020
@prp
Copy link
Member

prp commented Aug 3, 2020

Our code for injecting signals into LKL looks quite different from that of other Linux architectures:

@davidchisnall
Copy link
Contributor

Note that for synchronous signals, we switch to a host task explicitly

This is the correct thing to do for sync signals because they are delivered to an lthread, but the Linux scheduler may be pointing at the wrong thread. This won't work with async signals because they'd run in the correct host task but the wrong lthread.

@KenGordon
Copy link
Collaborator Author

KenGordon commented Aug 7, 2020

This is a catch all issue for things to do with signals.

See also

#680 Incomplete/unsafe signal handling with SGX1
#644 Signals not delivered to correct thread
#700 [TEST]: Signal mask is unset for the signal during its handler execution (maybe a dup of 644)
#518 [Tests] Issues with signal delivery using tgkill() to threads
#209 Async signals aren't always delivered as expected

@hukoyu
Copy link
Collaborator

hukoyu commented Aug 20, 2020

Enable busybox ping network test when this bug is fixed. #809
cc @KenGordon @SeanTAllen

@KenGordon KenGordon linked a pull request Aug 25, 2020 that will close this issue
@KenGordon
Copy link
Collaborator Author

See also #838 for the futex issue that prevents tgkill01 and rt_sigqueueinfo tests passing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
p0 Blocking priority
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants