Commit 3063e4a

Crashdump on demand (#972)
* Adds a function to create a core dump on demand

  Signed-off-by: Simon Davies <[email protected]>

* Add gdb script and update docs

  Signed-off-by: Simon Davies <[email protected]>

---------

Signed-off-by: Simon Davies <[email protected]>
1 parent 97cf2c6 commit 3063e4a

3 files changed: +118 -4 lines changed

docs/how-to-debug-a-hyperlight-guest.md

Lines changed: 36 additions & 4 deletions
@@ -207,13 +207,12 @@ involved in the gdb debugging of a Hyperlight guest running inside a **KVM** or
207207
| └───────────────────────────────────────────────────────────────────────────────────────────────┘
208208
```
209209
210-
## Dumping the guest state to an ELF core dump when an unhandled crash occurs
210+
## Dumping the guest state to an ELF core dump
211211
212-
When a guest crashes because of an unknown VmExit or unhandled exception, the vCPU state is dumped to an `ELF` core dump file.
212+
When a guest crashes because of an unknown VmExit or unhandled exception, the vCPU state can optionally be dumped to an `ELF` core dump file.
213213
This can be used to inspect the state of the guest at the time of the crash.
214214
215-
To make Hyperlight dump the state of the vCPU (general purpose registers, registers) to an `ELF` core dump file, enable the `crashdump`
216-
feature and run.
215+
To make Hyperlight dump the state of the vCPU (general purpose registers, registers) to an `ELF` core dump file, enable the `crashdump` feature and run.
217216
The feature enables the creation of core dump files for both debug and release builds of Hyperlight hosts.
218217
By default, Hyperlight places the core dumps in the temporary directory (platform specific).
219218
To change this, use the `HYPERLIGHT_CORE_DUMP_DIR` environment variable to specify a directory.
@@ -227,6 +226,39 @@ To selectively disable this feature for a specific sandbox, you can set the `gue
227226
cfg.set_guest_core_dump(false); // Disable core dump for this sandbox
228227
```
229228

229+
## Creating a dump on demand
230+
231+
You can also create a core dump of the current state of the guest on demand by calling the `generate_crashdump` method on the `MultiUseSandbox` instance. This can be useful for debugging issues in the guest that do not cause crashes (e.g., a guest function that does not return).
232+
233+
This is only available when the `crashdump` feature is enabled and then only if the sandbox
234+
is also configured to allow core dumps (which is the default behavior).
235+
236+
### Example
237+
238+
Attach to your running process with gdb and call this function:
239+
240+
```shell
241+
sudo gdb -p <pid_of_your_process>
242+
(gdb) info threads
243+
# find the thread that is running the guest function you want to debug
244+
(gdb) thread <thread_number>
245+
# switch to the frame where you have access to your MultiUseSandbox instance
246+
(gdb) backtrace
247+
(gdb) frame <frame_number>
248+
# get the pointer to your MultiUseSandbox instance
249+
# Get the sandbox pointer
250+
(gdb) print sandbox
251+
# Call the crashdump function with the pointer
252+
# Call the crashdump function
253+
(gdb) call sandbox.generate_crashdump()
254+
```
255+
The crashdump should be available in `/tmp` or in the crash dump directory (see the `HYPERLIGHT_CORE_DUMP_DIR` env var). To make this process easier, you can also create a gdb script that automates these steps. You can find an example script [here](../scripts/dump_all_sandboxes.gdb). This script tries to generate a crashdump for every active thread except thread 1; it assumes that the variable `sandbox` exists in frame 15 on every thread. You can edit it to fit your needs. Then use it like this:
256+
257+
```shell
258+
(gdb) source scripts/dump_all_sandboxes.gdb
259+
(gdb) dump_all_sandboxes
260+
```
261+
230262
### Inspecting the core dump
231263

232264
After the core dump has been created, to inspect the state of the guest, load the core dump file using `gdb` or `lldb`.

scripts/dump_all_sandboxes.gdb

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
1+
define dump_all_sandboxes
2+
set pagination off
3+
4+
# Get the total number of threads
5+
info threads
6+
7+
# Loop through all threads (adjust max if you have more than 200 threads)
8+
set $thread_num = 2
9+
while $thread_num <= 200
10+
# Try to switch to this thread
11+
thread $thread_num
12+
13+
# Check if thread switch succeeded (GDB sets $_thread to current thread)
14+
if $_thread == $thread_num
15+
echo \n=== Thread
16+
p $thread_num
17+
echo ===\n
18+
19+
# Go to frame 15
20+
frame 15
21+
22+
23+
set $sb = &sandbox
24+
call sandbox.generate_crashdump()
25+
26+
set $thread_num = $thread_num + 1
27+
else
28+
# No more threads, exit loop
29+
set $thread_num = 201
30+
end
31+
end
32+
33+
echo \nDone dumping all sandboxes\n
34+
set pagination on
35+
end
36+
37+
document dump_all_sandboxes
38+
Dump crashdumps for sandboxes on all threads (except thread 1).
39+
Assumes sandbox is in frame 15 on each thread.
40+
Usage: dump_all_sandboxes
41+
end

src/hyperlight_host/src/sandbox/initialized_multi_use.rs

Lines changed: 41 additions & 0 deletions
@@ -475,6 +475,47 @@ impl MultiUseSandbox {
475475
pub fn interrupt_handle(&self) -> Arc<dyn InterruptHandle> {
476476
self.vm.interrupt_handle()
477477
}
478+
/// Generate a crash dump of the current state of the VM underlying this sandbox.
479+
///
480+
/// Creates an ELF core dump file that can be used for debugging. The dump
481+
/// captures the current state of the sandbox including registers, memory regions,
482+
/// and other execution context.
483+
///
484+
/// The location of the core dump file is determined by the `HYPERLIGHT_CORE_DUMP_DIR`
485+
/// environment variable. If not set, it defaults to the system's temporary directory.
486+
///
487+
/// This is only available when the `crashdump` feature is enabled and then only if the sandbox
488+
/// is also configured to allow core dumps (which is the default behavior).
489+
///
490+
/// This can be useful for generating a crash dump from gdb when trying to debug issues in the
491+
/// guest that don't cause crashes (e.g. a guest function that does not return).
492+
///
493+
/// # Examples
494+
///
495+
/// Attach to your running process with gdb and call this function:
496+
///
497+
/// ```shell
498+
/// sudo gdb -p <pid_of_your_process>
499+
/// (gdb) info threads
500+
/// # find the thread that is running the guest function you want to debug
501+
/// (gdb) thread <thread_number>
502+
/// # switch to the frame where you have access to your MultiUseSandbox instance
503+
/// (gdb) backtrace
504+
/// (gdb) frame <frame_number>
505+
/// # get the pointer to your MultiUseSandbox instance
506+
/// # Get the sandbox pointer
507+
/// (gdb) print sandbox
508+
/// # Call the crashdump function
509+
/// (gdb) call sandbox.generate_crashdump()
510+
/// ```
511+
/// The crashdump should be available in the crash dump directory (see the `HYPERLIGHT_CORE_DUMP_DIR` env var).
512+
///
513+
#[cfg(crashdump)]
514+
#[instrument(err(Debug), skip_all, parent = Span::current())]
515+
516+
pub fn generate_crashdump(&self) -> Result<()> {
517+
crate::hypervisor::crashdump::generate_crashdump(self.vm.as_ref())
518+
}
478519
}
479520

480521
impl Callable for MultiUseSandbox {
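
Note on the dump location: the doc comment on `generate_crashdump` says the file goes to the directory named by `HYPERLIGHT_CORE_DUMP_DIR`, falling back to the system temporary directory. The lookup could be sketched roughly as below. This is an illustration using only the standard library, not the code in `crate::hypervisor::crashdump`. The function names `resolve_core_dump_dir` and `core_dump_dir` are hypothetical, and treating an empty value as unset is an assumption, not documented behavior.

```rust
use std::env;
use std::ffi::OsString;
use std::path::PathBuf;

/// Hypothetical helper: choose the dump directory from an optional override.
/// Assumption: an unset or empty override falls back to the system temp dir.
fn resolve_core_dump_dir(override_dir: Option<OsString>) -> PathBuf {
    match override_dir {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => env::temp_dir(),
    }
}

/// Hypothetical wrapper: read the override from the process environment.
#[allow(dead_code)]
fn core_dump_dir() -> PathBuf {
    resolve_core_dump_dir(env::var_os("HYPERLIGHT_CORE_DUMP_DIR"))
}
```

Keeping the environment read separate from the fallback logic means the fallback can be checked without mutating process-wide state. When running the gdb steps above, it may help to confirm the variable is set in the environment of the attached process, since that is where the value would be read.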
