Add crashdump example and include snapshot/scratch in core dumps#1264

Draft
jsturtevant wants to merge 1 commit into hyperlight-dev:main from jsturtevant:crashdump

@jsturtevant

Core dumps generated by Hyperlight were missing the snapshot and scratch memory regions, making post-mortem debugging with GDB incomplete — register state was present but the guest's code, stack, heap, and page tables were absent. This adds the snapshot and scratch regions to the ELF core dump alongside any dynamically mapped regions so that GDB can show full backtraces, disassemble at the crash site, and inspect guest memory. A new runnable crashdump example demonstrates automatic dumps (VM-level faults), on-demand dumps (guest-caught exceptions), and per-sandbox opt-out, with GDB-based integration tests that validate register and memory content in the generated ELF files. The debugging docs are also updated with practical GDB commands for inspecting crash dumps.
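As a rough illustration of the kind of check an integration test could make before handing a dump to GDB, the sketch below verifies that a byte buffer starts with an ELF header of type `ET_CORE`. The function name `is_elf_core` is made up for this note and is not part of the PR; only the ELF constants (magic bytes, `EI_DATA` at byte 5, `e_type` at offset 16, `ET_CORE = 4`) come from the ELF specification.

```rust
/// The four magic bytes that open every ELF file.
const ELF_MAGIC: [u8; 4] = [0x7f, b'E', b'L', b'F'];
/// `e_type` value that marks a core file in the ELF specification.
const ET_CORE: u16 = 4;
/// `EI_DATA` value for little-endian encoding (ELFDATA2LSB).
const ELFDATA2LSB: u8 = 1;

/// Hypothetical helper (not from the PR): returns true if `bytes` begins
/// with a little-endian ELF header whose type is ET_CORE.
pub fn is_elf_core(bytes: &[u8]) -> bool {
    // The identification block is 16 bytes; `e_type` is the next 2 bytes.
    if bytes.len() < 18 || !bytes.starts_with(&ELF_MAGIC) {
        return false;
    }
    // Only little-endian headers are handled in this sketch.
    if bytes[5] != ELFDATA2LSB {
        return false;
    }
    u16::from_le_bytes([bytes[16], bytes[17]]) == ET_CORE
}
```

A test could read the generated dump with `std::fs::read` and assert `is_elf_core(&bytes)` before running any GDB-based checks on registers or memory.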

Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
@@ -0,0 +1,599 @@
/*
Copyright 2025 The Hyperlight Authors.

Unrelated question: I remember from training I took a long time ago that when we add a new file, the year in its header should reflect that. Also, if the file is modified in a different year, a new header with the new year should be applied.

Is that true here also? I am curious how we should approach these scenarios.

@devigned (Feb 27, 2026)

The way I've done this in the past is to have a linter in the CI that would verify the header on files in the PR. Example inherited from long ago.

The date doesn't need to be updated. It can stay the creation date for the file. Some folks will opt for a date range $creation_year - $most_recent_update_year. For example, check out https://github.com/kubernetes/kubernetes/blob/07a1af766fd54f1f495a854ddf3e5227241fb961/pkg/api/node/util.go in the K/K repo.
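To make the linter idea concrete, here is a minimal sketch of a header check that accepts either a single creation year or a `$creation_year - $most_recent_update_year` range. It is an assumption about what such a CI check might look like, not an existing tool in this repo; the function names and the five-line search window are invented for illustration, and only the header text itself is taken from the file under review.

```rust
/// Hypothetical CI helper: true if one of the first few lines is
/// "Copyright <year> The Hyperlight Authors." or uses a year range.
pub fn has_valid_header(source: &str) -> bool {
    // Assumption: the notice sits within the first five lines of the file.
    source.lines().take(5).any(is_copyright_line)
}

/// True for exactly four ASCII digits, e.g. "2025".
fn is_year(s: &str) -> bool {
    s.len() == 4 && s.bytes().all(|b| b.is_ascii_digit())
}

fn is_copyright_line(line: &str) -> bool {
    let Some(rest) = line.trim().strip_prefix("Copyright ") else {
        return false;
    };
    let Some(years) = rest.strip_suffix(" The Hyperlight Authors.") else {
        return false;
    };
    match years.split_once('-') {
        // Range form: both ends must be years, and the range must not run backwards.
        Some((start, end)) => {
            let (start, end) = (start.trim(), end.trim());
            is_year(start) && is_year(end) && start <= end
        }
        // Single-year form: the creation year alone is enough.
        None => is_year(years),
    }
}
```

With this shape, a file created in 2025 passes unchanged in later years, which matches the point that the date does not need to be updated.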

println!("=== Hyperlight Crash Dump Example ===\n");

// -----------------------------------------------------------------------
// Part 1: Automatic crash dump (VM-level fault bypasses guest handler)

nit: Maybe "guest-caused crash dump" or something like that would be more descriptive than "automatic"?

let mut regions: Vec<MemoryRegion> = Vec::new();

// Snapshot region: contains guest code, read-only data, page tables, etc.
if let Some(snapshot) = &self.snapshot_memory {

I'm not sure about this. I'll try running it locally and see how it looks.
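For anyone following along without the full diff, the collection step under discussion can be sketched roughly as below. This is a simplified stand-in, not the PR's code: `Region` and `collect_dump_regions` are hypothetical names, and the real `MemoryRegion` type in Hyperlight carries more information than shown here.

```rust
/// Hypothetical stand-in for a guest memory region; this is NOT
/// Hyperlight's `MemoryRegion`, just enough fields for the sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Region {
    pub name: &'static str,
    pub guest_start: u64,
    pub len: u64,
}

/// Gathers the regions a core dump would cover: the snapshot and scratch
/// regions when present, followed by any dynamically mapped regions.
pub fn collect_dump_regions(
    snapshot: Option<Region>,
    scratch: Option<Region>,
    mapped: &[Region],
) -> Vec<Region> {
    let mut regions: Vec<Region> = Vec::new();
    // Snapshot region: guest code, read-only data, page tables, etc.
    if let Some(snapshot) = snapshot {
        regions.push(snapshot);
    }
    // Scratch region: included only if the sandbox has one.
    if let Some(scratch) = scratch {
        regions.push(scratch);
    }
    // Dynamically mapped regions go alongside the two above.
    regions.extend(mapped.iter().cloned());
    regions
}
```

The point of the sketch is only the ordering and the optional handling: a missing snapshot or scratch region is skipped rather than treated as an error.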
