feat(ci): add release-vm-dev pipeline and install-vm.sh installer #788

Open
drew wants to merge 18 commits into main from dnewberry/os-48-create-a-vm-release-channel-that-includes-kernel-builds-and

drew (Collaborator) commented Apr 9, 2026

Summary

Fix the release-vm-dev CI pipeline and add a one-liner install script for openshell-vm binaries.

Related Issue

Closes #48

Changes

CI pipeline fixes (release-vm-dev.yml)

  • Add CI container to download-kernel-runtime job so gh CLI is available (bare runner didn't have it)
  • Fall back to plain cargo build in build-rootfs.sh when cargo-zigbuild is unavailable (works for native arch builds in CI)
  • Install zstd in build-rootfs, build-vm-linux, and build-vm-macos jobs (not present in CI container image)
  • Add zstd to Dockerfile.ci for future image builds

Kernel build fixes (release-vm-kernel.yml)

  • Build kernel once on Linux, reuse kernel.c on macOS via artifact sharing
  • Fix kernel-dir resolution, build dependency installation, and macOS LLVM/sccache conflicts

Install script (install-vm.sh)

Release notes

  • Both release-vm-dev.yml and release-vm-kernel.yml include quick install snippet in release body

Testing

  • release-vm-dev pipeline passes end-to-end (run 24177054398)
  • All 8 jobs green: Compute Versions, Download Kernel Runtime, Build Rootfs (amd64/arm64), Build VM (Linux amd64/arm64, macOS), Release VM Dev
  • E2E tests added/updated (if applicable)

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

drew added 14 commits April 8, 2026 22:06
Add packages:read permission so Linux container jobs can pull the CI
image from GHCR, and reorder brew tap before brew install so the
libkrunfw and krunvm formulae are discoverable on macOS.

Eliminate the krunvm/Fedora VM dependency from the macOS CI job by
building the aarch64 Linux kernel only on the Linux ARM64 runner and
exporting kernel.c as a CI artifact. The macOS job downloads kernel.c
and compiles it into libkrunfw.dylib with Apple's cc, cutting macOS
CI from ~45 min to ~5 min.

Also fixes:
- sudo not found in CI containers (use conditional SUDO)
- Hardcoded ARCH=arm64 in build-libkrun.sh (now auto-detects)
- Missing packages:read permission for GHCR pulls
- brew tap ordering for Homebrew formula resolution

Removes build-custom-libkrunfw.sh (no longer needed).
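
The conditional SUDO and architecture auto-detection fixes listed above could look roughly like the sketch below. The helper names `pick_sudo` and `detect_arch` and the exact ARCH values are assumptions made for illustration; the real logic in build-libkrun.sh is not shown in this PR description.

```shell
# Hypothetical helpers; names and mappings are illustrative only.

# Print "sudo" only when we are not root and sudo is installed.
# CI containers often run as root with no sudo binary on PATH.
pick_sudo() {
    if [ "$(id -u)" -ne 0 ] && command -v sudo >/dev/null 2>&1; then
        printf 'sudo\n'
    else
        printf '\n'
    fi
}

# Map a machine name (default: `uname -m`) to a kernel-style ARCH value
# instead of hardcoding ARCH=arm64.
detect_arch() {
    machine=${1:-$(uname -m)}
    case "$machine" in
        x86_64|amd64)  printf 'x86_64\n' ;;
        aarch64|arm64) printf 'arm64\n' ;;
        *)
            printf 'unsupported architecture: %s\n' "$machine" >&2
            return 1
            ;;
    esac
}
```

A caller would set `SUDO=$(pick_sudo)` and then run commands as `$SUDO apt-get install ...`, leaving `$SUDO` unquoted so that an empty value expands to nothing.
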
- Add cpio to build dependencies (required by CONFIG_IKHEADERS)
- Disable CONFIG_IKHEADERS in kconfig fragment (not needed in VM)
- Add pip fallback for pyelftools when apt package isn't importable
- Add python3-pip to apt dependencies

Both are needed by package-vm-runtime.sh for tarball compression
and provenance metadata generation.

The macOS build script changes directory to the build dir before
compiling kernel.c, so relative paths passed via --kernel-dir
would fail to resolve.
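
One common way to avoid this class of bug is to resolve the directory to an absolute path before any `cd` happens. The sketch below shows that idea; `resolve_dir` is a hypothetical helper name, not necessarily what the macOS build script uses.

```shell
# Hypothetical helper: print the absolute, symlink-free path of a directory,
# or fail if it does not exist. The subshell keeps the caller's cwd unchanged.
resolve_dir() {
    ( CDPATH= cd -- "$1" 2>/dev/null && pwd -P )
}
```

A script would call this once while parsing `--kernel-dir` (for example `KERNEL_DIR=$(resolve_dir "$2") || exit 1`), so later directory changes no longer affect how the path resolves.
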
Homebrew's rust package links against Homebrew's llvm, which conflicts
with the lld package's LLVM version. Use mise to install a standalone
Rust toolchain (via rustup) that ships its own LLVM.

Use rustup instead of mise or Homebrew for Rust to avoid LLVM
conflicts. Set RUSTC_WRAPPER='' to disable sccache which is not
available on the macOS runner.

- Add CI container to download-kernel-runtime job so gh CLI is available
  (bare build-amd64 runner does not have gh installed)
- Fall back to plain cargo build in build-rootfs.sh when cargo-zigbuild
  is not available (works for native builds in CI where arch matches)
- Install zstd inline in build-rootfs workflow step (immediate fix)
- Add zstd to CI Dockerfile for future image builds
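
The cargo-zigbuild fallback described above can be sketched as a small selection helper. This is an assumption about the shape of the change in build-rootfs.sh, and `cargo_build_subcommand` is a hypothetical name.

```shell
# Hypothetical helper: print the cargo subcommand to use.
# "zigbuild" when the cargo-zigbuild binary is on PATH, otherwise "build".
# Plain "build" is only suitable when the host arch matches the target.
cargo_build_subcommand() {
    if command -v cargo-zigbuild >/dev/null 2>&1; then
        printf 'zigbuild\n'
    else
        printf 'build\n'
    fi
}
```

The build step would then run something like `cargo "$(cargo_build_subcommand)" --release`, with any target flags left to the real script.
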
The CI container image does not include zstd, which is needed to
decompress kernel runtime tarballs and re-compress individual files
for embedding into the openshell-vm binary.

Adds a POSIX sh install script (modeled after install.sh) that detects
the platform, downloads the correct binary from the vm-dev release,
verifies checksums, and codesigns on macOS automatically. Works when
piped from curl into any shell (bash, zsh, fish, etc.).

Updates both release-vm-dev and release-vm-kernel workflow bodies to
include the quick install snippet.

- Add redirect origin validation (MITM defense, ref issue #638)
- Add resolve_redirect() helper matching install.sh
- Add warn() helper
- Fix checksum tool preference order (shasum first, matching macOS default)
- Use caller's tmpdir for entitlements plist (cleaned by trap)
- Add version probe after install
- Add full PATH guidance with shell config file hints
- Add examples to --help output
Add custom install dir example and description of what the installer
does (platform detection, checksum verification, macOS codesign).
drew self-assigned this Apr 9, 2026
drew requested a review from a team as a code owner April 9, 2026 07:33
copy-pr-bot bot commented Apr 9, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

drew added 4 commits April 9, 2026 00:38
GitHub now redirects release asset downloads to this domain in
addition to objects.githubusercontent.com.
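
Redirect origin validation of this kind usually reduces to extracting the host from the final URL and comparing it against an allow-list. The sketch below illustrates that under stated assumptions: `url_host` and `is_allowed_origin` are hypothetical names, and the allow-list is passed in by the caller because the newly added domain is not named in this commit body.

```shell
# Hypothetical helpers; the real install-vm.sh may structure this differently.

# Print the host part of an https URL; reject any other scheme.
url_host() {
    case "$1" in
        https://*) rest=${1#https://} ;;
        *) return 1 ;;
    esac
    host=${rest%%[/?#]*}   # cut at the first path, query, or fragment marker
    host=${host##*@}       # drop userinfo so "trusted@evil" resolves to "evil"
    host=${host%%:*}       # drop an explicit port
    printf '%s\n' "$host"
}

# Usage: is_allowed_origin URL HOST...
# Succeeds only when the URL's host exactly matches one of the given hosts.
is_allowed_origin() {
    url=$1
    shift
    host=$(url_host "$url") || return 1
    for allowed in "$@"; do
        if [ "$host" = "$allowed" ]; then
            return 0
        fi
    done
    return 1
}
```

Exact host matching (rather than substring or suffix matching) is the conservative choice here, since it rejects look-alike hosts and URLs that merely contain a trusted name in the path or userinfo.
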
libkrun.dylib references libkrunfw via @loader_path/libkrunfw.dylib
(unversioned name set by build-libkrun-macos.sh), but the embedded
runtime only extracts libkrunfw.5.dylib (versioned). Create an
unversioned symlink so dyld can resolve the dependency.
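
In shell terms, the symlink fix amounts to something like the sketch below. The actual change presumably lives in the runtime extraction code rather than a script, and `link_unversioned` is a hypothetical name.

```shell
# Hypothetical helper: inside runtime_dir, create (or replace) a relative
# symlink named `unversioned` that points at the existing file `versioned`.
link_unversioned() {
    runtime_dir=$1
    versioned=$2
    unversioned=$3
    [ -e "$runtime_dir/$versioned" ] || return 1
    ln -sf "$versioned" "$runtime_dir/$unversioned"
}
```

For example, `link_unversioned "$RUNTIME_DIR" libkrunfw.5.dylib libkrunfw.dylib` would let the `@loader_path/libkrunfw.dylib` reference resolve next to libkrun.dylib. A relative link target keeps the runtime directory relocatable.
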
libkrun internally does dlopen("libkrunfw.5.dylib") with a bare name.
On macOS, dyld ignores DYLD_FALLBACK_LIBRARY_PATH set after process
start, and the RTLD_GLOBAL preload doesn't help because dyld's dlopen
search doesn't match against install names of already-loaded libraries.

Fix by re-execing the binary early in main() with DYLD_LIBRARY_PATH
set to the runtime directory, so the dynamic linker can find
libkrunfw.5.dylib when libkrun requests it.
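
The re-exec pattern needs a guard so the process does not loop forever: re-exec only when the runtime directory is not already on DYLD_LIBRARY_PATH. The commit describes doing this in the binary's main(); the following is only a shell analogue of the idea, with hypothetical helper names.

```shell
# Hypothetical helpers illustrating the guard; not the actual implementation.

# Succeed when dir is already an entry of the colon-separated list.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the search path with runtime_dir prepended, unless already present.
with_runtime_dir() {
    current=$1
    runtime_dir=$2
    if [ -z "$current" ]; then
        printf '%s\n' "$runtime_dir"
    elif path_contains "$current" "$runtime_dir"; then
        printf '%s\n' "$current"
    else
        printf '%s:%s\n' "$runtime_dir" "$current"
    fi
}

# Usage: reexec_if_needed RUNTIME_DIR PROGRAM [ARGS...]
# Returns immediately when the variable is already set up; otherwise
# replaces the current process with PROGRAM under the updated environment.
reexec_if_needed() {
    runtime_dir=$1
    shift
    if path_contains "${DYLD_LIBRARY_PATH:-}" "$runtime_dir"; then
        return 0
    fi
    DYLD_LIBRARY_PATH=$(with_runtime_dir "${DYLD_LIBRARY_PATH:-}" "$runtime_dir")
    export DYLD_LIBRARY_PATH
    exec "$@"
}
```

Because the second invocation sees the runtime directory already on the path, it falls through the guard and continues normal startup.
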
gvproxy's internal netstack may not be ready when the expose API is
called immediately after socket creation, causing HTTP 500 responses.
The port forward silently failed, leaving host port 30051 unmapped and
causing the gateway health check to time out after 90s.

Add retry logic with exponential backoff (100ms to 1s, 10s budget) and
fail the launch if retries are exhausted instead of silently continuing.
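
A retry loop with capped exponential backoff and a total budget can be sketched as follows. The defaults mirror the numbers in the commit message (100ms initial delay, 1s cap, 10s budget), but the function itself is a hypothetical shell illustration: the real retry presumably lives in the VM launcher code, and this version counts only sleep time against the budget. It also assumes a `sleep` that accepts fractional seconds, which POSIX does not guarantee.

```shell
# Usage: retry_with_backoff INITIAL_MS MAX_MS BUDGET_MS COMMAND [ARGS...]
# Runs COMMAND until it succeeds. The delay doubles after each failure up to
# MAX_MS; gives up with status 1 once BUDGET_MS of sleeping has been spent.
retry_with_backoff() {
    delay_ms=$1
    max_delay_ms=$2
    budget_ms=$3
    shift 3
    slept_ms=0
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$slept_ms" -ge "$budget_ms" ]; then
            return 1
        fi
        # Convert milliseconds to a fractional-second string, e.g. 100 -> 0.100
        sleep "$(printf '%d.%03d' $((delay_ms / 1000)) $((delay_ms % 1000)))"
        slept_ms=$((slept_ms + delay_ms))
        delay_ms=$((delay_ms * 2))
        if [ "$delay_ms" -gt "$max_delay_ms" ]; then
            delay_ms=$max_delay_ms
        fi
    done
}
```

With the values from the commit, a call would look like `retry_with_backoff 100 1000 10000 some_expose_command`, and a non-zero return would be treated as a launch failure instead of being ignored.
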