Backport v1.15.1 hardening to v1.6.x #3

Open
libre-man wants to merge 4 commits into release-1.6.5 from release-1.6.6

@libre-man libre-man commented May 1, 2026

Review-only PR — not intended to merge. The actual deliverable is the v1.6.6 tag and release; this PR exists so the diff against the previous release (release-1.6.5 branch == v1.6.5 tag) is reviewable in one place.

What this is — and what it isn't

This is a defensive hardening backport from upstream firecracker v1.15.1, not a CVE response. The only CVE issued for v1.15.1 (CVE-2026-5747) is specific to the --enable-pci virtio PCI transport, which our v1.6.5 fork does not have. Per AWS security bulletin 2026-015:

"The virtio PCI transport is opt-in via the --enable-pci command-line flag... The legacy MMIO transport is the default and is not affected by this issue."

So our fork is not exposed to CVE-2026-5747. We are backporting the three items below as defence-in-depth — they match upstream's stricter behaviour and reduce the chance of related (uncatalogued) DoS / state-machine attacks against future config changes.

Backports

| Commit | Upstream | Form | What it prevents |
| --- | --- | --- | --- |
| 138c9f9 harden(rng) | 755086956 (PR #5762) | Manual port: couldn't cherry-pick cleanly because `IoVecBufferMut::len()` changed usize → u32 and the rng device was restructured | Host OOM via overlapping virtio descriptors inflating `iovec.len()` to ~4 GiB |
| 8ff4adf harden(virtio-mmio) | 4a5219870 (PR #5818) | Manual port: upstream moved transports to transport/mmio.rs, added `VirtioInterruptType` / `DEVICE_NEEDS_RESET` / 3-arg `activate()`. Control flow + `VALID_TRANSITIONS` table kept identical to upstream for line-by-line audit. | Guest writing single status bits dropping cumulative bits, breaking virtio state-machine invariants. Not CVE-2026-5747; that is the PCI variant. |
| 1cdb57c harden(balloon) | 1689689f0 (PR #5794) | Cherry-pick: clean source merge; only adapted the new test for v1.6.5's 1-arg `activate()` and added `use log::warn` | Guest stalls VMM event loop with a stats descriptor of huge len |
| c33a49b Bump version | | Routine | |

Full v1.15.1 evaluation against our v1.6.5 fork

| Upstream change | CVE | Our fork |
| --- | --- | --- |
| PR firecracker-microvm#5762 entropy cap | none (DoS hardening) | Applied (above), defence in depth |
| PR firecracker-microvm#5818 PCI fixes (3e385b8ab, 2348dabc4, bb7189970, 7f3114343, ee7b46418) | CVE-2026-5747 | N/A: PCI transport added in v1.13.0; our fork is v1.6.5, no PCI code exists |
| PR firecracker-microvm#5818 MMIO bit-drop | none, defensive ("Note: virtio MMIO transport also didn't") | Applied (above), defence in depth |
| PR firecracker-microvm#5780 aarch64 cache topology | none | N/A: x86_64-only deploy |
| PR firecracker-microvm#5793 virtio-mem slots | none | N/A: no virtio-mem in our fork |
| PR firecracker-microvm#5794 balloon stats bound | none (DoS hardening) | Applied (above), defence in depth |
| 06cffc96f balloon dup-buffer visibility | none (correctness only) | Not applied: no security or operational relevance |
| PR firecracker-microvm#5809 kvm-clock monotonic jump | none | Already fixed in our fork (commit ec1e9ab06, different approach, equivalent protection) |
| e1a3db8f6 aws-lc-rs 1.16.2 bump | none | Could backport, but no CVE attached |

Operational impact on SAFE / ATv2

The rng and balloon code paths are not reachable from SAFE microVMs as currently configured: SAFE attaches drives, vsock and net only, with no entropy or balloon devices. The MMIO status-bit change does run on every microVM init, but the bit-drop pattern requires a non-compliant guest deliberately writing a degraded status, which Linux guests don't do.

These are bookkeeping fixes that align our fork with upstream and pre-empt issues if the SAFE config ever changes.

Caveat

This audit only covers v1.15.1. Upstream had ~9 minor releases between our v1.6.5 base and v1.15.1; CVEs disclosed in earlier releases haven't been re-checked here.

libre-man added 4 commits May 1, 2026 10:20
Backports upstream firecracker PR firecracker-microvm#5762 (commit 7550869). Adds a
MAX_ENTROPY_BYTES (64 KiB) cap on the per-request rand_bytes
allocation in handle_one().

The pre-fix code did `vec![0; iovec.len()]` where `iovec.len()` is
the *sum* of all descriptor lengths in a chain, not the distinct
guest memory backing them. A guest can craft 255 overlapping
descriptors each claiming 16 MiB but all pointing to the same
guest physical memory, inflating iovec.len() to ~4 GiB and
exhausting host RAM.
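
The arithmetic behind this can be sketched as follows. This is a minimal illustration, not the code in handle_one(): the helper names `claimed_len` and `entropy_alloc_len` are hypothetical, and it assumes the cap clamps the allocation rather than rejecting the request, which the text above does not specify.

```rust
/// Cap taken from the commit message above: 64 KiB per request.
const MAX_ENTROPY_BYTES: u64 = 64 * 1024;

/// Hypothetical helper: the length a descriptor chain *claims*, i.e. the sum
/// of all descriptor lengths, counted even when they alias the same memory.
fn claimed_len(desc_lens: &[u32]) -> u64 {
    desc_lens.iter().map(|&len| u64::from(len)).sum()
}

/// Hypothetical helper: host buffer size to allocate for one request.
/// Assumption: the cap clamps the request instead of rejecting it.
fn entropy_alloc_len(claimed: u64) -> usize {
    claimed.min(MAX_ENTROPY_BYTES) as usize
}
```

For the chain described above (255 descriptors of 16 MiB each), `claimed_len` returns 4,278,190,080 bytes, about 3.98 GiB, while `entropy_alloc_len` returns 65,536.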

No CVE was assigned upstream; AWS classifies this as a host
DoS hardening rather than a security advisory.

Operationally, SAFE microVMs do not attach an entropy device, so
the unfixed code path is unreachable in our deployment. This is
defence-in-depth for any future config that does attach one.

Manual port — could not be cherry-picked cleanly because by
v1.15.x the rng device holds an owned `self.buffer` field and
process_entropy_queue has a different signature. Most importantly,
IoVecBufferMut::len() returns u32 in v1.15.x but usize in v1.6.5,
which forced a small change to the cap arithmetic and the function
return path. The actual security-relevant change (the cap itself)
is the same as upstream.

Backports the MMIO portion of upstream PR firecracker-microvm#5818 (commit 4a52198).
Replaces the match on `!self.device_status & status` with an explicit
VALID_TRANSITIONS table and equality check, so writes that drop
previously-set bits (e.g. FEATURES_OK alone after the device is in
state ACK | DRIVER | FEATURES_OK | DRIVER_OK) are rejected.
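
A rough sketch of the table-plus-equality idea, for readers comparing the two checks. The status bit values come from the virtio specification; the table contents and both function names are illustrative assumptions and are not copied from upstream or from this fork. Reset (writing 0) and FAILED handling are left out.

```rust
// Device status bits as defined by the virtio specification.
const ACKNOWLEDGE: u32 = 0x01;
const DRIVER: u32 = 0x02;
const DRIVER_OK: u32 = 0x04;
const FEATURES_OK: u32 = 0x08;

/// Illustrative table of (current status, full status the driver may write next).
/// Assumption: the real table may contain different or additional entries.
const VALID_TRANSITIONS: [(u32, u32); 4] = [
    (0, ACKNOWLEDGE),
    (ACKNOWLEDGE, ACKNOWLEDGE | DRIVER),
    (ACKNOWLEDGE | DRIVER, ACKNOWLEDGE | DRIVER | FEATURES_OK),
    (
        ACKNOWLEDGE | DRIVER | FEATURES_OK,
        ACKNOWLEDGE | DRIVER | FEATURES_OK | DRIVER_OK,
    ),
];

/// Equality check: the written value must match a table entry exactly,
/// so a write that drops previously set bits never matches.
fn is_valid_transition(current: u32, requested: u32) -> bool {
    VALID_TRANSITIONS
        .iter()
        .any(|&(from, to)| from == current && to == requested)
}

/// The expression the old match was keyed on: bits newly set by the write.
/// Bits that the write clears do not show up in this value.
fn newly_set_bits(current: u32, requested: u32) -> u32 {
    !current & requested
}
```

Writing DRIVER alone while the device is in ACKNOWLEDGE shows the difference: `newly_set_bits` yields DRIVER, which looks like a normal next step, whereas `is_valid_transition` returns false because the written value drops ACKNOWLEDGE.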

This is *not* a fix for CVE-2026-5747. Per AWS security bulletin
2026-015, that CVE is specific to the virtio PCI transport, which is
opt-in via --enable-pci and was added upstream in v1.13.0. Our fork
is based on v1.6.5 and has no PCI transport, so the CVE itself does
not apply. Upstream's PR firecracker-microvm#5818 included a parenthetical defensive
hardening of the MMIO transport ("Note: virtio MMIO transport also
didn't [enforce cumulative bits]") and that is what this commit
backports — for consistency with upstream's stricter behaviour and
defence-in-depth, not because of an active CVE.

The upstream commit could not be cherry-picked cleanly because by
v1.15.x the mmio transport lives at src/vmm/src/devices/virtio/
transport/mmio.rs, the activate() signature gained an interrupt
argument, and the failure path uses DEVICE_NEEDS_RESET /
VirtioInterruptType, none of which exist in v1.6.5. The control
flow and the VALID_TRANSITIONS contents are deliberately kept
identical to upstream so future audits can compare line-for-line.

Backports upstream firecracker PR firecracker-microvm#5794 (commit 1689689). Adds a
MAX_STATS_DESC_LEN (256 stat tags = 2560 bytes) cap on the stats
descriptor processed by process_stats_queue(). Pre-fix, a guest
could submit a descriptor with arbitrarily large `head.len`,
causing the inner loop `for index in (0..head.len).step_by(...)`
to iterate billions of times and stall the VMM event loop.
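
The bound can be sketched like this. It is a simplified stand-in for process_stats_queue(), not the real function: `SIZE_OF_STAT` and `stats_iterations` are hypothetical names, the 10-byte stat size is inferred from "256 stat tags = 2560 bytes", and rejecting an oversized descriptor (rather than truncating it) is an assumption.

```rust
/// Assumed size of one stat entry, inferred from 2560 bytes / 256 tags.
const SIZE_OF_STAT: u32 = 10;

/// Cap from the commit message above: 256 stat tags = 2560 bytes.
const MAX_STATS_DESC_LEN: u32 = 256 * SIZE_OF_STAT;

/// Hypothetical helper: number of loop iterations a stats descriptor of
/// `desc_len` bytes would cause, or None if the descriptor exceeds the cap.
/// Assumption: oversized descriptors are rejected outright.
fn stats_iterations(desc_len: u32) -> Option<usize> {
    if desc_len > MAX_STATS_DESC_LEN {
        return None;
    }
    // Same loop shape as in the text: one step per stat entry.
    Some((0..desc_len).step_by(SIZE_OF_STAT as usize).count())
}
```

With the cap in place the loop runs at most 256 times per descriptor, and a guest-supplied length such as `u32::MAX` is refused before the loop starts.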

No CVE was assigned upstream; AWS classifies this as a host DoS
hardening rather than a security advisory.

Operationally, SAFE microVMs do not attach a balloon device, so
the unfixed code path is unreachable in our deployment. This is
defence-in-depth for any future config that does attach one.

Cherry-picked cleanly from upstream's source-code diff with two
mechanical adaptations for v1.6.5:
  - drop the unrelated CHANGELOG hunk;
  - drop the second 'interrupt' argument from balloon.activate()
    in the new test (added in upstream's later refactor) and
    explicitly import the warn! macro that this module did not
    yet pull in.

(cherry picked from commit 1689689f0a31f9b1107c01ae9c6ed5b6f110050e)

@libre-man libre-man changed the title from "Backport v1.15.1 security fixes to v1.6.x" to "Backport v1.15.1 hardening to v1.6.x" May 1, 2026