
GWP-177 Design Pack — kn86fw bootfs/rootfs producer extension

Today, two artifacts come out of the system-image pipeline:

  1. tools/sd-provision/build.sh invokes pi-gen and copies a single .img (the full six-partition A/B SD layout per ADR-0011) to tools/sd-provision/build/kn86-os-vDEV.img. That .img is what an operator flashes onto a fresh microSD with rpi-imager (or what the desktop flasher will write whole during initial provisioning).
  2. tools/kn86fw/ wraps an opaque payload file with the 128-byte .kn86fw header (magic, format version, semver fields, payload SHA-256, min-bootloader-version) — the on-the-wire update package consumed by the desktop flasher and the Pi-side verifier. Phase 0 explicitly treats --input as opaque bytes.

What is missing — and what update-system.md and ADR-0011 already commit to — is the slot artifact: the bootfs partition image and the rootfs partition image, emitted as separate, verifiable files so they can (a) be wrapped by kn86fw build into a .kn86fw for the field-update flow, (b) be flashed to the inactive A/B slot by kn86flash (Wave 2 Tauri) without disturbing the user’s /home/shared (p6) or the active slot, and (c) feed the cartridge-MSC SD-card pipeline (where the device exposes the inactive slot to the host as USB-MSC, and the host writes a per-partition image straight in).

The current kn86fw README’s “Phase 0 scope” callout names this work explicitly: “A future Wave 2 tool will assemble bootfs + rootfs slot images from a Pi OS Lite build and hand the resulting blob to kn86fw build for packaging.” GWP-177 is that work. This design pack picks the producer’s shape, locks the file formats, and inventories the surface area changes inside tools/kn86fw/ — without writing the producer.

The decision to defer implementation past v0.1 is correct: pi-gen is producing the full .img, the .kn86fw wrapper is shipped, the SD layout is canonical in ADR-0011, and the flasher’s whole-image path covers Stage 0 bring-up. Per-partition production is what makes field updates affordable (you don’t reflash the full SD; you write 256 MB bootfs + ~2 GB rootfs to the inactive slot only). That problem only becomes urgent after v0.1 ships and the device starts accumulating user state on p6 that re-imaging would destroy.

Producer shape (the load-bearing decision)


There are three plausible shapes; this design pack picks option C with option B as the immediate v0.1 fallback:

| Option | What | Pros | Cons |
| --- | --- | --- | --- |
| A. Full producer (kn86fw replaces pi-gen) | kn86fw produce-bootfs and kn86fw produce-rootfs build the partition contents from a stage manifest in pure Rust, equivalent to a from-scratch debootstrap + Pi firmware bundling. | Single-tool story; no shell + Docker dependency. | Re-implements pi-gen. ~weeks of work + maintenance burden. The CLAUDE.md “Platform Engineering” surface explicitly leans on pi-gen for vendor-firmware bundling and tryboot support. Reject. |
| B. Wrapper (kn86fw build calls pi-gen under the hood) | kn86fw build-image shells out to tools/sd-provision/build.sh, then post-processes. | Convenient for callers; one CLI surface. | Couples a Rust CLI to a bash + Docker pipeline; CI ergonomics are worse not better; obscures the decoupled nature of the two steps. Reject as primary. |
| C. Post-processor (kn86fw splits the existing .img) | kn86fw split <image.img> reads the GPT/MBR partition table from a pi-gen .img, extracts the bootfs (p2) and rootfs (p4) partition byte ranges, and writes them as separate artifacts (raw partition image + manifest). Adds kn86fw produce-update-bundle that wraps a (bootfs, rootfs) pair into the existing .kn86fw payload format. | Clean separation: pi-gen owns image construction, kn86fw owns image consumption and packaging. Reuses the existing crate’s strengths (header writing, SHA-256, deterministic output). Backwards-compatible: tools/sd-provision/build.sh keeps working unchanged; the new subcommands are additive. | Requires a tiny GPT/MBR parser in Rust (one dependency: gpt = "3", or hand-rolled; the layout is fixed by ADR-0011 so a hand-rolled fixed-offset reader is also viable). |

Decision: option C. Producer shape is post-processor, not a wrapper and not a re-implementation. The .img from tools/sd-provision/build.sh is the input; per-partition artifacts and the .kn86fw update bundle are the outputs. This preserves Spec Hygiene Rule 1 (single source of truth for the partition layout — ADR-0011 — is consumed by both the producer of the .img and the splitter).

The wrapper convenience (option B) lands as a thin bash one-liner in tools/sd-provision/build.sh that invokes the new kn86fw split after the .img copy step — not as Rust code calling out to bash. That keeps the dependency direction clean.

Three new subcommands extend the existing build / inspect / verify set:

# 1. Split an existing .img into per-partition raw artifacts + a manifest.
kn86fw split \
    --input build/kn86-os-vDEV.img \
    --output build/slot-artifacts/ \
    [--partitions bootfs,rootfs]   # default: bootfs+rootfs only; never p1/p6

# Output layout (deterministic file names, sorted manifest):
#   build/slot-artifacts/
#   ├── bootfs.img      (raw 256 MB FAT32 image, exact p2 byte range)
#   ├── bootfs.sha256
#   ├── rootfs.img      (raw ext4 image, exact p4 byte range)
#   ├── rootfs.sha256
#   └── manifest.toml   (source .img path, source SHA-256, partition table snapshot,
#                        build timestamp, kn86fw version, schema_version)

# 2. Produce a .kn86fw update bundle from a (bootfs, rootfs) pair.
kn86fw produce-update-bundle \
    --bootfs build/slot-artifacts/bootfs.img \
    --rootfs build/slot-artifacts/rootfs.img \
    --version 0.2.0 \
    --nosh-version 0.2.0 \
    --output build/kn86-v0.2.0.kn86fw

# 3. Inspect a slot-artifacts directory (parity with `kn86fw inspect` for .kn86fw files).
kn86fw inspect-slot build/slot-artifacts/
# Prints partition sizes, SHA-256s, source .img origin, manifest schema version.

build keeps its existing single-file payload contract verbatim (no breaking change to the Phase 0 surface). produce-update-bundle is a higher-level convenience that internally concatenates bootfs.img + rootfs.img deterministically (header + length-prefixed sections) and calls the existing cmd_build::run — it does not bypass the existing header writer. This means every .kn86fw produced by either path passes the existing verify flow unchanged.

Update-bundle inner format (the payload that produce-update-bundle hands to cmd_build::run)

offset size field
------ ------------------ -----
0x00 8 bundle_magic[8] = "KN86SLOT"
0x08 2 bundle_version = uint16_t LE = 1
0x0A 2 _reserved = 0x0000
0x0C 4 section_count = uint32_t LE = 2 (bootfs, rootfs)
0x10 56 × section_count section_table[] = { kind: u32, pad: [u8; 4], offset: u64, length: u64, sha256: [u8; 32] } per entry

Section table entries are 56 bytes each (u32 kind + 4 pad + u64 offset + u64 length + [u8; 32] sha256); the inner header is 16 + 56 × section_count bytes. Section payload bytes follow contiguously, padded to 4 KiB alignment per section. These widths are the current recommendation, not a locked layout: the exact byte layout is an open question (see the inner-format wire layout item in the open questions at the end of this pack). The design pack commits to an inner format with a magic, a section table, and per-section SHA-256s; the precise field widths are decided when the producer is implemented, alongside the matching C header at tools/kn86fw/format/kn86slot.h, mirroring the kn86fw.h discipline.

The outer .kn86fw header is unchanged. payload_sha256 covers the entire KN86SLOT blob (header + sections), exactly as for any other payload.
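As an illustration of the recommended layout, here is a minimal Rust sketch of an inner-header encoder. It is not the producer's implementation: the `Section` struct, the `encode_header` function, the `kind` values, and the choice to start the first section at the next 4 KiB boundary after the table are all assumptions made for this example, and the digests are taken as inputs because the sketch uses only the standard library.

```rust
/// Magic and sizes taken from the recommended (not yet locked) layout above.
const BUNDLE_MAGIC: [u8; 8] = *b"KN86SLOT";
const HEADER_LEN: u64 = 16;
const ENTRY_LEN: u64 = 56;
const SECTION_ALIGN: u64 = 4096;

/// One section-table entry. Field names are illustrative.
pub struct Section {
    pub kind: u32,        // assumed encoding: 1 = bootfs, 2 = rootfs
    pub length: u64,      // unpadded payload length in bytes
    pub sha256: [u8; 32], // digest computed elsewhere (e.g. with the sha2 crate)
}

/// Round `n` up to the next multiple of `align` (`align` must be non-zero).
fn align_up(n: u64, align: u64) -> u64 {
    (n + align - 1) / align * align
}

/// Encode the 16-byte fixed header plus one 56-byte entry per section.
/// Returns the header bytes and the aligned payload offset of each section.
pub fn encode_header(sections: &[Section]) -> (Vec<u8>, Vec<u64>) {
    let mut out = Vec::new();
    out.extend_from_slice(&BUNDLE_MAGIC);
    out.extend_from_slice(&1u16.to_le_bytes()); // bundle_version
    out.extend_from_slice(&0u16.to_le_bytes()); // _reserved
    out.extend_from_slice(&(sections.len() as u32).to_le_bytes()); // section_count

    // Assumption: payloads begin at the first 4 KiB boundary after the table.
    let table_end = HEADER_LEN + ENTRY_LEN * sections.len() as u64;
    let mut cursor = align_up(table_end, SECTION_ALIGN);
    let mut offsets = Vec::with_capacity(sections.len());
    for s in sections {
        out.extend_from_slice(&s.kind.to_le_bytes());
        out.extend_from_slice(&[0u8; 4]); // explicit pad before the u64 fields
        out.extend_from_slice(&cursor.to_le_bytes()); // offset
        out.extend_from_slice(&s.length.to_le_bytes());
        out.extend_from_slice(&s.sha256);
        offsets.push(cursor);
        cursor = align_up(cursor + s.length, SECTION_ALIGN);
    }
    (out, offsets)
}
```

Writing every field with `to_le_bytes` keeps the encoding independent of host endianness and struct padding, which is the property the `_Static_assert` parity check on the C side is meant to guard.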

| Surface | Format | Why |
| --- | --- | --- |
| bootfs.img | Raw FAT32 partition image (exact p2 byte range from the source .img). | Mountable on a host with mount -o loop. The Pi-side flash path is dd if=bootfs.img of=/dev/mmcblk0p2 bs=4M conv=fsync; no additional unwrap step. Matches what the kexec’d updater does today (per ADR-0011). |
| rootfs.img | Raw ext4 partition image. | Same rationale: directly dd-able to the inactive rootfs slot. |
| *.sha256 | One-line sha256sum-format file (<hex> <basename>). | Trivially verifiable with stock sha256sum -c; no new tooling on the host side. |
| manifest.toml | TOML, sorted keys, no comments. | Human-inspectable, machine-parseable, Cargo-native. Carries the schema_version so future format bumps are detectable. |
| .kn86fw update bundle | Existing 128-byte header + KN86SLOT inner blob. | Reuses the entire existing parse / verify / inspect surface; no second wire format. |

Tarballs / ext4 dumps / OTA-specific bundle formats are explicitly NOT chosen. Tarballs require a tar parser on the device side (we don’t have one in the kexec’d updater); ext4 dumps (e2image) have nondeterministic mtimes and aren’t byte-identical run-to-run; OTA bundle formats (Mender, RAUC) are deferred to production per ADR-0011 §Risks #7. The raw-partition + SHA-256 + TOML manifest combination is the lowest-complexity surface that satisfies every consumer.
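Because each slot artifact is an exact byte range of the source .img, the extraction step can be very small. The following is a sketch only, using the standard library; `copy_range` is a hypothetical helper name, not an existing function in the kn86fw crate.

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

/// Copy exactly `len` bytes starting at byte offset `start` from `src` into `dst`.
/// Returns an error if the source ends early, so a truncated image can never
/// silently yield a short artifact.
pub fn copy_range<R: Read + Seek, W: Write>(
    src: &mut R,
    dst: &mut W,
    start: u64,
    len: u64,
) -> io::Result<()> {
    src.seek(SeekFrom::Start(start))?;
    // `take` caps the reader at `len` bytes; `io::copy` streams in chunks,
    // so a multi-gigabyte partition is never held in memory at once.
    let copied = io::copy(&mut src.by_ref().take(len), dst)?;
    if copied != len {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            format!("wanted {len} bytes at offset {start}, got {copied}"),
        ));
    }
    Ok(())
}
```

In a real splitter, `src` would be the opened .img and `dst` the newly created bootfs.img or rootfs.img; a hashing writer could wrap `dst` so the SHA-256 is computed in the same pass.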

  1. A/B field update via kn86flash. Operator plugs in cable, the Tauri flasher fetches a .kn86fw from a release URL, calls kn86fw verify (already implemented), then kn86flash writes the unwrapped bootfs.img to the inactive bootfs partition and rootfs.img to the inactive rootfs partition via the elevated helper. /home/shared (p6) is never touched. This is the load-bearing field flow ADR-0011 commits to.
  2. Dev iteration: rebuild rootfs only. A change touching stage-kn86-runtime (a systemd unit, a nOSh binary refresh) only invalidates the rootfs partition. kn86fw split produces a fresh rootfs.img; the developer flashes only that to slot B with dd, reboots into B with tryboot, validates, and either commits or reverts via tryboot rollback. Bootfs partition stays untouched. Iteration cost: ~30 s flash vs. ~5 min full-image flash.
  3. CI artifact splitting. The release CI (.github/workflows/system-image-build.yml) runs tools/sd-provision/build.sh, then kn86fw split, then kn86fw produce-update-bundle, and uploads three release assets per tag: the full .img, the bootfs.img+rootfs.img+manifest.toml triplet, and the .kn86fw. This matches the existing CI contract from system-image-build.md (“uploads the .img and .kn86fw as release assets”) and adds the slot-artifact triplet alongside.
  4. Cartridge-MSC SD-card pipeline. The cartridge-MSC bridge (ADR-0019) doesn’t directly consume slot artifacts, but the same producer is the natural source for the bootstrapping pipeline that prepares a blank cartridge SD via the same dd flow. Out of scope for this pack — flagged for cross-reference.

The producer must emit byte-identical output for byte-identical input. Concretely:

  • bootfs.img and rootfs.img are byte slices of the source .img. The bytes are deterministic by construction (slice of an immutable input). The only nondeterminism risk is the source .img itself, which system-image-build.md notes is reproducible modulo a /etc/kn86-build-id timestamp; that’s out of scope here.
  • *.sha256 is sha2-computed over the partition bytes — deterministic.
  • manifest.toml must use sorted keys, ASCII-only values, no inline comments, and either zero timestamps or a SOURCE_DATE_EPOCH-controlled timestamp (per the existing reproducible-builds convention). Recommend: emit a built_at field only when SOURCE_DATE_EPOCH is set; otherwise omit it.
  • .kn86fw update bundle is deterministic if the inner KN86SLOT blob is deterministic. The inner blob’s section ordering is fixed (bootfs first, rootfs second); the section padding is zero-fill. No source-date-dependent fields go into the inner header.
  • Verification path: a CI step does two builds and asserts byte-identical output of every artifact. This is the primary reproducibility gate; it fails loud if any future code change introduces nondeterminism.
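The manifest rules above can be sketched in a few lines of standard-library Rust. This is an illustration under assumptions, not the planned implementation: `render_manifest` is a hypothetical name, the manifest is simplified to flat string-valued keys, and `built_at` is written as the raw epoch seconds.

```rust
use std::collections::BTreeMap;

/// Render a flat manifest as TOML with keys in sorted order.
/// `source_date_epoch` is the raw value of SOURCE_DATE_EPOCH if the caller
/// found it in the environment; `built_at` is emitted only in that case.
pub fn render_manifest(
    fields: &BTreeMap<String, String>,
    source_date_epoch: Option<&str>,
) -> Result<String, String> {
    let mut all = fields.clone();
    if let Some(raw) = source_date_epoch {
        let secs: u64 = raw
            .trim()
            .parse()
            .map_err(|_| format!("bad SOURCE_DATE_EPOCH: {raw:?}"))?;
        all.insert("built_at".to_string(), secs.to_string());
    }

    let mut out = String::new();
    // BTreeMap iterates in key order, which gives the sorted-keys guarantee.
    for (key, value) in &all {
        let bare_key = !key.is_empty()
            && key.chars().all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-');
        if !bare_key {
            return Err(format!("key {key:?} is not a TOML bare key"));
        }
        // Enforce the ASCII-only rule and reject control characters outright.
        if !value.is_ascii() || value.chars().any(|c| c.is_ascii_control()) {
            return Err(format!("value for {key} must be printable ASCII"));
        }
        let escaped = value.replace('\\', "\\\\").replace('"', "\\\"");
        out.push_str(&format!("{key} = \"{escaped}\"\n"));
    }
    Ok(out)
}
```

A caller would pass `std::env::var("SOURCE_DATE_EPOCH").ok().as_deref()` as the second argument, so two runs with the same environment and the same fields produce identical bytes.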

The current tools/sd-provision/build.sh flow is unchanged. After the producer ships, callers that only want the .img keep getting the .img. The new subcommands are additive:

  • tools/sd-provision/build.sh keeps copying the .img to OUTPUT_IMG exactly as today; an optional post-step (gated by KN86_PRODUCE_SLOT_ARTIFACTS=1) invokes kn86fw split and kn86fw produce-update-bundle.
  • kn86fw build (Phase 0 single-file payload contract) is unchanged. Its tests stay green.
  • The .kn86fw outer header (format/kn86fw.h) is unchanged. No format version bump.
  • Adding a KN86SLOT inner format adds a new C header (tools/kn86fw/format/kn86slot.h) — same single-source-of-truth discipline as kn86fw.h.

The only file that changes shape is the README: a new “Wave 2 — slot artifacts” section is added explaining the new subcommands, and the existing “Phase 0 scope” callout is updated to reflect that the slot-producer half has shipped.

Existing tools/kn86fw/src/ modules:

src/
├── main.rs (clap dispatch; +3 subcommands: split, produce-update-bundle, inspect-slot)
├── lib.rs (re-exports; +pub mod split, slot)
├── header.rs (UNCHANGED — outer .kn86fw header)
├── cmd_build.rs (UNCHANGED — Phase 0 single-file payload path)
├── inspect.rs (UNCHANGED — extends only via inspect_slot.rs)
└── verify.rs (UNCHANGED — payload SHA-256 verification)

New modules:

src/
├── partition_table.rs (NEW — minimal MBR/GPT partition-table reader; takes a Read+Seek,
│ returns a Vec<Partition { number, start_lba, length_lba, type_code }>.
│ Hand-rolled, no dep — ADR-0011 layout is fixed.)
├── split.rs (NEW — orchestrates partition-table read → byte slice → write to disk +
│ SHA-256 + manifest. Uses partition_table.rs.)
├── slot.rs (NEW — KN86SLOT inner-format encoder/decoder. Mirrored in
│ format/kn86slot.h with C-side struct + _Static_assert parity tests.)
├── cmd_split.rs (NEW — `kn86fw split` subcommand impl; depends on split.rs.)
├── cmd_produce_bundle.rs (NEW — `kn86fw produce-update-bundle` impl; depends on slot.rs +
│ cmd_build::run for the outer-header step.)
└── inspect_slot.rs (NEW — `kn86fw inspect-slot` impl; pretty-prints manifest + partition stats.)
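To give a sense of scale for the hand-rolled reader, here is a minimal sketch of the MBR half only. It makes several assumptions: `read_mbr_primary` is a hypothetical name, `type_code` is the one-byte MBR type, and only the four primary entries are read. A six-partition layout needs either logical partitions inside an extended partition or a GPT, and this sketch handles neither.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// One partition-table entry, shaped like the Partition struct named above.
#[derive(Debug, PartialEq)]
pub struct Partition {
    pub number: u8,
    pub start_lba: u64,
    pub length_lba: u64,
    pub type_code: u8, // MBR type byte; a GPT reader would carry a type GUID instead
}

/// Read the four primary MBR entries from sector 0, skipping empty slots.
pub fn read_mbr_primary<R: Read + Seek>(src: &mut R) -> io::Result<Vec<Partition>> {
    let mut sector = [0u8; 512];
    src.seek(SeekFrom::Start(0))?;
    src.read_exact(&mut sector)?;
    if sector[510] != 0x55 || sector[511] != 0xAA {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "missing 0x55AA boot signature",
        ));
    }

    let mut parts = Vec::new();
    for i in 0..4usize {
        // The table starts at byte 446; each entry is 16 bytes.
        let e = &sector[446 + 16 * i..446 + 16 * (i + 1)];
        let type_code = e[4];
        if type_code == 0 {
            continue; // unused slot
        }
        let start = u32::from_le_bytes([e[8], e[9], e[10], e[11]]);
        let len = u32::from_le_bytes([e[12], e[13], e[14], e[15]]);
        parts.push(Partition {
            number: i as u8 + 1,
            start_lba: u64::from(start),
            length_lba: u64::from(len),
            type_code,
        });
    }
    Ok(parts)
}
```

Taking a generic `Read + Seek` means the same function works on an opened file and on an in-memory `Cursor`, which is what makes the tiny hand-crafted fixture approach practical.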

New format/ files:

format/
├── kn86fw.h (UNCHANGED)
└── kn86slot.h (NEW — KN86SLOT inner-format C header. Same packed-struct + _Static_assert
discipline as kn86fw.h. Sized to exactly N bytes per the locked spec.)

New tests/:

tests/
├── integration.rs (UNCHANGED — Phase 0 tests stay green)
├── integration_split.rs (NEW — split a known-good fixture .img, assert per-partition SHA-256,
│ assert manifest.toml stable bytes.)
├── integration_bundle.rs (NEW — produce + verify roundtrip; assert outer .kn86fw verify passes;
│ assert inner KN86SLOT round-trips through slot.rs.)
└── integration_repro.rs (NEW — build twice from the same fixture, diff every byte.)
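The "diff every byte" check in integration_repro.rs could rest on a helper like the one below. This is a sketch with a hypothetical name (`first_difference`); reporting the offset of the first mismatch is a suggestion, on the theory that it makes a reproducibility failure easier to trace back to a field.

```rust
use std::io::{self, BufReader, Read};

/// Return the offset of the first differing byte between two streams,
/// or `None` when they are byte-identical (same content and same length).
pub fn first_difference<A: Read, B: Read>(a: A, b: B) -> io::Result<Option<u64>> {
    let mut a = BufReader::new(a).bytes();
    let mut b = BufReader::new(b).bytes();
    let mut offset = 0u64;
    loop {
        // `transpose` turns Option<Result<u8>> into Result<Option<u8>> so `?` applies.
        match (a.next().transpose()?, b.next().transpose()?) {
            (None, None) => return Ok(None),
            // Covers both a differing byte and one stream ending before the other.
            (x, y) if x != y => return Ok(Some(offset)),
            _ => offset += 1,
        }
    }
}
```

Byte-at-a-time iteration is simple but slow on multi-gigabyte artifacts; against the small fixture image that is unlikely to matter, and a chunked comparison could replace it if it does.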

New dependencies: none required if partition_table.rs is hand-rolled. (If we adopt the gpt crate, that’s one new dep; the hand-rolled alternative is preferred for vendoring discipline and to keep the dependency surface small.)

Acceptance criteria (when the implementation actually runs, post-v0.1)

  1. kn86fw split <kn86-os-vDEV.img> emits the artifact triplet in the documented layout. Each *.img SHA-256 matches the corresponding *.sha256 file.
  2. kn86fw produce-update-bundle --bootfs ... --rootfs ... ... emits a .kn86fw that passes the existing kn86fw verify flow without modification.
  3. kn86fw inspect-slot <dir> prints partition sizes, SHA-256s, source .img origin, and schema_version.
  4. Reproducibility test (integration_repro.rs) passes — two split + produce-update-bundle runs against the same input produce byte-identical outputs at every level.
  5. C/Rust parity test: the compile-time assertions (_Static_assert) in tools/kn86fw/format/kn86slot.h match the Rust struct layout, mirroring the existing kn86fw.h parity test in tests/integration.rs.
  6. KN86_PRODUCE_SLOT_ARTIFACTS=1 tools/sd-provision/build.sh invokes the new producer and lands the artifacts under tools/sd-provision/build/slot-artifacts/. Existing default flow (no env var) unchanged.
  7. docs/device/os/update-system.md gains a “Slot artifacts” subsection naming the producer subcommands and the on-disk artifact layout. Cross-references back to this design pack.
  8. CI workflow (.github/workflows/system-image-build.yml) uploads the slot-artifact triplet alongside the existing .img. (Stage 0 bring-up already enables artifact upload per system-image-build.md; this is an additive matrix entry.)
  1. Source .img is a partial / truncated build. A killed pi-gen leaves a partial .img in deploy/. split must read the partition table first and refuse to extract any partition that runs past the end of the input file, with a clear error citing the partition number, declared range, and actual file length. Don’t silently produce truncated bootfs.img.
  2. Partition table doesn’t match ADR-0011’s six-partition layout. A user feeds kn86fw split an arbitrary .img (e.g., raw Raspberry Pi OS Lite they downloaded). The splitter must validate the table against the ADR-0011 expected shape (6 partitions, expected sizes ±10%, expected filesystem types) and fail loud if it doesn’t match — with a hint pointing at system-image-build.md. Refuse to extract unless the layout matches; don’t fall back to “extract whatever’s in p2 and p4,” because the wrong filesystem type in the wrong slot would brick a device.
  3. Partition images larger than the .kn86fw payload size we want to ship. A rootfs.img is ~2 GB; a full bundle is ~2.25 GB. The desktop flasher uploads this over USB in a few seconds, which is fine for v1, but the produce-update-bundle command should print the resulting bundle size and warn if it crosses (configurable) 3 GB. Compression is out of scope for v1 (the research brief notes the format is “gzipped ext4/FAT,” but ADR-0011’s actually-implemented .kn86fw carries raw bytes; revisit if bundle size becomes a transport problem).
  4. Concurrent split runs writing to the same output dir. The producer must either lock the output dir or refuse to overwrite an existing non-empty directory unless --force is passed. Avoid the manifest.toml from one run getting paired with bootfs.img from another. Standard Rust file-creation semantics + an explicit existence check before write.
  5. A .kn86fw produced by produce-update-bundle is fed to the Phase 0 path. The outer verify must succeed (the SHA-256 covers the whole KN86SLOT blob, which is opaque to the outer header). inspect must print the outer header normally; the user invokes inspect-slot (or eventually a smart auto-detect on inspect) to crack open the inner format. Confirm the existing inspect doesn’t try to parse beyond the header.
  6. Pi firmware files in p1 (the common boot region). The producer extracts only bootfs (p2) and rootfs (p4) of the active slot at split time; p1 (autoboot.txt, bootcode, common stage-1) is never part of an update bundle, because the field-update flow per ADR-0011 doesn’t touch p1 — it only writes the inactive slot’s bootfs + rootfs and rewrites autoboot.txt’s tryboot_a_b line in place. Document this explicitly in the README to head off future “why isn’t p1 in the bundle” questions.
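The first two edge cases reduce to a pair of pure checks that are easy to unit-test. The sketch below is illustrative only: the function names are hypothetical, a 512-byte sector is assumed, and the expected sizes that the ±10% rule compares against would come from ADR-0011 rather than from this code.

```rust
use std::ops::Range;

/// Assumed logical sector size of the image.
const SECTOR_SIZE: u64 = 512;

/// Convert a partition's LBA extent to a byte range, refusing any partition
/// that runs past the end of the input file (the truncated-image case).
pub fn partition_byte_range(
    number: u8,
    start_lba: u64,
    length_lba: u64,
    file_len: u64,
) -> Result<Range<u64>, String> {
    // Checked arithmetic: a corrupt table must not wrap around silently.
    let start = start_lba.checked_mul(SECTOR_SIZE);
    let end = start.and_then(|s| {
        length_lba
            .checked_mul(SECTOR_SIZE)
            .and_then(|l| s.checked_add(l))
    });
    match (start, end) {
        (Some(s), Some(e)) if e <= file_len => Ok(s..e),
        (Some(s), Some(e)) => Err(format!(
            "partition {number} declares bytes {s}..{e}, but the image is only {file_len} bytes long"
        )),
        _ => Err(format!("partition {number} has an LBA extent that overflows u64")),
    }
}

/// True when `actual` is within `percent` percent of `expected`.
pub fn size_within_tolerance(actual: u64, expected: u64, percent: u64) -> bool {
    // Widen to u128 so the multiplication cannot overflow for large partitions.
    u128::from(actual.abs_diff(expected)) * 100 <= u128::from(expected) * u128::from(percent)
}
```

The error string carries the partition number, the declared range, and the actual file length, which are the three facts the truncated-image case asks the error to cite.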
  • Owner when implementation runs: Platform Engineering (Rust + system-image expertise). Single engineer, ~3–5 days end-to-end (split + bundle + tests + docs + CI step). Not a sprint-blocker once unblocked.
  • Files this design pack expects to land:
    • New: tools/kn86fw/src/{partition_table.rs, split.rs, slot.rs, cmd_split.rs, cmd_produce_bundle.rs, inspect_slot.rs} and tools/kn86fw/format/kn86slot.h.
    • Edit: tools/kn86fw/src/{main.rs, lib.rs}, tools/kn86fw/Cargo.toml (deps if any), tools/kn86fw/README.md (Wave 2 section), tools/sd-provision/build.sh (optional post-step), docs/device/os/update-system.md (slot-artifacts subsection), .github/workflows/system-image-build.yml (artifact-upload matrix).
    • Untouched: tools/kn86fw/src/{header.rs, cmd_build.rs, inspect.rs, verify.rs} and tools/kn86fw/format/kn86fw.h.
  • Test strategy: TDD. Start with partition_table.rs against a tiny hand-crafted MBR/GPT fixture; then split.rs against a fixture .img (a 64 MB toy with the ADR-0011 partition table and known-byte-pattern partitions); then slot.rs round-trip; then end-to-end via integration_split.rs + integration_bundle.rs; finally integration_repro.rs.
  • Spec hygiene reminders for the implementation PR:
    • Spec Hygiene Rule 1 — do not restate the ADR-0011 partition layout in partition_table.rs comments or the README. Reference the ADR.
    • Spec Hygiene Rule 3 — when KN86SLOT lands, search the repo for stale “monolithic .img” references in docs and update them in the same PR. Likely candidates: docs/device/os/update-system.md, tools/kn86fw/README.md’s “Phase 0 scope” callout.
  • What this pack is NOT a license for: writing any of the above modules now. Per the GWP-163 gate, design only.
  1. Inner-format wire layout. The KN86SLOT blob’s exact byte layout (field widths, alignment, ordering) is left for the implementation PR. Recommendation: 16-byte header + 56-byte section entries + 4 KiB-aligned section payloads, mirrored in kn86slot.h with _Static_assert. Confirm this is OK to lock at implementation time, or call it now.
  2. Compression. Research brief §“Firmware image format” suggests gzipped slot images; the actually-shipped .kn86fw is raw. Recommendation: stay raw for v1 (~2.25 GB bundles transfer over USB in seconds; compression adds determinism risk). Revisit if a hosted-update channel needs the bandwidth savings. Confirm.
  3. gpt crate vs hand-rolled partition reader. Recommendation: hand-rolled. The ADR-0011 layout is fixed and a ~120-line reader keeps the dependency surface small (no transitive deps, no MSRV churn). Confirm.
  4. Auto-detect inner format on kn86fw inspect. Should inspect peek at the payload’s first 8 bytes and, if they’re KN86SLOT, print slot-aware output by default — or keep inspect-slot as a separate explicit subcommand? Recommendation: keep inspect-slot separate for v1 (less magic, easier to test); revisit when there’s a second inner format to disambiguate.
  5. CI artifact retention. The slot-artifact triplet adds ~2.25 GB per release tag to the private monorepo’s release assets. GitHub’s per-release storage is generous but not infinite. Recommendation: keep the last 5 releases of slot artifacts, prune older. Confirm cadence + retention.
  6. Does the producer also emit a “p1 common boot region” image? Recommendation: no for v1 (p1 isn’t part of any A/B update flow per ADR-0011), but flag for the implementation PR in case a future “rebuild p1 only” workflow surfaces. Confirm.
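For the auto-detect question, the peek itself would be small whichever way the decision goes. The sketch below assumes the payload starts immediately after the 128-byte outer header; `payload_is_slot_bundle` is a hypothetical name, not an existing kn86fw function.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Outer .kn86fw header length; the payload is assumed to start right after it.
const OUTER_HEADER_LEN: u64 = 128;
const SLOT_MAGIC: &[u8; 8] = b"KN86SLOT";

/// Peek at the first 8 payload bytes and report whether they carry the
/// KN86SLOT magic. A payload shorter than 8 bytes is simply "not a slot bundle".
pub fn payload_is_slot_bundle<R: Read + Seek>(src: &mut R) -> io::Result<bool> {
    src.seek(SeekFrom::Start(OUTER_HEADER_LEN))?;
    let mut magic = [0u8; 8];
    match src.read_exact(&mut magic) {
        Ok(()) => Ok(&magic == SLOT_MAGIC),
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => Ok(false),
        Err(e) => Err(e),
    }
}
```

Keeping the check in a standalone function leaves both options open: inspect-slot can use it to validate its input today, and inspect could call it later if auto-detection is adopted.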