KN-86 Deckline — PCM Voice Bark System Addendum

Parent Documents:

  • KN-86-Capability-Model-Spec.md (Cipher voice, cartridge flash layout)
  • KN-86-Pi-Zero-Build-Specification.md (PSG hardware, audio subsystem — supersedes the archived Pico-era KN-86-Modern-Build-Specification.md)
  • KN-86-UI-Design-System.md (sound design patterns)
  • KN-86-Cartridge-Grammar-Spec.md (stdlib API, cartridge authoring surface)

The Cipher voice is the KN-86’s narrative engine. It currently communicates exclusively through text rendered to the amber display — procedurally constructed sentences from domain word tables, appearing at mission transitions, debriefs, and critical state changes. The PSG provides tonal punctuation (alert stings, confirmation tones) but no vocal content.

This addendum proposes adding short PCM voice barks to the Cipher voice system. These are pre-recorded speech samples — single words or short phrases, compressed and stored in cartridge flash — played through the YM2149’s amplitude register as a DAC. The reference point is NES-era digitized speech: Double Dribble’s “DOUBLE DRIBBLE!”, Blades of Steel’s “FIGHT!”, Mike Tyson’s Punch-Out’s “BODY BLOW!”. Gritty, lo-fi, unmistakably 8-bit, never more than a yelled phrase.

Why this matters: The Cipher voice is described as a “competent colleague” — terse, clipped, authoritative. Text on amber screen already sells this. But a barked word at a critical moment — “BREACH.” when ICE catches you, “CLEAN.” on a perfect extraction, “TRACED.” when Black Ledger finds the fraud — bridges the gap between reading a terminal message and feeling like someone is actually there. It’s the difference between seeing > CONTACT on screen and hearing the word punched through a 28mm speaker at the same time.

Design constraint: Barks supplement the text voice. They never replace it. The Cipher voice remains primarily textual. Barks fire at high-impact moments only — a few per session, not every screen transition. Overuse kills the effect.


The YM2149 has no PCM playback mode. But each channel’s amplitude register (regs 8, 9, 10) is a 4-bit DAC — writing values 0–15 directly controls the output level. By writing amplitude values at a fixed sample rate, one channel becomes a crude PCM output.

Technique: Commandeer Channel C’s amplitude register. Disable Channel C’s tone generator (mixer bit 2 = 1) and noise (mixer bit 5 = 1) so the channel output is controlled purely by amplitude writes. Feed 4-bit sample values from a buffer into register 10 at the playback sample rate. Channels A and B remain available for tonal accompaniment (a drone, an alert tone underneath the bark).

On the emulator and device: The audio_callback in sound.c already runs at 44.1kHz, calling psg_sample() once per output sample. The simplest approach: when a bark is active, psg_sample() overrides Channel C’s amplitude with the next 4-bit sample from the bark buffer, using a counter to step from the 44.1kHz output rate down to the bark rate. The CPU cost is one register write per bark sample, which is negligible on target hardware.
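
The Channel C takeover described above can be sketched as follows. This is a minimal illustration, not the sound.c implementation: the `PsgRegs` register-file model and the `bark_channel_c_*` helper names are assumptions made for the example. The register numbers (7 for the mixer, 10 for Channel C amplitude) and the mixer bits follow the text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register-file model; the real PSG state in sound.c may differ. */
typedef struct {
    uint8_t regs[16];
} PsgRegs;

enum {
    PSG_REG_MIXER     = 7,        /* YM2149 mixer control register */
    PSG_REG_AMP_C     = 10,       /* Channel C amplitude register */
    MIXER_TONE_C_OFF  = 1u << 2,  /* 1 = Channel C tone generator disabled */
    MIXER_NOISE_C_OFF = 1u << 5   /* 1 = noise on Channel C disabled */
};

/* Put Channel C into "DAC mode": tone and noise off, so the output level
 * follows the amplitude register alone. Returns the previous mixer value
 * so the caller can restore it when the bark ends. */
static uint8_t bark_channel_c_acquire(PsgRegs *psg)
{
    uint8_t saved = psg->regs[PSG_REG_MIXER];
    psg->regs[PSG_REG_MIXER] =
        (uint8_t)(saved | MIXER_TONE_C_OFF | MIXER_NOISE_C_OFF);
    return saved;
}

/* Write one 4-bit sample. Masking to 0x0F also keeps bit 4 (envelope mode)
 * clear, so the fixed level is used rather than the envelope generator. */
static void bark_channel_c_write(PsgRegs *psg, uint8_t sample4)
{
    psg->regs[PSG_REG_AMP_C] = (uint8_t)(sample4 & 0x0F);
}

/* Hand Channel C back: restore the mixer and silence the channel. */
static void bark_channel_c_release(PsgRegs *psg, uint8_t saved_mixer)
{
    psg->regs[PSG_REG_MIXER] = saved_mixer;
    psg->regs[PSG_REG_AMP_C] = 0;
}
```

Saving and restoring the mixer value is one way to let a cartridge's Channel C tone resume after the bark; whether the runtime restores it or leaves that to the cartridge is a design choice this sketch does not settle.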

| Property | Value | Rationale |
| --- | --- | --- |
| Bit depth | 4-bit (packed, 2 samples/byte) | Matches YM2149 DAC resolution exactly. No wasted bits. |
| Sample rate | 8,000 Hz | Adequate for intelligible speech. Phone-quality bandwidth (~3.4kHz). |
| Channels | Mono | Single speaker output. |
| Compression | None (raw 4-bit PCM) | At these sizes, compression overhead exceeds savings. |
| Max duration | 1.0 second per bark | Design constraint, not technical limit. |
| Storage per second | 4,000 bytes | 8,000 samples/sec × 0.5 bytes/sample = 4KB/sec. |
| Max bark size | 4,000 bytes | 1.0 sec × 4KB/sec. |

Add a Bark Table region between the Cipher Domain and Save regions:

| Region | Offset | Size | Contents |
| --- | --- | --- | --- |
| Header | 0x0000 | 256 bytes | (unchanged) |
| Code | 0x0100 | Variable | (unchanged) |
| Data | Variable | Variable | (unchanged) |
| Templates | template_offset | template_size | (unchanged) |
| Cipher Domain | Variable | Variable | (unchanged) |
| Bark Table | bark_offset | bark_size | NEW: Sample index + packed 4-bit PCM data |
| Save | Variable | save_size | (unchanged) |
| Provenance Chain | chain_offset | Remaining | (unchanged) |

Bark Table internal structure:

    typedef struct {
        uint8_t count;            /* Number of barks in table (max 16) */
        uint8_t reserved[3];      /* Alignment padding */
        struct {
            char     label[8];    /* Bark identifier, e.g., "BREACH\0\0" */
            uint16_t offset;      /* Byte offset from start of PCM data region */
            uint16_t length;      /* Length in bytes (packed 4-bit, so samples = length * 2) */
            uint16_t sample_rate; /* Playback rate in Hz (typically 8000) */
            uint16_t reserved;    /* Alignment */
        } entries[16];            /* Fixed 16-slot index */
    } BarkTableHeader;            /* 4 + (16 × 16) = 260 bytes */
    /* Followed immediately by packed PCM data */
    /* Byte layout: [high_nibble=sample_N | low_nibble=sample_N+1] */
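
As a rough illustration of how a runtime might consume this layout, the sketch below looks up an entry by label and fetches a single sample from the packed data. The names `BarkEntry`, `bark_find`, and `bark_sample_at` are hypothetical, and naming the inner struct is a convenience for the example only. The static assertion checks the 260-byte size on the assumption that the compiler adds no padding, which holds on common ABIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BARK_MAX_ENTRIES 16
#define BARK_LABEL_LEN   8

/* Hypothetical named version of the anonymous entry struct. */
typedef struct {
    char     label[BARK_LABEL_LEN];
    uint16_t offset;
    uint16_t length;
    uint16_t sample_rate;
    uint16_t reserved;
} BarkEntry;

typedef struct {
    uint8_t   count;
    uint8_t   reserved[3];
    BarkEntry entries[BARK_MAX_ENTRIES];
} BarkTableHeader;

/* Compile-time check that the in-memory layout is the expected 260 bytes. */
_Static_assert(sizeof(BarkTableHeader) == 260, "bark index must be 260 bytes");

/* Find a bark by label. Returns its index, or -1 if it is not present.
 * Labels are at most 8 bytes and need not be NUL-terminated when full. */
static int bark_find(const BarkTableHeader *table, const char *label)
{
    if (strlen(label) > BARK_LABEL_LEN) {
        return -1;
    }
    uint8_t count = table->count > BARK_MAX_ENTRIES ? BARK_MAX_ENTRIES
                                                    : table->count;
    for (int i = 0; i < count; i++) {
        if (strncmp(table->entries[i].label, label, BARK_LABEL_LEN) == 0) {
            return i;
        }
    }
    return -1;
}

/* Fetch sample n (0-based) from packed data. Even-numbered samples sit in
 * the high nibble, odd-numbered samples in the low nibble. */
static uint8_t bark_sample_at(const uint8_t *pcm, uint32_t n)
{
    uint8_t byte = pcm[n / 2];
    return (n % 2 == 0) ? (uint8_t)(byte >> 4) : (uint8_t)(byte & 0x0F);
}
```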

Add two fields to the header, using 8 bytes from the 16-byte reserved region:

    uint32_t bark_offset; /* Byte offset where bark table begins (0 = no barks) */
    uint32_t bark_size;   /* Size of bark table + PCM data */
    uint8_t  reserved[8]; /* Reduced from 16 to 8 bytes */

A bark_offset of 0 means the cartridge has no barks. Firmware and runtime treat this as a clean opt-out — no behavioral change for existing cartridges.
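
A loader-side check for the opt-out might look like the sketch below. The function name, the status enum, and the decision to reject regions that are too small or run past the end of the image are all assumptions for illustration; the spec itself only defines the meaning of a zero offset. Because existing cartridges carry zeros in the reserved bytes, they read back as `bark_offset == 0` and take the absent path.

```c
#include <assert.h>
#include <stdint.h>

#define BARK_INDEX_BYTES 260u  /* size of the fixed bark index */

typedef enum {
    BARKS_ABSENT,   /* bark_offset == 0: cartridge has no barks */
    BARKS_PRESENT,  /* region looks usable */
    BARKS_INVALID   /* region is too small or falls outside the image */
} BarkRegionStatus;

/* Classify the bark region described by the two header fields.
 * image_size is the total size of the cartridge image in bytes. */
static BarkRegionStatus bark_region_check(uint32_t bark_offset,
                                          uint32_t bark_size,
                                          uint32_t image_size)
{
    if (bark_offset == 0) {
        return BARKS_ABSENT;
    }
    if (bark_size < BARK_INDEX_BYTES) {
        return BARKS_INVALID;
    }
    /* Written to avoid overflow in bark_offset + bark_size. */
    if (bark_offset > image_size || bark_size > image_size - bark_offset) {
        return BARKS_INVALID;
    }
    return BARKS_PRESENT;
}
```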

| Barks per cart | Duration each | Flash cost | RAM cost (playback buffer) |
| --- | --- | --- | --- |
| 8 barks | 0.5 sec avg | ~16KB + 260B index = ~16.5KB | 0 (stream directly from flash) |
| 12 barks | 0.75 sec avg | ~36KB + 260B index = ~36.5KB | 0 |
| 16 barks | 1.0 sec avg | ~64KB + 260B index = ~64.5KB | 0 |

The typical target is 8–12 barks per cartridge at 0.3–0.7 seconds each: roughly 12–34KB per module. On the Pi Zero 2 W device, bark data streams from the cartridge file on the SD filesystem — trivially small against 512 MB RAM and gigabytes of SD storage. No special buffering strategy required; the emulator path is identical.
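
The storage figures above follow from two samples per byte plus the fixed 260-byte index. A small helper, with hypothetical names, makes the arithmetic explicit and could be reused to check a cartridge's bark budget at build time:

```c
#include <assert.h>
#include <stdint.h>

#define BARK_INDEX_BYTES 260u  /* 4-byte header + 16 entries x 16 bytes */

/* Bytes of packed 4-bit PCM for one bark: two samples per byte, rounded up. */
static uint32_t bark_pcm_bytes(uint32_t duration_ms, uint32_t sample_rate_hz)
{
    uint64_t samples =
        ((uint64_t)duration_ms * sample_rate_hz + 999u) / 1000u;
    return (uint32_t)((samples + 1u) / 2u);
}

/* Total flash cost of a bark table: fixed index plus PCM for every bark. */
static uint32_t bark_table_bytes(uint32_t bark_count,
                                 uint32_t avg_duration_ms,
                                 uint32_t sample_rate_hz)
{
    return BARK_INDEX_BYTES
         + bark_count * bark_pcm_bytes(avg_duration_ms, sample_rate_hz);
}
```

For example, `bark_table_bytes(12, 750, 8000)` evaluates to 36,260 bytes, which matches the 12-bark row of the table.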


    /* ---- Voice Bark Playback ---- */

    /* Play a bark by label. Returns false if label not found or no bark table. */
    bool stdlib_bark_play(SystemState *state, const char *label);

    /* Play a bark by index (0-15). Returns false if index out of range. */
    bool stdlib_bark_play_index(SystemState *state, uint8_t index);

    /* Stop any currently playing bark immediately. */
    void stdlib_bark_stop(SystemState *state);

    /* Is a bark currently playing? */
    bool stdlib_bark_active(SystemState *state);

    /* PCM bark playback state (inside RuntimeState) */
    typedef struct {
        const uint8_t *pcm_data;  /* Pointer to packed 4-bit PCM data */
        uint32_t pcm_length;      /* Total bytes of PCM data */
        uint32_t pcm_position;    /* Current byte position */
        uint16_t sample_rate;     /* Playback sample rate */
        uint16_t rate_counter;    /* Fractional counter for sample rate conversion */
        bool nibble_high;         /* true = read high nibble next, false = low */
        bool active;              /* true = bark is playing */
    } BarkPlayback;

In psg_sample(), when bark_playback.active is true:

  1. Advance rate_counter by the bark’s sample_rate. When it reaches or exceeds PSG_SAMPLE_RATE (44100), consume one nibble and subtract PSG_SAMPLE_RATE from the counter.
  2. Write the nibble value (0–15) directly to Channel C’s amplitude, bypassing the tone/noise/envelope path for that channel.
  3. When pcm_position >= pcm_length, set active = false and restore Channel C to normal PSG operation.

Channels A and B are completely unaffected during bark playback. A cartridge can layer a tone or drone underneath the bark.
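
The three steps above can be sketched as a per-sample step function. This is an illustration under stated assumptions, not the psg_sample() code: `bark_start` and `bark_step` are hypothetical names, the Channel C amplitude is passed in as a plain byte, and `rate_counter` is widened to 32 bits because a 16-bit counter would overflow for bark rates above roughly 21 kHz (44,099 + rate must fit in 65,535).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PSG_SAMPLE_RATE 44100u

typedef struct {
    const uint8_t *pcm_data;
    uint32_t pcm_length;
    uint32_t pcm_position;
    uint16_t sample_rate;
    uint32_t rate_counter;  /* widened from uint16_t for this sketch */
    bool nibble_high;
    bool active;
} BarkPlayback;

/* Begin playback. Rejects empty data and rates the counter cannot handle. */
static bool bark_start(BarkPlayback *bp, const uint8_t *pcm,
                       uint32_t length, uint16_t sample_rate)
{
    if (pcm == NULL || length == 0 ||
        sample_rate == 0 || sample_rate > PSG_SAMPLE_RATE) {
        return false;
    }
    bp->pcm_data = pcm;
    bp->pcm_length = length;
    bp->pcm_position = 0;
    bp->sample_rate = sample_rate;
    bp->rate_counter = 0;
    bp->nibble_high = true;
    bp->active = true;
    return true;
}

/* Call once per 44.1 kHz output sample. Updates *amp_c only when a new
 * bark sample is due; otherwise the previous level is held. */
static void bark_step(BarkPlayback *bp, uint8_t *amp_c)
{
    if (!bp->active) {
        return;                       /* inactive path: a single branch */
    }
    bp->rate_counter += bp->sample_rate;
    if (bp->rate_counter < PSG_SAMPLE_RATE) {
        return;                       /* no new bark sample this tick */
    }
    bp->rate_counter -= PSG_SAMPLE_RATE;

    if (bp->pcm_position >= bp->pcm_length) {
        bp->active = false;           /* end of data: release Channel C */
        *amp_c = 0;
        return;
    }
    uint8_t byte = bp->pcm_data[bp->pcm_position];
    if (bp->nibble_high) {
        *amp_c = (uint8_t)(byte >> 4);
        bp->nibble_high = false;
    } else {
        *amp_c = (uint8_t)(byte & 0x0F);
        bp->nibble_high = true;
        bp->pcm_position++;
    }
}
```

Using "reaches or exceeds" for the threshold matters: at 8,000 Hz the counter lands exactly on 44,100 once per second, and a strict greater-than comparison would drop that sample.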

    /* Declare bark table in cartridge init */
    CART_BARKS {
        BARK("BREACH", breach_pcm, sizeof(breach_pcm), 8000),
        BARK("CLEAN", clean_pcm, sizeof(clean_pcm), 8000),
        BARK("BURNED", burned_pcm, sizeof(burned_pcm), 8000),
        BARK("TRACED", traced_pcm, sizeof(traced_pcm), 8000),
    }

    /* In a cell handler: */
    CELL_ON_EVAL(network_node) {
        if (compromised) {
            stdlib_bark_play(g_state, "CLEAN");
            /* Text still displays simultaneously */
            nosh_print(g_state, 0, 12, "> NETWORK COMPROMISED. NO TRACE.");
        }
    }

The BARK() macro references a const uint8_t[] array compiled into the cartridge’s data section. A build tool (kn86bark) converts WAV files to packed 4-bit PCM during the cartridge build process.


Each cartridge gets a domain-specific bark palette. These should be single words or two-word phrases, recorded with an authoritative, clipped delivery — like a military radio operator or an air traffic controller. Not conversational. Not friendly. Functional.

| Module | Proposed Barks | Trigger Context |
| --- | --- | --- |
| ICE Breaker | BREACH, CLEAN, BURNED, LOCKED, OPEN, TRACE, EXIT | ICE detection, extraction success/fail, node state changes |
| Depthcharge | CONTACT, DEPTH, SURFACE, LAUNCH, HIT, MISS, CLEAR | Sonar events, depth charge outcomes |
| Black Ledger | FRAUD, TRACED, VOID, FLAGGED, CLEAN, AUDIT | Transaction analysis results, audit outcomes |
| NeonGrid | PATROL, CLEAR, BLOCKED, ROUTE, BREACH | Guard detection, path validation |
| Cipher Garden | DECRYPT, LOCKED, KEY, MATCH, FAIL | Cipher operations, key verification |
| nOSh (firmware) | READY, LINK, SWAP, COMPLETE | Boot, cartridge swap, mission completion |

  • Source: Record at 44.1kHz/16-bit WAV, then downsample to 8kHz/4-bit via build tool.
  • Delivery: Short, barked, declarative. No sibilance (S sounds are mud at 4-bit). Prefer hard consonants: B, D, K, T, CH. Avoid words starting with F, S, TH.
  • Processing: Heavy compression (limit dynamic range to fit 4-bit), slight distortion/bitcrush before final encode to lean into the lo-fi aesthetic rather than fighting it.
  • Duration target: 0.3–0.7 seconds per bark. Anything over 0.7 seconds should be cut. The bark should feel like a punch, not a sentence.
  • Tone: Not robotic. Not text-to-speech. A real human voice, crushed through a 4-bit codec, arriving through a 28mm speaker. The degradation is the aesthetic.

Barks occupy Channel C only. The design rules:

  1. Barks preempt Channel C tones. If Channel C was playing a tone, the bark takes over. Channel C tones resume after bark completes (or the cartridge re-triggers them).
  2. Channels A + B continue normally. A sustained alarm drone, a background rhythm, a keyclick SFX — all uninterrupted.
  3. One bark at a time. Triggering a new bark while one is playing stops the current bark and starts the new one. No queuing, no overlapping.
  4. No barks during LAMBDA playback. Macro replay should be silent — barks during fast replay would be cacophonous.
  5. SYS hold abort stops all barks. The emergency exit silences everything.
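
Rules 3 to 5 amount to a very small state machine. The sketch below is one possible reading, with hypothetical names throughout (`BarkArbiter`, the `lambda_replaying` flag, and both functions). It assumes a request made during LAMBDA replay is simply refused and that a bark already playing when replay begins is left alone; the rules above do not specify that second case.

```c
#include <assert.h>
#include <stdbool.h>

#define BARK_SLOTS 16

typedef struct {
    int  playing_index;     /* -1 when no bark is playing */
    bool lambda_replaying;  /* hypothetical flag: true during LAMBDA replay */
} BarkArbiter;

/* Request a bark. Rule 4: refused during LAMBDA replay.
 * Rule 3: a new bark replaces whatever is playing; nothing is queued. */
static bool bark_arbiter_request(BarkArbiter *arb, int index)
{
    if (index < 0 || index >= BARK_SLOTS) {
        return false;
    }
    if (arb->lambda_replaying) {
        return false;
    }
    arb->playing_index = index;
    return true;
}

/* Rule 5: the SYS hold abort silences any bark immediately. */
static void bark_arbiter_sys_abort(BarkArbiter *arb)
{
    arb->playing_index = -1;
}
```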

A command-line tool that converts WAV files to packed 4-bit PCM:

    kn86bark input.wav output.pcm [--rate 8000] [--normalize] [--preview]
  • Reads any WAV format (via dr_wav or similar single-header library)
  • Resamples to target rate (default 8000 Hz)
  • Normalizes peak to 4-bit range (0–15)
  • Applies optional dither before quantization
  • Outputs packed nibble format (high nibble first)
  • --preview flag plays back through SDL audio for quick listening
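
The final quantize-and-pack stage of such a tool might look like the sketch below. It covers only that stage: WAV decoding, resampling, normalization, and dither are left out, and the input is assumed to be signed 16-bit mono already at the target rate. The function names and the choice to pad an odd sample count with mid-scale (8) are assumptions for the example, not kn86bark behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a signed 16-bit sample to unsigned 4-bit: shift the range to
 * 0..65535, then keep the top four bits. Silence (0) maps to 8. */
static uint8_t bark_quantize4(int16_t s)
{
    return (uint8_t)((uint16_t)(s + 32768) >> 12);
}

/* Pack n samples into out, high nibble first. Returns bytes written.
 * out must have room for (n + 1) / 2 bytes. */
static size_t bark_pack(const int16_t *in, size_t n, uint8_t *out)
{
    size_t bytes = 0;
    for (size_t i = 0; i < n; i += 2) {
        uint8_t hi = bark_quantize4(in[i]);
        /* Odd sample count: pad the last low nibble with mid-scale. */
        uint8_t lo = (i + 1 < n) ? bark_quantize4(in[i + 1]) : 8;
        out[bytes++] = (uint8_t)((hi << 4) | lo);
    }
    return bytes;
}
```

This quantizer is linear. Whether a linear mapping is the right match for the PSG's output levels is something the early test recordings would need to confirm.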

The cartridge CMakeLists.txt gains a step that converts WAV assets to .pcm includes:

    # Convert bark WAVs to packed PCM headers
    kn86bark_convert(
        BARKS
            assets/barks/breach.wav
            assets/barks/clean.wav
            assets/barks/burned.wav
        OUTPUT_DIR ${CMAKE_CURRENT_BINARY_DIR}/barks
        SAMPLE_RATE 8000
    )

This generates breach_pcm.h, clean_pcm.h, etc. — each containing a const uint8_t[] array that the cartridge source includes.


  1. Channel C hijack timing: In the emulator and device, the audio callback runs at 44.1 kHz and calls psg_sample() per sample. Confirm we can inject amplitude values at 8 kHz into Channel C’s path without introducing clicks/pops at bark start/stop boundaries.
  2. Bark file layout: Barks live alongside the cartridge image on SD. Confirm the loader mmap/seek pattern is fast enough for in-mission bark triggering without a stall.

  1. psg_sample() integration: The current function is clean — tone/noise/envelope per channel, mixed and output. Adding a “Channel C override” path needs to be zero-cost when no bark is active (branch prediction should handle this, but confirm).
  2. Cartridge header migration: Adding bark_offset and bark_size consumes 8 bytes of reserved. Existing .kn86 files have reserved[16] with zeros — will the loader handle both v2 (no barks) and v2.1 (with barks) headers cleanly?
  3. Thread safety in audio_callback: The SDL audio callback runs on a separate thread. bark_playback state is written by the main thread (when stdlib_bark_play is called) and read by the audio thread. What synchronization is needed? An atomic flag? A lock? Or is the existing single-writer/single-reader pattern safe enough with a memory barrier?

  1. Bark frequency per session: How many barks per 30-minute session feels right before the novelty wears off? Propose a “bark budget” per mission type (e.g., max 3 barks per single-phase contract, max 6 per multi-phase campaign).
  2. Bark selection determinism: Should bark choice be LFSR-driven (same seed = same bark at same moment) or event-driven (always play “BREACH” on ICE detection regardless of seed)? The former supports the deterministic philosophy; the latter is more intuitive for authors.
  3. Bare deck barks: Should the nOSh runtime have its own bark table for boot, cartridge swap, and mission board events? Or should barks be cartridge-only?

  1. Audio quality acceptance criteria: What’s the minimum intelligibility standard? Can a first-time listener identify the word without seeing the text? Or is “recognizable after hearing it once with text” sufficient (the Double Dribble standard)?
  2. Regression risk: Bark playback touches psg_sample(), which is called 44,100 times per second in the audio callback. What test coverage do we need to ensure existing PSG behavior (tones, noise, envelope) is unaffected when barks are not active?
  3. Cross-platform parity: Bark playback must sound identical on emulator, prototype, and production. What’s the verification strategy?
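
On the thread-safety question for audio_callback raised above, one candidate answer is a mailbox built on C11 atomics: the main thread only publishes a pointer to an immutable clip descriptor, and the audio thread alone owns the mutable playback position. The sketch below is an option to evaluate, not a decision. `BarkClip`, `BarkMailbox`, and all function names are hypothetical, and the poll is assumed to run at the top of each audio callback.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Immutable once built; safe to share between threads. */
typedef struct {
    const uint8_t *pcm_data;
    uint32_t pcm_length;
    uint16_t sample_rate;
} BarkClip;

typedef struct {
    _Atomic(const BarkClip *) pending;  /* written by main, taken by audio */
    atomic_bool stop_requested;         /* written by main, taken by audio */
    atomic_bool active;                 /* written by audio, read by main */
    const BarkClip *current;            /* audio thread only */
    uint32_t position;                  /* audio thread only */
} BarkMailbox;

static void bark_mailbox_init(BarkMailbox *mb)
{
    atomic_init(&mb->pending, (const BarkClip *)NULL);
    atomic_init(&mb->stop_requested, false);
    atomic_init(&mb->active, false);
    mb->current = NULL;
    mb->position = 0;
}

/* Main thread: request a bark. A later request overwrites an unclaimed one. */
static void bark_request(BarkMailbox *mb, const BarkClip *clip)
{
    atomic_store_explicit(&mb->pending, clip, memory_order_release);
}

/* Main thread: cancel any unclaimed request and stop the current bark. */
static void bark_request_stop(BarkMailbox *mb)
{
    atomic_store_explicit(&mb->pending, (const BarkClip *)NULL,
                          memory_order_release);
    atomic_store_explicit(&mb->stop_requested, true, memory_order_release);
}

/* Audio thread: call at the top of each callback, before mixing. */
static void bark_audio_poll(BarkMailbox *mb)
{
    if (atomic_exchange_explicit(&mb->stop_requested, false,
                                 memory_order_acq_rel)) {
        mb->current = NULL;
    }
    const BarkClip *next = atomic_exchange_explicit(
        &mb->pending, (const BarkClip *)NULL, memory_order_acq_rel);
    if (next != NULL) {
        mb->current = next;   /* preempts any bark already playing */
        mb->position = 0;
    }
    atomic_store_explicit(&mb->active, mb->current != NULL,
                          memory_order_release);
}
```

With this shape the inactive path costs two atomic exchanges per callback rather than per sample, and no lock is held on the audio thread. The trade-off is that a stop or a new bark takes effect at the next callback boundary instead of immediately.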

| Risk | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- |
| 4-bit speech unintelligible | Medium | High | Record test barks early. If words aren’t recognizable, raise the sample rate above 8kHz or investigate ADPCM compression for more effective bit depth. Double Dribble proves the concept works, but our speaker/amp chain differs from NES hardware. |
| Channel C contention with cartridge audio | Low | Medium | Document that Channel C is reserved during bark playback. Cartridge audio design should use A+B for sustained tones and C for transient SFX that can be interrupted. |
| Flash budget pressure | Low | Low | 12–34KB per cartridge for barks is modest. Monitor during cartridge development. If flash becomes tight, reduce bark count or duration per module. |
| Bark overuse kills impact | Medium | Medium | Enforce bark budget in design reviews. Gameplay Design agent owns bark trigger criteria. QA agent validates frequency in playtesting. |
| Audio thread synchronization bugs | Low | High | Use atomic operations for bark state transitions. Keep the critical section minimal: one atomic flag read per psg_sample() call when inactive. |
| Scope creep toward full speech | Low | Medium | This spec explicitly caps barks at 1.0 second and 16 per cartridge. Longer speech, streaming playback, or multi-bark queuing are out of scope. The constraint is the feature. |

  1. A recorded “BREACH” bark, played through the emulator’s PSG at 4-bit/8kHz via Channel C amplitude override, is recognizable as the word “breach” to a listener who has heard it once with accompanying text.
  2. During bark playback, Channels A and B continue producing tones and noise without audible artifacts (clicks, pops, pitch glitches).
  3. Bark playback adds zero measurable overhead to psg_sample() when no bark is active.
  4. The kn86bark build tool converts a 44.1kHz WAV to packed 4-bit PCM and the round-trip (record → convert → play in emulator) is completable in under 5 minutes.
  5. ICE Breaker’s on_eval handler can trigger a bark with a single stdlib_bark_play(g_state, "CLEAN") call — no direct PSG register manipulation needed by cartridge authors.