KN-86 Deckline — PCM Voice Bark System Addendum

Parent Documents:

KN-86-Capability-Model-Spec.md (Cipher voice, cartridge flash layout)
KN-86-Pi-Zero-Build-Specification.md (PSG hardware, audio subsystem — supersedes the archived Pico-era KN-86-Modern-Build-Specification.md)
KN-86-UI-Design-System.md (sound design patterns)
KN-86-Cartridge-Grammar-Spec.md (stdlib API, cartridge authoring surface)

1. Motivation

The Cipher voice is the KN-86’s narrative engine. It currently communicates exclusively through text rendered to the amber display — procedurally constructed sentences from domain word tables, appearing at mission transitions, debriefs, and critical state changes. The PSG provides tonal punctuation (alert stings, confirmation tones) but no vocal content.

This addendum proposes adding short PCM voice barks to the Cipher voice system. These are pre-recorded speech samples — single words or short phrases, compressed and stored in cartridge flash — played through the YM2149’s amplitude register as a DAC. The reference point is NES-era digitized speech: Double Dribble’s “DOUBLE DRIBBLE!”, Blades of Steel’s “FIGHT!”, Mike Tyson’s Punch-Out’s “BODY BLOW!”. Gritty, lo-fi, unmistakably 8-bit, never more than a yelled phrase.

Why this matters: The Cipher voice is described as a “competent colleague” — terse, clipped, authoritative. Text on amber screen already sells this. But a barked word at a critical moment — “BREACH.” when ICE catches you, “CLEAN.” on a perfect extraction, “TRACED.” when Black Ledger finds the fraud — bridges the gap between reading a terminal message and feeling like someone is actually there. It’s the difference between seeing > CONTACT on screen and hearing the word punched through a 28mm speaker at the same time.

Design constraint: Barks supplement the text voice. They never replace it. The Cipher voice remains primarily textual. Barks fire at high-impact moments only — a few per session, not every screen transition. Overuse kills the effect.

2. Technical Approach

2A. Playback Mechanism

The YM2149 has no PCM playback mode. But each channel’s amplitude register (regs 8, 9, 10) acts as a 4-bit DAC: with the envelope-mode bit (bit 4) left clear, writing values 0–15 directly sets the output level. By writing amplitude values at a fixed sample rate, one channel becomes a crude PCM output.

Technique: Commandeer Channel C’s amplitude register. Disable Channel C’s tone generator (mixer bit 2 = 1) and noise (mixer bit 5 = 1) so the channel output is controlled purely by amplitude writes. Feed 4-bit sample values from a buffer into register 10 at the playback sample rate. Channels A and B remain available for tonal accompaniment (a drone, an alert tone underneath the bark).

On the emulator and device: The audio_callback in sound.c already runs at 44.1kHz, calling psg_sample() per sample. The simplest approach: when a bark is active, psg_sample() overrides Channel C’s amplitude with the next 4-bit sample at the bark’s sample rate, using a counter to downsample from 44.1kHz to the bark rate. The CPU cost is one register write per bark sample, which is negligible on target hardware.
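A minimal sketch of the Channel C takeover described above, assuming a plain 16-byte register array stands in for the emulator's PSG state. `PsgRegs`, `bark_claim_channel_c`, and `bark_write_sample` are hypothetical names for illustration, not existing functions in psg.c; the register numbers and mixer bits are the standard YM2149 ones.

```c
#include <assert.h>
#include <stdint.h>

/* Standard YM2149 register numbers used by this sketch. */
enum { PSG_REG_MIXER = 7, PSG_REG_AMP_C = 10 };

/* Hypothetical register file standing in for the emulator's PSG state. */
typedef struct { uint8_t regs[16]; } PsgRegs;

/* Put Channel C into "DAC mode": tone C off (bit 2) and noise C off (bit 5).
 * In the mixer register a set bit disables the source. */
static void bark_claim_channel_c(PsgRegs *psg)
{
    psg->regs[PSG_REG_MIXER] |= (uint8_t)((1u << 2) | (1u << 5));
}

/* Write one 4-bit sample. Bit 4 of the amplitude register selects the
 * envelope generator, so it is kept clear to use the fixed level. */
static void bark_write_sample(PsgRegs *psg, uint8_t nibble)
{
    assert(nibble <= 15);
    psg->regs[PSG_REG_AMP_C] = (uint8_t)(nibble & 0x0F);
}
```

Only the Channel C bits of the mixer are touched, so whatever Channels A and B were doing is left alone.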

2B. Sample Format

Property	Value	Rationale
Bit depth	4-bit (packed, 2 samples/byte)	Matches YM2149 DAC resolution exactly. No wasted bits.
Sample rate	8,000 Hz	Adequate for intelligible speech. Phone-quality bandwidth (~3.4kHz).
Channels	Mono	Single speaker output.
Compression	None (raw 4-bit PCM)	At these sizes, compression overhead exceeds savings.
Max duration	1.0 second per bark	Design constraint, not technical limit.
Storage per second	4,000 bytes	8,000 samples/sec × 0.5 bytes/sample = 4KB/sec.
Max bark size	4,000 bytes	1.0 sec × 4KB/sec.

2C. Cartridge Flash Layout Change

Add a Bark Table region between the Cipher Domain and Save regions:

Region	Offset	Size	Contents
Header	0x0000	256 bytes	(unchanged)
Code	0x0100	Variable	(unchanged)
Data	Variable	Variable	(unchanged)
Templates	`template_offset`	`template_size`	(unchanged)
Cipher Domain	Variable	Variable	(unchanged)
Bark Table	`bark_offset`	`bark_size`	NEW: Sample index + packed 4-bit PCM data
Save	Variable	`save_size`	(unchanged)
Provenance Chain	`chain_offset`	Remaining	(unchanged)

Bark Table internal structure:

typedef struct {
    uint8_t  count;             /* Number of barks in table (max 16) */
    uint8_t  reserved[3];       /* Alignment padding */
    struct {
        char     label[8];      /* Bark identifier, e.g., "BREACH\0\0" */
        uint16_t offset;        /* Byte offset from start of PCM data region */
        uint16_t length;        /* Length in bytes (packed 4-bit, so samples = length * 2) */
        uint16_t sample_rate;   /* Playback rate in Hz (typically 8000) */
        uint16_t reserved;      /* Alignment */
    } entries[16];              /* Fixed 16-slot index */
} BarkTableHeader; /* 4 + (16 × 16) = 260 bytes */

/* Followed immediately by packed PCM data */
/* Byte layout: [high_nibble=sample_N | low_nibble=sample_N+1] */
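As an illustration of how a runtime might read this table, the sketch below restates the layout with a named entry type so it compiles on its own, checks the 260-byte size at compile time, and shows label lookup and nibble extraction. `BarkEntry`, `BarkTable`, `bark_find`, and `bark_sample_at` are hypothetical names, and the lookup assumes labels are zero-padded to 8 bytes as in the "BREACH\0\0" example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Restated from the layout above so the sketch is self-contained. */
typedef struct {
    char     label[8];
    uint16_t offset;
    uint16_t length;
    uint16_t sample_rate;
    uint16_t reserved;
} BarkEntry;

typedef struct {
    uint8_t   count;
    uint8_t   reserved[3];
    BarkEntry entries[16];
} BarkTable;

static_assert(sizeof(BarkTable) == 260, "bark index must be 260 bytes");

/* Find a bark by label; returns its slot index or -1 if absent. */
static int bark_find(const BarkTable *t, const char *label)
{
    size_t n = strlen(label);
    char padded[8] = {0};
    if (n > sizeof padded) return -1;
    memcpy(padded, label, n);           /* zero-pad to the stored width */
    for (int i = 0; i < t->count && i < 16; i++) {
        if (memcmp(t->entries[i].label, padded, sizeof padded) == 0) return i;
    }
    return -1;
}

/* Fetch sample n from packed data: even samples sit in the high nibble. */
static uint8_t bark_sample_at(const uint8_t *pcm, uint32_t n)
{
    uint8_t byte = pcm[n / 2];
    return (n % 2 == 0) ? (uint8_t)(byte >> 4) : (uint8_t)(byte & 0x0F);
}
```

Comparing all 8 bytes with memcmp means a full-width label with no terminating NUL still matches correctly.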

2D. CartridgeHeader Changes

Add two fields to the header, using 8 bytes from the 16-byte reserved region:

    uint32_t bark_offset;        /* Byte offset where bark table begins (0 = no barks) */
    uint32_t bark_size;          /* Size of bark table + PCM data */
    uint8_t  reserved[8];        /* Reduced from 16 to 8 bytes */

A bark_offset of 0 means the cartridge has no barks. Firmware and runtime treat this as a clean opt-out — no behavioral change for existing cartridges.
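One way a loader could apply the opt-out rule is sketched here. It takes the two header fields as plain arguments because the rest of CartridgeHeader is not reproduced in this addendum; `cart_has_barks` and the bounds checks against the image size are assumptions, not existing loader behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BARK_INDEX_BYTES 260u  /* size of the fixed 16-slot index */

/* True if the cartridge declares a usable bark table.
 * bark_offset == 0 is the documented "no barks" opt-out. */
static bool cart_has_barks(uint32_t bark_offset, uint32_t bark_size,
                           uint32_t image_size)
{
    if (bark_offset == 0) return false;              /* legacy or bark-free cart */
    if (bark_size < BARK_INDEX_BYTES) return false;  /* too small to hold the index */
    if (bark_offset > image_size) return false;
    if (bark_size > image_size - bark_offset) return false;  /* overflow-safe bound */
    return true;
}
```

Existing cartridges carry zeros in the reserved bytes, so they read back as bark_offset == 0 and take the first branch.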

2E. Memory Budget

Barks per cart	Duration each	Flash cost	RAM cost (playback buffer)
8 barks	0.5 sec avg	~16KB + 260B index = ~16.5KB	0 (stream directly from flash)
12 barks	0.75 sec avg	~36KB + 260B index = ~36.5KB	0
16 barks	1.0 sec avg	~64KB + 260B index = ~64.5KB	0

The typical target is 8–12 barks per cartridge at 0.3–0.7 seconds each: roughly 10–34KB per module (8 × 0.3 s × 4KB/s ≈ 9.6KB at the low end, 12 × 0.7 s × 4KB/s ≈ 33.6KB at the high end). On the Pi Zero 2 W device, bark data streams from the cartridge file on the SD filesystem, and at this size it is trivially small against 512 MB RAM and gigabytes of SD storage. No special buffering strategy is required; the emulator path is identical.
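The budget figures can be reproduced with a few lines of arithmetic. This is a sketch only: `bark_pcm_bytes` and `bark_table_bytes` are hypothetical helpers, and equal-length barks are assumed for simplicity.

```c
#include <assert.h>
#include <stdint.h>

#define BARK_INDEX_BYTES 260u  /* fixed index size from section 2C */

/* Bytes of packed 4-bit PCM for one bark: two samples per byte, rounded up. */
static uint32_t bark_pcm_bytes(uint32_t sample_rate_hz, uint32_t duration_ms)
{
    uint32_t samples = (sample_rate_hz * duration_ms) / 1000u;
    return (samples + 1u) / 2u;
}

/* Total flash cost of a table holding `count` barks of equal length. */
static uint32_t bark_table_bytes(uint32_t count, uint32_t sample_rate_hz,
                                 uint32_t duration_ms)
{
    assert(count <= 16u);
    return BARK_INDEX_BYTES + count * bark_pcm_bytes(sample_rate_hz, duration_ms);
}
```

For example, `bark_table_bytes(8, 8000, 500)` gives 16,260 bytes, matching the first row of the table.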

3. Software Interface

3A. Stdlib Addition (`nosh_stdlib.h`)

/* ---- Voice Bark Playback ---- */

/* Play a bark by label. Returns false if label not found or no bark table. */
bool stdlib_bark_play(SystemState *state, const char *label);

/* Play a bark by index (0-15). Returns false if index is out of range or the cartridge has no bark table. */
bool stdlib_bark_play_index(SystemState *state, uint8_t index);

/* Stop any currently playing bark immediately. */
void stdlib_bark_stop(SystemState *state);

/* Is a bark currently playing? */
bool stdlib_bark_active(SystemState *state);

3B. Runtime State Addition (`types.h`)

/* PCM bark playback state (inside RuntimeState) */
typedef struct {
    const uint8_t *pcm_data;     /* Pointer to packed 4-bit PCM data */
    uint32_t       pcm_length;   /* Total bytes of PCM data */
    uint32_t       pcm_position; /* Current byte position */
    uint16_t       sample_rate;  /* Playback sample rate */
    uint16_t       rate_counter; /* Fractional counter for sample rate conversion */
    bool           nibble_high;  /* true = read high nibble next, false = low */
    bool           active;       /* true = bark is playing */
} BarkPlayback;

3C. PSG Integration (`psg.c` / `sound.c`)

In psg_sample(), when bark_playback.active is true:

Advance rate_counter by the bark’s sample_rate. When it reaches or exceeds PSG_SAMPLE_RATE (44100), consume one nibble and subtract PSG_SAMPLE_RATE.
Write the nibble value (0–15) directly to Channel C’s amplitude, bypassing the tone/noise/envelope path for that channel.
When pcm_position >= pcm_length, set active = false and restore Channel C to normal PSG operation.

Channels A and B are completely unaffected during bark playback. A cartridge can layer a tone or drone underneath the bark.
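The three steps above can be sketched as a single per-sample tick function. This is an assumption about shape, not the actual psg_sample() code: `bark_tick` is a hypothetical helper, the struct is restated from section 3B, and the sketch assumes sample_rate does not exceed PSG_SAMPLE_RATE. The accumulator is widened to 32 bits for the addition, because a uint16_t sum would overflow for bark rates above about 21.4kHz.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PSG_SAMPLE_RATE 44100u

typedef struct {
    const uint8_t *pcm_data;
    uint32_t       pcm_length;
    uint32_t       pcm_position;
    uint16_t       sample_rate;
    uint16_t       rate_counter;
    bool           nibble_high;
    bool           active;
} BarkPlayback;

/* Call once per 44.1kHz output sample. Updates *amp_c when a new nibble
 * is due; returns false once the bark has finished (or was never active). */
static bool bark_tick(BarkPlayback *b, uint8_t *amp_c)
{
    if (!b->active) return false;

    uint32_t acc = (uint32_t)b->rate_counter + b->sample_rate;
    if (acc >= PSG_SAMPLE_RATE) {
        acc -= PSG_SAMPLE_RATE;
        if (b->pcm_position >= b->pcm_length) {
            b->active = false;          /* caller restores Channel C here */
            return false;
        }
        uint8_t byte = b->pcm_data[b->pcm_position];
        if (b->nibble_high) {
            *amp_c = (uint8_t)(byte >> 4);
            b->nibble_high = false;
        } else {
            *amp_c = (uint8_t)(byte & 0x0F);
            b->nibble_high = true;
            b->pcm_position++;          /* both nibbles of this byte consumed */
        }
    }
    b->rate_counter = (uint16_t)acc;
    return true;
}
```

Because the end-of-data check runs only when the next nibble falls due, the final sample is held for its full duration before Channel C is handed back.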

3D. Cartridge Authoring (`nosh_cart.h`)

/* Declare bark table in cartridge init */
CART_BARKS {
    BARK("BREACH",  breach_pcm,  sizeof(breach_pcm),  8000),
    BARK("CLEAN",   clean_pcm,   sizeof(clean_pcm),   8000),
    BARK("BURNED",  burned_pcm,  sizeof(burned_pcm),  8000),
    BARK("TRACED",  traced_pcm,  sizeof(traced_pcm),  8000),
}

/* In a cell handler: */
CELL_ON_EVAL(network_node) {
    if (compromised) {
        stdlib_bark_play(g_state, "CLEAN");
        /* Text still displays simultaneously */
        nosh_print(g_state, 0, 12, "> NETWORK COMPROMISED. NO TRACE.");
    }
}

The BARK() macro references a const uint8_t[] array compiled into the cartridge’s data section. A build tool (kn86bark) converts WAV files to packed 4-bit PCM during the cartridge build process.

4. Content Design

4A. Bark Vocabulary per Module

Each cartridge gets a domain-specific bark palette. These should be single words or two-word phrases, recorded with an authoritative, clipped delivery — like a military radio operator or an air traffic controller. Not conversational. Not friendly. Functional.

Module	Proposed Barks	Trigger Context
ICE Breaker	BREACH, CLEAN, BURNED, LOCKED, OPEN, TRACE, EXIT	ICE detection, extraction success/fail, node state changes
Depthcharge	CONTACT, DEPTH, SURFACE, LAUNCH, HIT, MISS, CLEAR	Sonar events, depth charge outcomes
Black Ledger	FRAUD, TRACED, VOID, FLAGGED, CLEAN, AUDIT	Transaction analysis results, audit outcomes
NeonGrid	PATROL, CLEAR, BLOCKED, ROUTE, BREACH	Guard detection, path validation
Cipher Garden	DECRYPT, LOCKED, KEY, MATCH, FAIL	Cipher operations, key verification
nOSh (firmware)	READY, LINK, SWAP, COMPLETE	Boot, cartridge swap, mission completion

4B. Recording Guidelines

Source: Record at 44.1kHz/16-bit WAV, then downsample to 8kHz/4-bit via build tool.
Delivery: Short, barked, declarative. No sibilance (S sounds are mud at 4-bit). Prefer hard consonants: B, D, K, T, CH. Avoid words starting with F, S, TH.
Processing: Heavy compression (limit dynamic range to fit 4-bit), slight distortion/bitcrush before final encode to lean into the lo-fi aesthetic rather than fighting it.
Duration target: 0.3–0.7 seconds per bark. Anything over 0.7 seconds should be cut, and nothing may exceed the 1.0-second cap from section 2B. The bark should feel like a punch, not a sentence.
Tone: Not robotic. Not text-to-speech. A real human voice, crushed through a 4-bit codec, arriving through a 28mm speaker. The degradation is the aesthetic.

4C. Interaction with Existing Audio

Barks occupy Channel C only. The design rules:

Barks preempt Channel C tones. If Channel C was playing a tone, the bark takes over. Channel C tones resume after bark completes (or the cartridge re-triggers them).
Channels A + B continue normally. A sustained alarm drone, a background rhythm, a keyclick SFX — all uninterrupted.
One bark at a time. Triggering a new bark while one is playing stops the current bark and starts the new one. No queuing, no overlapping.
No barks during LAMBDA playback. Macro replay should be silent — barks during fast replay would be cacophonous.
SYS hold abort stops all barks. The emergency exit silences everything.
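These rules amount to a small arbitration policy. The sketch below is one hypothetical way to express it; `BarkArbiter` and its fields are illustrative names only, and it models just the Channel C amplitude rather than the full tone state a real implementation would need to save.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical arbitration state; field names are illustrative only. */
typedef struct {
    bool    lambda_replay;  /* true while a LAMBDA macro is replaying */
    int     current_bark;   /* slot being played, or -1 when idle */
    uint8_t amp_c;          /* live Channel C amplitude value */
    uint8_t saved_amp_c;    /* Channel C amplitude to restore afterwards */
} BarkArbiter;

/* Request a bark. Returns false when the rules suppress it. */
static bool bark_request(BarkArbiter *a, int slot)
{
    if (a->lambda_replay) return false;     /* silent during macro replay */
    if (a->current_bark < 0) {
        a->saved_amp_c = a->amp_c;          /* preempting a tone: remember it */
    }
    a->current_bark = slot;                 /* replaces any bark in flight */
    return true;
}

/* Bark finished, or SYS hold abort: hand Channel C back. */
static void bark_release(BarkArbiter *a)
{
    if (a->current_bark < 0) return;
    a->current_bark = -1;
    a->amp_c = a->saved_amp_c;
}
```

The saved amplitude is captured only when going from idle to playing, so a bark that interrupts another bark still restores the original tone level.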

5. Build Tooling

5A. `kn86bark` — WAV to Bark Converter

A command-line tool that converts WAV files to packed 4-bit PCM:

kn86bark input.wav output.pcm [--rate 8000] [--normalize] [--preview]

Reads any WAV format (via dr_wav or similar single-header library)
Resamples to target rate (default 8000 Hz)
Normalizes peak to 4-bit range (0–15)
Applies optional dither before quantization
Outputs packed nibble format (high nibble first)
--preview flag plays back through SDL audio for quick listening
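The quantize-and-pack stage at the end of that pipeline might look like the following sketch. It assumes the input has already been resampled and normalized to signed 16-bit, uses plain linear truncation with no dither, and pads an odd trailing sample with level 0; `bark_quantize` and `bark_pack` are hypothetical names. A real tool may also want to account for the YM2149's amplitude steps being roughly logarithmic rather than linear.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a signed 16-bit sample to an unsigned 4-bit level (0..15):
 * bias to unsigned, then keep the top four bits. */
static uint8_t bark_quantize(int16_t s)
{
    uint16_t u = (uint16_t)((int32_t)s + 32768);   /* 0..65535 */
    return (uint8_t)(u >> 12);
}

/* Pack n samples two per byte, high nibble first. `out` must hold
 * (n + 1) / 2 bytes. Returns the number of bytes written. */
static size_t bark_pack(const int16_t *in, size_t n, uint8_t *out)
{
    size_t bytes = (n + 1) / 2;
    for (size_t i = 0; i < bytes; i++) {
        uint8_t hi = bark_quantize(in[2 * i]);
        uint8_t lo = (2 * i + 1 < n) ? bark_quantize(in[2 * i + 1]) : 0;
        out[i] = (uint8_t)((hi << 4) | lo);
    }
    return bytes;
}
```

With this mapping, digital silence (sample value 0) lands on level 8, the midpoint of the 4-bit range.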

5B. Cartridge Build Integration

The cartridge CMakeLists.txt gains a step that converts WAV assets to generated packed-PCM headers:

# Convert bark WAVs to packed PCM headers
kn86bark_convert(
    BARKS
        assets/barks/breach.wav
        assets/barks/clean.wav
        assets/barks/burned.wav
    OUTPUT_DIR ${CMAKE_CURRENT_BINARY_DIR}/barks
    SAMPLE_RATE 8000
)

This generates breach_pcm.h, clean_pcm.h, etc. — each containing a const uint8_t[] array that the cartridge source includes.

6. Questions for Agent Review

For Embedded Systems Agent:

Channel C hijack timing: In the emulator and device, the audio callback runs at 44.1 kHz and calls psg_sample() per sample. Confirm we can inject amplitude values at 8 kHz into Channel C’s path without introducing clicks/pops at bark start/stop boundaries.
Bark file layout: Barks live inside the cartridge image on SD, in the Bark Table region. Confirm the loader mmap/seek pattern is fast enough for in-mission bark triggering without a stall.

For C Engineer Agent:

psg_sample() integration: The current function is clean — tone/noise/envelope per channel, mixed and output. Adding a “Channel C override” path needs to be zero-cost when no bark is active (branch prediction should handle this, but confirm).
Cartridge header migration: Adding bark_offset and bark_size consumes 8 bytes of reserved. Existing .kn86 files have reserved[16] with zeros — will the loader handle both v2 (no barks) and v2.1 (with barks) headers cleanly?
Thread safety in audio_callback: The SDL audio callback runs on a separate thread. bark_playback state is written by the main thread (when stdlib_bark_play is called) and read by the audio thread. What synchronization is needed? An atomic flag? A lock? Or is the existing single-writer/single-reader pattern safe enough with a memory barrier?
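As input to the synchronization question, one candidate pattern is a one-slot mailbox built on C11 atomics: the main thread posts a request, and the audio thread copies it into playback state that only it touches. This is a sketch under assumptions, not a decision; `BarkRequest`, `BarkMailbox`, `bark_post`, and `bark_take` are hypothetical names, and exactly one writer thread and one reader thread are assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    const uint8_t *pcm_data;
    uint32_t       pcm_length;
    uint16_t       sample_rate;
} BarkRequest;

typedef struct {
    BarkRequest request;  /* written by the main thread only while !pending */
    atomic_bool pending;  /* true = request is ready for the audio thread */
} BarkMailbox;

/* Main thread. Returns false if the previous request has not yet been
 * picked up, in which case the caller may retry on a later frame. */
static bool bark_post(BarkMailbox *m, BarkRequest r)
{
    if (atomic_load_explicit(&m->pending, memory_order_acquire)) return false;
    m->request = r;
    atomic_store_explicit(&m->pending, true, memory_order_release);
    return true;
}

/* Audio thread, at the top of the callback. Copies a pending request
 * into *out and frees the slot; returns false if nothing is waiting. */
static bool bark_take(BarkMailbox *m, BarkRequest *out)
{
    if (!atomic_load_explicit(&m->pending, memory_order_acquire)) return false;
    *out = m->request;
    atomic_store_explicit(&m->pending, false, memory_order_release);
    return true;
}
```

The inactive path costs the audio thread one atomic load per callback, and the playback position never crosses threads, so no lock is needed in this arrangement.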

For Gameplay Design Agent:

Bark frequency per session: How many barks per 30-minute session feels right before the novelty wears off? Propose a “bark budget” per mission type (e.g., max 3 barks per single-phase contract, max 6 per multi-phase campaign).
Bark selection determinism: Should bark choice be LFSR-driven (same seed = same bark at same moment) or event-driven (always play “BREACH” on ICE detection regardless of seed)? The former supports the deterministic philosophy; the latter is more intuitive for authors.
Bare deck barks: Should the nOSh runtime have its own bark table for boot, cartridge swap, and mission board events? Or should barks be cartridge-only?

For QA Agent:

Audio quality acceptance criteria: What’s the minimum intelligibility standard? Can a first-time listener identify the word without seeing the text? Or is “recognizable after hearing it once with text” sufficient (the Double Dribble standard)?
Regression risk: Bark playback touches psg_sample(), which is called 44,100 times per second in the audio callback. What test coverage do we need to ensure existing PSG behavior (tones, noise, envelope) is unaffected when barks are not active?
Cross-platform parity: Bark playback must sound identical on emulator, prototype, and production. What’s the verification strategy?

7. Risk Assessment

Risk	Likelihood	Impact	Mitigation
4-bit speech unintelligible	Medium	High	Record test barks early. If words aren’t recognizable, raise the sample rate above the 8kHz default or investigate ADPCM compression for more effective bit depth. Double Dribble proves the concept works, but our speaker/amp chain differs from NES hardware.
Channel C contention with cartridge audio	Low	Medium	Document that Channel C is reserved during bark playback. Cartridge audio design should use A+B for sustained tones and C for transient SFX that can be interrupted.
Flash budget pressure	Low	Low	Roughly 10–34KB per cartridge for barks is modest. Monitor during cartridge development. If flash becomes tight, reduce bark count or duration per module.
Bark overuse kills impact	Medium	Medium	Enforce bark budget in design reviews. Gameplay Design agent owns bark trigger criteria. QA agent validates frequency in playtesting.
Audio thread synchronization bugs	Low	High	Use atomic operations for bark state transitions. Keep the critical section minimal: one atomic flag read per `psg_sample()` call when inactive.
Scope creep toward full speech	Low	Medium	This spec explicitly caps barks at 1.0 second and 16 per cartridge. Longer speech, streaming playback, or multi-bark queuing are out of scope. The constraint is the feature.

8. Success Criteria

A recorded “BREACH” bark, played through the emulator’s PSG at 4-bit/8kHz via Channel C amplitude override, is recognizable as the word “breach” to a listener who has heard it once with accompanying text.
During bark playback, Channels A and B continue producing tones and noise without audible artifacts (clicks, pops, pitch glitches).
Bark playback adds zero measurable overhead to psg_sample() when no bark is active.
The kn86bark build tool converts a 44.1kHz WAV to packed 4-bit PCM and the round-trip (record → convert → play in emulator) is completable in under 5 minutes.
ICE Breaker’s on_eval handler can trigger a bark with a single stdlib_bark_play(g_state, "CLEAN") call — no direct PSG register manipulation needed by cartridge authors.