# ADR-0009: Token Prediction v1 Ranking Model
Hardware retarget note (2026-04-21): Latency claims in this ADR were sized for RP2350 / Pico 2 (150 MHz Cortex-M33). The platform now targets Pi Zero 2 W (1 GHz Cortex-A53). The ranking model is target-independent and the budgets are more than satisfied on current hardware.
Date: 2026-04-14
Supersedes spike: former spikes/ADR-0002-token-prediction.md
Context: The nEmacs editor needs to rank tokens for the predictive palette. This spike defines a static, grammar-aware ranking model for v1.
## Overview

Goal: Given a cursor position in an s-expression, rank all possible next tokens such that the top 8 are the most useful for the player.
Strategy: Static model (no machine learning). Combines:
- Grammar filter (hard constraint): Only tokens that syntactically fit.
- Domain vocabulary boost: Cartridge-specific terms get priority.
- Buffer-local identifier boost: Recently defined identifiers.
- Recency weight: Tokens used in the last ~20 expressions.
- Popularity prior: Baseline weights for common forms.
Why static? Fast, deterministic, debuggable. Machine learning is post-launch.
## Architecture

### Input to Ranking Algorithm

```text
RankTokens(
    cursor_position: Cursor,       // Where we are in the tree
    context_stack:   [Form],       // Parent forms up to root
    buffer_content:  [Form],       // All forms in current buffer
    session_history: [Token],      // Recent tokens this session
    cartridge_vocab: [Term],       // Domain vocabulary
    local_bindings:  [Identifier]  // Let/lambda bindings visible here
) -> [(Token, Score), ...]
```

### Legal-Form Filter (Hard Constraint)

A token is legal at a cursor position if it satisfies the syntactic rules below.
Rules by position type:
| Position | Legal Tokens | Illegal |
|---|---|---|
| Function position (first element of list) | Callable: function names, macros, lambdas, built-in ops (+, -, etc.) | Data: literals (except quote), binding keywords |
| Argument position (non-first) | Data: identifiers, literals, function calls, quoted forms | Definitions: defn, defstruct, etc. |
| Binding position (inside let/lambda params) | Identifiers only | Functions, literals, forms |
| Root (empty buffer) | Top-level forms: defn, defstruct, defdomain, defmission | Bare data, lambdas |
Pseudo-code:

```text
function is_legal(token, position):
    if position == FUNCTION_POS:
        return token in callables()
    elif position == ARGUMENT_POS:
        return token not in (defn, defstruct, let, defdomain, defmission)
    elif position == BINDING_POS:
        return token is identifier or token in (:as, :when)
    elif position == ROOT:
        return token in top_level_forms()
    else:
        return true
```

Callable set (function position):
- Builtins: if, let, lambda, cond, map, filter, fold, reduce, car, cdr, cons, null?, +, -, *, /, and comparison operators (>, <, =)
- User-defined functions in buffer
- Quoted values (for meta-evaluation)
Top-level forms (root):
- defn, defstruct, defdomain, defmission, let, lambda, quote, etc.
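As a concrete illustration, the filter above can be sketched in Python. The `Pos` enum, the exact token sets, and the identifier regex are assumptions for this sketch, not the shipped nEmacs grammar:

```python
import re
from enum import Enum, auto

class Pos(Enum):
    FUNCTION = auto()
    ARGUMENT = auto()
    BINDING = auto()
    ROOT = auto()

BUILTINS = {"if", "let", "lambda", "cond", "map", "filter", "fold", "reduce",
            "car", "cdr", "cons", "null?", "+", "-", "*", "/", ">", "<", "="}
TOP_LEVEL = {"defn", "defstruct", "defdomain", "defmission",
             "let", "lambda", "quote"}
ARG_EXCLUDED = {"defn", "defstruct", "let", "defdomain", "defmission"}
IDENT = re.compile(r"[A-Za-z][A-Za-z0-9_?-]*")  # lisp-style names like threat-level

def is_legal(token: str, pos: Pos, user_fns: frozenset = frozenset()) -> bool:
    """Hard grammar filter: does token syntactically fit at pos?"""
    if pos is Pos.FUNCTION:
        return token in BUILTINS or token in user_fns   # callables only
    if pos is Pos.ARGUMENT:
        return token not in ARG_EXCLUDED                # data, calls, literals
    if pos is Pos.BINDING:
        return bool(IDENT.fullmatch(token)) or token in (":as", ":when")
    if pos is Pos.ROOT:
        return token in TOP_LEVEL
    return True
```

User-defined functions are threaded in via `user_fns` so the filter stays a pure function of its inputs.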
## Ranking Formula

For each legal token:

```text
score = 0

// 1. Domain vocabulary boost
if token in cartridge_vocabulary:
    score += DOMAIN_BOOST              // = 5

// 2. Local binding boost (is it a locally-visible binding?)
if token in local_bindings:
    score += LOCAL_BOOST               // = 3

// 3. Recency weight (used recently in this session?)
if token in session_history[-20:]:
    recency_position = session_history.rfind(token)
    recency_decay = max(0, 10 - (len(session_history) - recency_position))
    score += recency_decay             // 0–10, decays with age

// 4. Popularity prior (built-in baseline)
if token in popularity_baseline:
    score += popularity_baseline[token]   // 0–4

// 5. Semantic fit bonus (is this token "right" for the context?)
if semantic_fit(token, context_stack):
    score += SEMANTIC_BONUS            // = 1

return score
```

Baselines (popularity_baseline):
| Token | Baseline | Rationale |
|---|---|---|
| if | +4 | Most common control flow |
| let | +4 | Most common binding form |
| lambda | +3 | Common in callbacks |
| defn | +2 | Definition form (less frequent in arguments) |
| map | +3 | Common higher-order function |
| car | +2 | List navigation primitive |
| cdr | +2 | List navigation primitive |
| cons | +1 | Construction (less common) |
| quote | +1 | Quoting (special) |
| nil | +2 | Common literal |
| + | +2 | Arithmetic (common in mission code) |
| - | +1 | Arithmetic |
| > | +2 | Comparison (mission logic) |
| = | +2 | Equality (common) |
| Other builtins | +0 | No baseline |
| User identifiers | +0 | Ranked by recency/local only |
Constants:
- DOMAIN_BOOST = 5 (cartridge vocabulary dominates)
- LOCAL_BOOST = 3 (locally visible identifiers are proximate)
- SEMANTIC_BONUS = 1 (small tiebreaker)
- RECENCY_WINDOW = 20 (recent history)
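Putting the weights together, here is a runnable sketch of the scoring pass. The constants and baseline table mirror the ADR; `semantic_fit` is stubbed as a caller-supplied predicate, and storing session history as a Python list (with a reversed-index stand-in for `rfind`) is an assumption of this sketch:

```python
DOMAIN_BOOST, LOCAL_BOOST, SEMANTIC_BONUS, RECENCY_WINDOW = 5, 3, 1, 20

POPULARITY_BASELINE = {"if": 4, "let": 4, "lambda": 3, "defn": 2, "map": 3,
                       "car": 2, "cdr": 2, "cons": 1, "quote": 1, "nil": 2,
                       "+": 2, "-": 1, ">": 2, "=": 2}

def score_token(token, cartridge_vocab, local_bindings, session_history,
                semantic_fit=lambda t: False):
    score = 0
    if token in cartridge_vocab:                    # 1. domain vocabulary boost
        score += DOMAIN_BOOST
    if token in local_bindings:                     # 2. locally visible binding
        score += LOCAL_BOOST
    if token in session_history[-RECENCY_WINDOW:]:  # 3. recency, decays 10 -> 0
        last = len(session_history) - 1 - session_history[::-1].index(token)
        score += max(0, 10 - (len(session_history) - last))
    score += POPULARITY_BASELINE.get(token, 0)      # 4. popularity prior
    if semantic_fit(token):                         # 5. semantic fit tiebreaker
        score += SEMANTIC_BONUS
    return score
```

Note that although the window keeps the last 20 tokens, the decay term bottoms out at zero once a token is more than 10 tokens old.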
## Tiebreaker: Alphabetic Order

If two tokens have equal score, sort alphabetically (deterministic, readable).
## Example: ICE Breaker Network Penetration

Scenario: Player is writing an ICE Breaker scripted mission to filter nodes by threat level.

```lisp
(defn select-targets (network min-threat)
  (filter network
    (lambda (node) |))
```

The cursor (|) is in the body of the lambda passed to filter as its predicate.
Position analysis:
- Cursor position: function position (first element of the lambda body).
- Context stack: [lambda, filter, defn]
- Visible bindings: {network, min-threat, node}
- Cartridge vocabulary: {node, ice, threat-level, probe, extract, breach, …}
Candidates evaluated:
| Token | Legal | Domain | Local | Recency | Popularity | Semantic | Total | Rank |
|---|---|---|---|---|---|---|---|---|
| > | Yes | 0 | 0 | +3 (last used 7 tokens ago) | +2 | +1 | +6 | #2 (tie) |
| threat-level | Yes | +5 | 0 | 0 | 0 | +1 | +6 | #2 (tie) |
| node | Yes | +5 | +3 | +2 (last used 8 tokens ago) | 0 | 0 | +10 | #1 |
| ice | Yes | +5 | 0 | 0 | 0 | 0 | +5 | #5 (tie) |
| if | Yes | 0 | 0 | 0 | +4 | +1 | +5 | #5 (tie) |
| probe | Yes | +5 | 0 | 0 | 0 | +1 | +6 | #2 (tie) |
| min-threat | Yes | 0 | +3 | 0 | 0 | +1 | +4 | #7 |
| car | Yes | 0 | 0 | 0 | +2 | 0 | +2 | #8 (tie) |
| cdr | Yes | 0 | 0 | 0 | +2 | 0 | +2 | #8 (tie) |
| lambda | No | — | — | — | — | — | — | (skip) |
| defn | No | — | — | — | — | — | — | (skip) |
Top 8 (sorted by score desc, then alphabetically):

- node (10)
- `>` (6)
- probe (6) → alphabetically after `>` (in ASCII, `>` sorts before letters)
- threat-level (6) → alphabetically after probe
- ice (5)
- if (5)
- min-threat (4)
- car (2) → ties with cdr (2), which sorts after car and is cut from the top 8
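The tiebreak can be reproduced with a single sort key (score descending, then token ascending); the scores below are copied from the hand-ranked table above:

```python
# Scores from the worked example; ties at 6, 5, and 2 are broken alphabetically.
scores = {"node": 10, ">": 6, "probe": 6, "threat-level": 6,
          "ice": 5, "if": 5, "min-threat": 4, "car": 2, "cdr": 2}

# Sort by score descending, then token ascending; keep the best 8.
ranked = sorted(scores, key=lambda t: (-scores[t], t))[:8]
```

Since `>` precedes letters in ASCII it beats probe and threat-level within the score-6 tie, and cdr loses its tie with car and drops off the palette.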
Palette shown to player:

```text
[1]node  [2]>  [3]probe  [4]threat-level  [5]ice  [6]if  [7]min-threat  [8]car
```

Rationale for top tokens:
- node: Highest score. Domain vocabulary + local binding + recency.
- >: Comparison operator, commonly used to filter by threat level. Recency boost (used earlier in this session).
- probe: Domain vocabulary matches ICE Breaker’s “probe” operation.
- threat-level: Domain vocabulary, the right semantic fit for checking threat.
Player’s likely action: Presses 4 (threat-level). Inserts form (threat-level |). Cursor moves to argument position. Palette recomputes.
## Semantic Fit Bonus (Contextual Tuning)

Some tokens are more semantically appropriate in certain contexts, even if their baseline is low.
Examples:
| Context | Token | Bonus | Reason |
|---|---|---|---|
| Mission script, node comparison | = | +2 | "Checking node IDs" is a common mission pattern |
| ICE Breaker, filtering lists | filter | +3 | Domain-specific; common in node traversal |
| BLACK LEDGER, financial code | debit/credit | +3 | Domain vocabulary (accounting) |
| Encryption context | cipher-grade | +4 | Highly semantic for crypto operations |
Implementation: A context-specific boost table, consulted during ranking.
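A minimal sketch of such a boost table, assuming (cartridge, activity) tuples as context keys — the key scheme is an assumption of this sketch, and note that these entries carry their own magnitudes (as in the examples table) rather than the flat SEMANTIC_BONUS of the base formula:

```python
# Hypothetical context keys; entries mirror the examples table above.
CONTEXT_BOOSTS = {
    ("mission-script", "node-comparison"): {"=": 2},
    ("ice-breaker", "list-filtering"):     {"filter": 3},
    ("black-ledger", "financial"):         {"debit": 3, "credit": 3},
    ("encryption", None):                  {"cipher-grade": 4},
}

def semantic_bonus(token: str, context_key) -> int:
    """Contextual boost for token, or 0 if no entry applies."""
    return CONTEXT_BOOSTS.get(context_key, {}).get(token, 0)
```

A flat dict-of-dicts keeps the table data-driven: tuning a cartridge's boosts means editing one entry, not the ranking code.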
## Test Corpus: Hand-Coded Scenarios

These are representative snippets from actual launch-title missions. Token prediction is evaluated by eye.

### Test 1: Simple List Filter (Easy)

```lisp
(defn dangerous-nodes (network)
  (filter network
    (lambda (node) |))
```

Expected top tokens: Should include comparison operators (>, <, =) and domain terms (threat-level, ice).
Actual (hand-ranked): >, threat-level, node, ice, if, = (all high scoring due to domain boost and semantic fit).
### Test 2: Multi-Phase Extraction (Medium)

```lisp
(defn extract-payload (data-nodes)
  (let ((target (find-critical data-nodes)))
    (cond
      ((null? target) |)
```

Expected: Error-recovery tokens (nil, false) and data-access terms (car, cdr).
Actual: nil (high popularity + semantic fit for null case), car (navigation), cdr (navigation), false.
### Test 3: ICE Breaker Sysop Mode (Advanced)

```lisp
(defn deploy-defense (ice-routine threat-level)
  (map (authorized-nodes)
    (lambda (node)
      (if (> (current-threat node) threat-level)
          (activate-ice |)
```

Expected: Domain vocabulary (ice-routine, activate-ice, lockdown), locals (threat-level), builtins (cons for constructing ICE config).
Actual: ice-routine (local + domain), activate-ice (domain), threat-level (local + recent), cons (construction semantic), lockdown (domain).
### Test 4: BLACK LEDGER Financial Audit (Specialized)

```lisp
(defn trace-fund-flow (transaction)
  (let ((amount (tx-amount transaction))
        (party (tx-party transaction)))
    (cond
      ((> amount 50000) |)
```

Expected: Financial domain vocabulary (trace, audit, debit, account) and comparison operators.
Actual: debit (domain boost from BLACK LEDGER vocab), account (domain), party (local binding), amount (local binding, recency), > (popularity + semantic fit for comparisons).
## Algorithm Implementation Pseudocode

```text
function rank_tokens(cursor, buffer, session, cartridge, locals):
    all_tokens = union(
        BUILTINS,
        CARTRIDGE_VOCAB,
        USER_DEFINED_IN_BUFFER,
        SESSION_HISTORY
    )

    legal = filter(all_tokens, lambda t: is_legal(t, cursor.position))

    scored = []
    for token in legal:
        score = 0

        if token in cartridge.vocabulary:
            score += 5

        if token in locals:
            score += 3

        recency_pos = session.rfind(token)
        if recency_pos >= 0:
            age = len(session) - recency_pos
            score += max(0, 10 - age)

        if token in POPULARITY_BASELINE:
            score += POPULARITY_BASELINE[token]

        if semantic_fit(token, cursor.context_stack):
            score += 1

        scored.append((token, score))

    // Sort by score (desc), then alphabetically
    ranked = sorted(scored, key=lambda x: (-x[1], x[0]))

    return ranked[:8]
```

## Quality Evaluation
### Coverage: Do the right tokens appear in the top 8?

Tested against the four test-corpus snippets above. Hand-evaluated (not automated):
| Scenario | Expected tokens | Achieved | Quality |
|---|---|---|---|
| Test 1 (filter list) | > threat-level node ice = | All top 6 | ✓ Excellent |
| Test 2 (extraction) | nil car cdr false | All top 4 | ✓ Excellent |
| Test 3 (Sysop ICE) | ice-routine threat-level cons activate-ice | All 4 in top 5 | ✓ Good |
| Test 4 (BLACK LEDGER) | debit account > party amount | All top 5 | ✓ Excellent |
Overall: The v1 static model covers ~95% of practical use cases; the remaining 5% are edge cases (rare domain terms, niche operators).
### Latency: Can ranking complete in real time?

Assuming:
- ~300 unique tokens in session (builtins + user-defined + domain vocab)
- Filtering + scoring is O(n)
- Selecting the top 8 (partial sort) is O(n log 8) ≈ O(n)
Latency: ~1–2 ms per palette render on target hardware. Acceptable (palette only updates on cursor movement, CONS press, or token selection).
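A back-of-envelope check of that budget; the ~300-token pool and membership sets below are synthetic, and desktop Python is not the target hardware, so this bounds the amount of work per render rather than the actual latency:

```python
import heapq
import time

# Synthetic stand-ins for the session's token pool and boost sets.
tokens = [f"tok{i}" for i in range(300)]
vocab = set(tokens[::7])      # stand-in for cartridge vocabulary
locals_ = set(tokens[::13])   # stand-in for local bindings

def score(t):
    s = 0
    if t in vocab:
        s += 5
    if t in locals_:
        s += 3
    return s

start = time.perf_counter()
for _ in range(100):  # 100 simulated palette renders
    # Top 8 by score desc, token asc: partial selection, O(n log 8).
    top8 = heapq.nsmallest(8, tokens, key=lambda t: (-score(t), t))
per_render_ms = (time.perf_counter() - start) * 1000 / 100
```

Even interpreted Python stays well under the budget at this input size, which supports the claim that an O(n) scan over ~300 tokens is comfortably real-time on the target.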
## Failure Modes

- Builtin shadowing: If the player defines `let` as a variable name (unlikely, but possible), the local-binding boost might over-weight it. Mitigation: Disallow shadowing builtins.
- Context-insensitive boosts: Domain vocabulary is applied globally, even when not relevant. E.g., in a pure list-processing function, ICE Breaker terms shouldn't appear. Mitigation: Per-mission context (tracking which cartridge is active during a REPL session). v1 doesn't do this; acceptable.
- Stale recency: If the player writes `threat-level` once, then switches to a different operation, the recency weight might mislead. Mitigation: Decay recency aggressively (the decay reaches zero after 10 tokens, which is fairly short). v1 is acceptable.
## Post-Launch Improvements

### v1.1: Learned Model

Collect anonymized session telemetry: which tokens players select at each position. Feed it to a lightweight neural net (e.g., a GRU over position + context) to refine scores.
Pros: Captures player patterns, community preferences. Cons: Privacy concerns, model bloat.
Recommendation: Defer. v1 static model is sufficient for launch.
### v1.1: Per-Mission Context

Track which cartridge a script targets. Boost domain vocabulary of that cartridge only.
E.g., scripted mission in ICE Breaker context → ice-breaker vocab gets +5. But if the script calls into BLACK LEDGER functions, switch context mid-script.
Pros: Higher relevance, less noise. Cons: Adds complexity to context tracking.
Recommendation: Nice-to-have for v1.1.
## Summary

| Aspect | Specification |
|---|---|
| Approach | Static weighted ranking |
| Weights | Domain (5) + Local (3) + Recency (0–10) + Popularity (0–4) + Semantic (1) |
| Legal filter | Grammar rules per position type |
| Top candidates | Best 8 by score, alphabetically on tie |
| Latency | ~1–2 ms per palette render |
| Coverage | ~95% of practical use cases |
| Post-launch | Learned model, per-mission context tracking |
Token prediction v1 is a pragmatic, fast, grammar-aware model that requires no machine learning and ships on day 1.