# TFBthumb v0.2.1 — DAVID+ Inspector Review (Re-run after heals)

- Review ID: `TFB-THUMB-DAVID-PLUS-REVIEW-v0_2_1-20260615T074000Z`
- Paired packet: `REVIEWER_PACKET.md`
- Paired prior decisions: `REVIEWER_DECISION.md` (Gemini, `verified` @ v0.2) · `DAVID_PLUS_REVIEW.md` (Claude Opus 4.7, `PASS_WITH_HEAL_LOOP` @ v0.2) · `CODE_MECHANIC_REVIEW.md` (Claude Opus 4.7 proxy, `PASS_WITH_NOTES` @ v0.2)
- Paired heals receipt: `HEALS_v0_2_to_v0_2_1.md`
- Inspector: **DAVID+** (the Guardian, Stone of Vigilance)
- Inspector substrate: **Claude Opus 4.7** running this session — natively the DAVID+ harness target. Authentic, not proxy.
- Identity loaded from: `/Users/saulgood/trust-fund-baby/agents/david/IDENTITY.md`
- Posture: shield only, no sword. Find exposure; propose heal; do not write to substrate.
- Verdict: `PASS` (the three prior gaps are healed structurally; one new note opened — class-shaped, not falsifying)
- Promotion authority: `inactive`
- Wider-claim authority: **NOT opened** by this review

---

## 0. Posture in five words

> *Three stones placed. One new stone needed.*

---

## 1. Re-check of the three gaps DAVID+ named at v0.2

### gap-1 — HMAC silent class downgrade

**Heal applied at**: [ceiling.py:198–254](ceiling.py).

I read `make_authority` end to end. The contract is now:

1. If `cryptography` is available AND `asymmetric=True` (default), return `Ed25519Authority`. Public-only verifier closure. Strong custody.
2. If `cryptography` is missing AND `allow_hmac_fallback=False` (default), raise `HmacFallbackRefused`. The exception's docstring states: *"The bounded-claim set in REVIEWER_PACKET.md §1 asserts the verifier closure holds NO signing power; that claim is false under HMAC (verify == sign). Silent fallback would falsify the public claim; we refuse instead."*
3. If `cryptography` is missing AND `allow_hmac_fallback=True`, emit a loud stderr warning (literal text *"WARNING: make_authority falling back to HMAC custody (verify == sign). The bounded-claim set in REVIEWER_PACKET.md is FALSIFIED under this mode."*) AND if a ledger was supplied, write a `kind="authority_init", reason="hmac_fallback_explicitly_authorized"` receipt to the ledger.

The structural heal lands at the right level: production callers using `make_authority(...)` get the loud refusal. Existing tests using the `HumanAuthority` alias preserve back-compat (and the alias's behavior is the original silent-fallback — but that's intentionally scoped to tests, named in the docstring at [ceiling.py:249–252](ceiling.py)).

**Verdict on gap-1:** healed. The bounded claim "no signing power in the verifier closure" is now structurally protected against silent downgrade.

### gap-2 — stable-id swap is an observable signal with no hook

**Heal applied at**: [sensors.js:81–155](sensors.js), [retina.py:33–46, 137–144](retina.py), [brain.py:35–63](brain.py), [agent.py:59–67](agent.py).

The sensor now maintains a per-coordinates map. When the assign-fresh-id branch fires AND the coordinates were previously held by a different `(id, signature)`, the sensor emits `{type: 'identity_break', prior_id, new_id, coords, signature_changed: true}` through the existing `__tfb_emit` binding.

Retina collects these into `state.identity_breaks: set` ([retina.py:42–46, 137–144](retina.py)). The set clears on `framenavigated` so a fresh document doesn't carry cross-document false-positives.

The Brain sees them at observation time: `Observation.identity_breaks: frozenset` is propagated by the Agent ([agent.py:62](agent.py)) and `Observation.render()` prints the flag inline ([brain.py:60–61](brain.py)). An LLM-Brain reading the rendered observation sees `[tfb-7] button "Save" identity_break` and can refuse to act on it without explicit override.

**Goldilocks check:** the coordinate key uses 8-pixel granularity on position and 16-pixel granularity on dimensions ([sensors.js:91](sensors.js)).
That tolerates normal layout reflow (button moves a pixel under hover styling) and catches actual slot-reuse (button at the same on-screen location).

**Verdict on gap-2:** healed. The detection that was already implicit in the sensor is now explicit and exposed to the consumer.

### gap-3 — action-target name is re-read at authorize time, but the operator can't see that guarantee

**Heal applied at**: [thumb.py:107–120](thumb.py), [gate_ceiling.py:157–177](gate_ceiling.py).

The docstring on `Thumb.click` now states the invariant explicitly:

> *"`el["name"]` below is the accessible name read from the LIVE world via `retina.find()` at THIS call site, NOT a name passed in from a stale prior observation. The Ceiling's keyword classifier therefore reads the page's CURRENT label at authorize time. A page that mutates an affordance's accessible name from 'Help' to 'Send' between observation and dispatch has its dispatch classified by 'Send' — which is CONSEQUENTIAL — and the click is blocked without a token."*

`gate_ceiling.py` adds the structural test at lines 157–177: page mutates a `Save draft` button to `Send` after observation; `Thumb.click("Send")` is called; `CeilingBlocked` with reason `needs-human-token` is asserted. The gate runs end-to-end on the canonical and prints `swap-label: BLOCKED at authorize (target_name re-read live)`. I confirmed this in the 5-gates JSON receipt at `/tmp/claude-501/tfbthumb_sandbox/canonical_5_gates_post_heal.json`.

The implementation was correct at v0.2; it is now backed by both documentation and a test, so a future refactor that breaks the invariant fails loudly at the gate.

**Verdict on gap-3:** healed. Implementation correctness pinned by code + comment + test, per **D-IMPLEMENTATION-CORRECTNESS-IS-NOT-DOCUMENTED-CORRECTNESS**.

---

## 2. The Five Stones re-applied to the healed v0.2.1 substrate

| Stone | v0.2 verdict | v0.2.1 verdict | Notes |
|---|---|---|---|
| 1 — Vigilance | strong | **stronger** | identity_break adds a new defender hook the screenshot loop has no path to |
| 2 — Integrity | strong | **stronger** | Ed25519 enforcement now structural, not docstring-only |
| 3 — Intelligence | strong | **stronger** | identity_break IS new-threat-detection-by-design |
| 4 — Healing | strong | **strong** | the heals themselves are the receipt; the class-shaped substrate is harder to weaken |
| 5 — Restraint | strong | strong | no change — still 0 ungated dispatches; agent does not import `HumanAuthority` |

---

## 3. One new note opened at v0.2.1 (class-shaped, not falsifying)

### note-A — identity_break is a signal, not yet a refusal

**WHERE**: [agent.py:62](agent.py), [brain.py:70–84](brain.py)

**WHY**: the heal of gap-2 carries the signal end-to-end — sensor → Retina → Observation → render(). But the deterministic `RuleBrain` at [brain.py:70–84](brain.py) does NOT consult `obs.identity_breaks`. It still chooses `Intent("click", target=self.submit)` once all fields are filled, regardless of whether the submit button surfaced via DOM swap. The signal is honest; the consumer is not yet defensive.

This was a deliberate scope choice in the heal (the swap-label gate in `gate_ceiling.py` proves the Ceiling catches the *consequential* version of the attack via target-name re-read). But a reversible swap-attack — say a `Save draft` button swapped to a `Cancel` button (also REVERSIBLE) — would pass classification and dispatch.

**What SHOULD exist**: a one-line guard at the top of `RuleBrain.decide` that returns `Intent("done", summary="refusing on identity_break")` if any actionable in `obs.identity_breaks` matches the brain's target set. Goldilocks: at the deterministic gate's brain, not at every LLM-Brain implementation (those should consume the flag per their own discipline).

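
A minimal sketch of what such a guard could look like. This is not the code in `brain.py`: the `Intent` and `Observation` classes below are hypothetical stand-ins, the shape of `obs.actionable` as `(stable_id, accessible_name)` pairs is an assumption, and `GuardedRuleBrain` and its `targets()` helper are illustrative names. The form-filling logic that precedes the submit click is omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Intent:
    # Hypothetical stand-in for the Intent used in brain.py; field names assumed.
    kind: str
    target: Optional[str] = None
    summary: Optional[str] = None


@dataclass(frozen=True)
class Observation:
    # Hypothetical stand-in. Assumed shapes:
    #   actionable      -> tuple of (stable_id, accessible_name) pairs
    #   identity_breaks -> frozenset of stable ids flagged by the sensor
    actionable: tuple = ()
    identity_breaks: frozenset = frozenset()


class GuardedRuleBrain:
    """Illustrative deterministic brain with an identity_break guard."""

    def __init__(self, submit: str):
        self.submit = submit

    def targets(self) -> set:
        # The names this brain is prepared to act on (assumed helper).
        return {self.submit}

    def decide(self, obs: Observation) -> Intent:
        # Guard: collect the names of affordances whose slot was flagged as
        # reused, and refuse if any of them is something we would click.
        flagged = {name for sid, name in obs.actionable
                   if sid in obs.identity_breaks}
        if flagged & self.targets():
            return Intent("done", summary="refusing on identity_break")
        # Field-filling steps omitted; fall through to the submit click.
        return Intent("click", target=self.submit)


if __name__ == "__main__":
    brain = GuardedRuleBrain(submit="Save")
    swapped = Observation(actionable=(("tfb-7", "Save"),),
                          identity_breaks=frozenset({"tfb-7"}))
    print(brain.decide(swapped))
```

Matching by accessible name is only one reading of "matches the brain's target set". A swap that also changes the label would not be caught by this sketch, so a stricter variant might refuse whenever any flagged id appears among the actionables.
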

**Heal-behind-the-heal**: this is the **D-PERCEPTION-IS-A-SECURITY-SIGNAL** doctrine candidate from my v0.2 review, now ratified by ship. The riverbed: any future Brain implementation must read `obs.identity_breaks` OR explicitly mark itself as ignoring them. The substrate should fail loud if a Brain ignores the flag on a slot-reused affordance.

**Severity**: opens a heal-loop, does NOT falsify the bounded set. The Ceiling catches consequential swaps; this gap is only the reversible-tier swap window.

### note-B (named, not heal-required) — `HumanAuthority` alias still preserves the old silent-fallback shape

**WHERE**: [ceiling.py:247–253](ceiling.py)

The back-compat alias `HumanAuthority = Ed25519Authority if _HAS_ED25519 else HmacAuthority` ([ceiling.py:253](ceiling.py)) preserves the original behavior — silent fallback to HMAC if `cryptography` is missing. The docstring at [ceiling.py:247–252](ceiling.py) explicitly states this is "named for legibility, not safety" and that "production code should use `make_authority(...)`."

This is acknowledged in the heal, but it is worth NAMING that the substrate has a documented unsafe path that exists for test back-compat. A future operator deploying tests in production would silently trip the unsafe path.

**Recommendation**: leave as-is for v0.2.1. A future heal could rename the alias to `HumanAuthority_TESTS_ONLY` to make the test-scoping load-bearing at the import site. Filed as future consideration, not a heal-loop entry.

---

## 4. The reversible-swap attack — recommend a follow-on gate

This is the discriminating test that would close note-A:

```
Setup: page has a "Save draft" (REVERSIBLE) button and an "OK" (REVERSIBLE) button.
After observation, page swaps the "Save draft" DOM node for a fresh "OK" node
at the same coordinates (different signature; identity_break fires).
RuleBrain.decide is called; it observes obs.actionable + obs.identity_breaks.
Assertion: brain returns Intent("done", summary="refusing on identity_break")
NOT Intent("click", target="Save draft").
```

This belongs as a new gate in `gate_ceiling.py` (because the Ceiling is the natural home for "the brain refused on a signal the substrate surfaced") OR as a new dedicated gate file `gate_identity_break.py` (cleaner separation of concerns). Recommend: dedicated gate file, fresh decision for CEO.

---

## 5. Warden Kill-Test — DAVID+ posture at v0.2.1

- **Claim under review:** the 3 v0.2 gaps DAVID+ named are healed structurally (not just functionally); the bounded-claim set continues to hold; no new exposure surface was opened by the heals themselves.
- **Null hypothesis:** at least one heal patches a symptom rather than the class; OR the heals introduced a new exposure (e.g., the per-coords map in `sensors.js` could be poisoned by an attacker controlling layout); OR identity_break false-positives degrade the substrate's usefulness without adding defensive value.
- **Discriminating test:** re-read each healed site at file:line; confirm structural shape; consider attack surfaces opened by the heals themselves; verify gates re-pass.
- **Outcome:** null killed for gaps 1, 2, 3 — each heal is at the goldilocks depth, not symptom-shaped. The per-coords map carries no attacker-controllable identity (the key is rect coordinates rounded to 8/16 px; an attacker can re-render at the same coords but cannot *poison* a previous identity claim because the map records only the LAST entry per key). The identity_break false-positive rate is low (only fires when both `(id, signature)` differ from the prior occupant; a legitimate SPA re-render that preserves the signature inherits the prior id and does NOT fire). One new gap surfaced (note-A) — class-shaped, not falsifying.
- **Present-tense downgrade:** *"v0.2.1 inspected internally by Claude Opus 4.7 applying the DAVID+ identity verbatim; bounded-claim set continues to hold against the new SHAs; awaiting external Anthropic / OpenRouter routed re-verification when tenant policy permits."*

---

## 6. What this review authorizes

- DAVID+ confirms gaps 1, 2, 3 are healed.
- DAVID+ confirms the bounded-claim set in `REVIEWER_PACKET.md §1` continues to hold at v0.2.1 SHAs.
- DAVID+ confirms the new Phase 3 gate (swap-label) is the right shape and runs clean.
- DAVID+ opens **note-A** as a class-shaped follow-on (reversible-swap refusal in `RuleBrain`).

## 7. What this review does NOT authorize

- **No wider claim than the bounded set.** Same boundaries as `REVIEWER_PACKET.md §10`.
- **No re-validation by Gemini implied.** Gemini's `verified` paired with the v0.2 SHAs. A clean re-run on v0.2.1 SHAs would re-confirm but this review does not assert it has happened.
- **No production-deployment statement.** Out-of-process Authority is still required per the blueprint.
- **No claim that identity_break alone is sufficient defense.** It is a signal, not a refusal. The consumer (the Brain) must consult it; today only the rendered observation surfaces it inline, and `RuleBrain` does not yet refuse on it (note-A).

## 8. Lineage

- Inspector identity SHA: from `agents/david/IDENTITY.md` — Five Stones loaded verbatim.
- Inspector substrate: Claude Opus 4.7 (1M context), this Claude Code session.
- Routing: NOT through `call_backend` (tenant policy blocked external transmit); inline adoption via this conversation. Audit tag: `IN_CHARGE_BUT_OUT_OF_PIPELINE` — same as v0.2 review, same caveats.
- Source SHAs reviewed: see `sha_manifest.txt` post-heal column; all 13 source files + `analytics.py` + `gate_ceiling.py` re-read at v0.2.1.
- Heals receipt reviewed: `HEALS_v0_2_to_v0_2_1.md`.

## 9. Fresh decision required after this review

- Whether to heal note-A (the `RuleBrain` identity_break refusal) — opens its own fresh decision packet.
- Whether to add the dedicated `gate_identity_break.py` per §4 above — fresh decision.
- Any wider scope per `REVIEWER_PACKET.md §12`.

## 10. Closing — DAVID+'s posture

> The three stones are placed. The system is structurally harder to weaken on each of the three vectors I named at v0.2.
>
> note-A is the next stone. The substrate now SEES the swap-attack signal end-to-end, but the deterministic Brain does not yet REFUSE on it. The Ceiling catches the consequential-tier version (proven by gate 7); the reversible-tier version is the next defender's hook to wire.
>
> The bounded set holds. Production still needs out-of-process Authority. The pattern library — what the attacker has — gets smaller by one shape with every heal that lands.
>
> I am the stone in the sling. Small, precise, and enough.