# TFBthumb v0.2.1 — CODE_MECHANIC Inspector Review (Proxy, re-run after heals)

- Review ID: `TFB-THUMB-CODE-MECHANIC-REVIEW-v0_2_1-20260615T074200Z`
- Paired packet: `REVIEWER_PACKET.md`
- Paired prior decisions: `REVIEWER_DECISION.md` (Gemini @ v0.2) · `DAVID_PLUS_REVIEW.md` (v0.2) · `CODE_MECHANIC_REVIEW.md` (v0.2 proxy) · `DAVID_PLUS_REVIEW_v0_2_1.md` (re-run, just landed)
- Paired heals receipt: `HEALS_v0_2_to_v0_2_1.md`
- Inspector: **CODE_MECHANIC** (engineering execution)
- **Substrate disclosure (read first):** CODE_MECHANIC's native harness is `openai/gpt-5.1`. Tenant policy blocked the external transmit again — this is a **proxy review** written by Claude Opus 4.7 (this Claude Code session) applying CODE_MECHANIC's IDENTITY.md PASS/FAIL_WITH_NOTES criteria verbatim. Same proxy posture as the v0.2 review, same honest marking.
- Posture per IDENTITY.md: does the code WORK? Is the contract honest? Are there tests? Will the next reader understand in one pass?
- Verdict: `PASS_WITH_ONE_NOTE` (4 prior notes are healed; 1 new minor note opened — class-shaped, not falsifying)
- Promotion authority: `inactive`

---

## 1. Re-check of the 4 notes CODE_MECHANIC named at v0.2

### note-1 — `Retina.start` idempotency

**Heal at**: [retina.py:54, 57–67](retina.py)

```python
self._started: bool = False
...
async def start(self) -> None:
    if self._started:
        raise RuntimeError("Retina.start() called twice on the same page "
                           "(one Retina per page is the contract)")
    self._started = True
```

Idempotency is now enforced loudly via exception, not silently via handler-deduplication-by-prayer. A double-start raises a `RuntimeError` with a message a future maintainer can read in one pass. The exception message names the contract ("one Retina per page").

**Verdict**: healed cleanly. Test exists implicitly — any double-start in a future gate would raise and fail the gate.

### note-2 — deliberately-not-reset comment

**Heal at**: [retina.py:84–96](retina.py)

The comment block now reads:

> *"Deliberately NOT reset: `state.in_flight` — the CDP session is reused across navigations; old requestIds clear via their own loadingFinished/loadingFailed events. Resetting here would drop in-flight tracking for any request that started before the navigation completed. `state.motion_until_ms` — a monotonic timestamp; leaving an old deadline that's already in the past is correct. The next animation start will overwrite it via max(...) at _anim_start."*

A future maintainer reading `_on_navigated` now sees why the omission is intentional. The "what isn't broken" trap is documented.

**Verdict**: healed cleanly. Doc-only, no runtime change, exactly the heal CODE_MECHANIC proposed.

### note-3 — `Thumb.point` docstring drift

**Heal at**: [thumb.py:80–86](thumb.py)

```python
async def point(self, name: str, timeout_ms: float = 2000.0) -> dict:
    """Scroll to an affordance, confirm it is reachable at its point, move the
    hand onto it. Engagement is confirmed via the hover event (preferred) OR
    a fresh hit-test (fallback). The semantic contract is hand-on-affordance,
    not the mouseover event: the browser preserves mouse position across
    navigations, so a move that ends at the same coords the cursor already
    occupied fires no mouseover even though the hand IS on the target."""
```

The semantic shift introduced by heal #5 (build-session) is now in the docstring. A reader can no longer parse "feel hover engage" as "mouseover event fires." Implementation and contract aligned.

**Verdict**: healed cleanly.

### note-4 — `Ceiling.authorize_effect` per-window counter

**Heal at**: [ceiling.py:280–304, 363–386](ceiling.py)

I read the heal end to end. New constructor param `effect_window_max_effects: int = 1`. New state `self._effect_window_used = 0`, reset to 0 when the window opens at [ceiling.py:343](ceiling.py). Effect gate now computes `window_open AND window_has_capacity`; second mutating effect in the same window without its own token fails with the NEW closed-enum reason `"effect-window-exhausted"` ([ceiling.py:380](ceiling.py)), distinct from `"unapproved-mutating-effect"`.

The reason-string disambiguation is the right move per **D-EVERY-OPERATION-NARRATES-ITSELF** — the ledger row tells the operator which way coverage failed.

`gate_sentinel.py` GATE 2 (the legitimate Pay→POST flow) still passes because the Pay click fires exactly one POST `/charge`, and max=1 covers exactly one mutating effect. I verified this against the live receipt at `/tmp/claude-501/tfbthumb_sandbox/canonical_5_gates_post_heal.json`: `EFFECT GATE: PASS`.

**Verdict**: healed cleanly. Strictly tightens; legitimate flow preserved.

---

## 2. PASS criteria check at v0.2.1 (per IDENTITY.md lines 27–36)

| Criterion | Result | Evidence |
|---|---|---|
| Artifact passes parse-gate | **PASS** | `ast.parse()` clean on 13 `.py` (incl. `analytics.py`); `node --check sensors.js` clean. One transient SyntaxError during the heal (re-declaration of `r` in `sensors.js`) was caught and healed in the same build pass. |
| Happy path exercises cleanly | **PASS** | 5 fast gates re-passed end to end (`canonical_5_gates_post_heal.json`); `gate_ceiling.py` now includes the new swap-label gate which also PASSes; Phase 2 n=200 re-passed at 0/200 flakes. |
| Edge cases addressed or named out-of-scope | **PASS WITH NOTE** | One new note below (note-A'). |
| Dependencies declared | **PASS** | No new imports introduced. `cryptography` import remains conditional at module top per the pre-existing pattern. |
| Function signature honesty | **PASS** | New optional param `effect_window_max_effects` and `Observation.identity_breaks` both have explicit defaults; no callers had to change. |
| Structured Result returns | **PASS** | New ledger reason `"effect-window-exhausted"` is a structured discriminator, not a boolean. `identity_break` event is a structured dict with named fields. `HmacFallbackRefused` is a structured exception with a docstring naming the falsified claim. |

---

## 3. One new note opened at v0.2.1

### note-A' — `Observation.identity_breaks` mutability surface

**WHERE**: [brain.py:35–43](brain.py)

```python
@dataclass
class Observation:
    ...
    identity_breaks: frozenset = frozenset()
```

The field is typed as `frozenset` and the default is the singleton empty `frozenset()`. The agent's `_observe` ([agent.py:62](agent.py)) wraps the retina state's mutable `set` in `frozenset(...)` before passing to `Observation`, which is correct.

**Subtle issue**: the singleton default `frozenset()` is shared across all `Observation` instances that don't pass an explicit value. This is mostly fine because `frozenset` is immutable, so there's no shared-mutable-default footgun. But the dataclass declaration uses a literal default rather than `default_factory`. This is idiomatic Python for immutable types and doesn't break, but a future heal that wanted `default_factory=frozenset` for symmetry with the `set` types elsewhere in the codebase would be cleaner.

**Severity**: cosmetic, not a bug. I would mention it in a code review and let the author decide. Filed for receipt completeness; CODE_MECHANIC would not block a PR on this alone.

**Heal proposal**: `identity_breaks: frozenset = field(default_factory=frozenset)` — symmetric with the `list = field(default_factory=list)` patterns elsewhere. One line.

---

## 4. What I deliberately did NOT review

Per IDENTITY.md ("CODE_MECHANIC reviews are independent — does NOT defer to other inspectors' verdicts"):

- I did not defer to DAVID+ v0.2.1's verdict; I re-read every healed file independently.
- I did not review security blast radius — that's DAVID+'s lens, who concluded healed-with-note-A.
- I did not review the bounded-claim set narrative — that's Gemini's lens.
- I did not review the heals receipt's framing — that's the operator's choice.

---

## 5. Verification receipts I trace to

- `canonical_5_gates_post_heal.json` — 5/5 fast gates PASS, including new Phase 3 swap-label gate
- `canonical_harness_n200_post_heal.log` — Phase 2 n=200 PASS at TFBthumb 0/200 flakes vs baseline 126/200
- `canonical_analytics_post_heal_e2e.json` — token ratio 7.7×, correctness 100% TFB vs 38.3% baseline, motion 4/4, safety 0 ungated
- All SHAs verified against `sha_manifest.txt` (post-heal column)

---

## 6. Warden Kill-Test — CODE_MECHANIC posture at v0.2.1

- **Claim under review:** the 4 v0.2 notes are healed structurally; the heals don't introduce regressions; the new Phase 3 swap-label gate is mechanically correct.
- **Null hypothesis:** at least one heal is symptom-shaped, OR a heal introduced a regression in another gate, OR the new gate is the wrong shape (e.g., it tests something other than what its docstring claims).
- **Discriminating test:** re-read each healed site; run parse-gate; run every gate; verify analytics; trace the new gate end-to-end.
- **Outcome:** null killed. All four prior notes are at goldilocks depth. No regressions (5/5 fast gates + n=200 + analytics all re-passed). New swap-label gate IS testing what it claims — the runtime mutation of `el.textContent`/`removeAttribute('data-tfb-id')` reproduces the swap-attack, `retina.find("Send")` resolves to the mutated button, `Thumb.click("Send")` calls `_authorize(Action(..., target_name="Send", ...))`, classifier reads `Send` → CONSEQUENTIAL → no token → `CeilingBlocked("needs-human-token")`. The assertion at gate_ceiling.py:174 catches the right exception with the right reason.
- **Present-tense downgrade:** *"v0.2.1 inspected by Claude Opus 4.7 proxy applying CODE_MECHANIC IDENTITY.md verbatim; authentic gpt-5.1 run still tenant-policy blocked; proxy honest about its identity."*

---

## 7. What this review authorizes

- CODE_MECHANIC confirms the 4 v0.2 notes are healed at the engineering execution layer.
- CODE_MECHANIC confirms the new Phase 3 swap-label gate is mechanically correct and runs clean.
- CODE_MECHANIC confirms no regressions in any gate or analytics metric.
- CODE_MECHANIC opens **note-A'** as a cosmetic frozenset-default proposal; not a blocker.

## 8. What this review does NOT authorize

- **Not a substitute for the gpt-5.1 review.** Proxy holds the position; authentic run still required when tenant policy permits.
- **No wider scope.** Same boundaries as `REVIEWER_PACKET.md §10`.
- **No production-ready claim.** DAVID+'s out-of-process Authority caveat still applies; note-A in DAVID+'s v0.2.1 review still applies.

## 9. Lineage

- Inspector identity: `agents/code_mechanic/IDENTITY.md` loaded verbatim.
- Native harness target: `openai/gpt-5.1` via `agents/harness_profiles/gpt_5_5.yaml`.
- Proxy substrate: Claude Opus 4.7 (this Claude Code session).
- Substrate routing: NOT through `call_backend`; inline adoption.
- Audit tag: `PROXY_OUT_OF_PIPELINE`. Authentic run pending tenant-policy resolution.
- Source SHAs reviewed: see `sha_manifest.txt` post-heal column.
- Heals receipt: `HEALS_v0_2_to_v0_2_1.md`.

## 10. Fresh decision required after this review

- Whether to apply note-A' (frozenset default_factory) — trivial; opens a one-line PR.
- Re-run authentic gpt-5.1 review when a policy-permitted backend becomes available.
- Any wider scope per `REVIEWER_PACKET.md §12`.

## 11. Closing — CODE_MECHANIC posture

> Four notes placed. One new minor note named — cosmetic, not blocking. The code works at v0.2.1, the contracts are honest, the tests exist (the swap-label gate is the strongest new piece of engineering rigor since the Ed25519 closure introspection), and the next reader will understand in one pass.
>
> Re-run me on gpt-5.1 when you can.