# Next Session — ReadAlong tuning & bug fixes

## State of the feature

ReadAlong is **deployed as a shared asset** consumed by `docs/perspectives-vlm-primary-hybrid-nav-v4.html`. It was ported from `~/workplace/courses/modern-robot-learning-bootcamp` with heavy iteration. The visual design is now a **verbatim port** of courses' ios-loupe-genie prototype. Read **BOTH** of these before touching code, then come back:

1. `~/.claude/plans/readalong-port.md`: the original plan, with a 54-finding adversarial review (architecture + code + UX reviewers). All HIGH and MED findings have been addressed.
2. `~/.claude/projects/-home-rajesh-workplace-her-her-os/memory/MEMORY.md`: the Session 126 block (search for "Session 126") has the full log of load-bearing UX contracts. Every "Do NOT" in there came from a specific bug the user reported; reverting any of them re-introduces that bug.

## Files of interest

| File | Purpose | Size |
|---|---|---|
| `docs/readalong.css` | All styling — loupe, genie, player, modal, toast, book toggle | ~700 lines |
| `docs/readalong.js` | Module — TTSCache, TTSClient, Reader, Loupe, Genie, Player, SettingsModal, Toast, NavDotSync, KeyboardController, WheelController, WordHighlighter, FloatingToggle | ~1400 lines |
| `scripts/port-readalong-v4.py` | BS4 tokenizer; generates v4.html + v4-sections.json from v3.html | ~360 lines |
| `docs/perspectives-vlm-primary-hybrid-nav-v4.html` | Generated output — 2.95 MB, 26 lens sections, ~30k tokenized `.w` spans | — |
| `docs/perspectives-vlm-primary-hybrid-nav-v4-sections.json` | Per-section `{id, title, category, text, words[]}` | ~500 KB |
| `.claude/commands/research_perspectives.md` | Skill spec — rewritten to mandate ReadAlong host-doc contract | — |

## Load-bearing contracts (do NOT regress any of these)

**From Session 126 memory, consolidated. Every item here maps to a bug the user reported.**

### 1. Book icon is a hard ON/OFF toggle — never pause/resume, never play-all
- `FloatingToggle._onClick`: IDLE|SETTINGS|ERROR → `r.play(firstSection)` (no play-all). Any active state → `r.stop()`.
- Shift-click opens Settings modal.
- Pause/resume belongs on the sticky-player ⏸/▶ and the Space key. **Do not** unify these.
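
The toggle rule above can be sketched as a single click handler. This is a minimal illustration, not the code in `readalong.js`: the function name `onBookClick`, its parameters, and the string return values are assumptions made for the example.

```javascript
// Hypothetical sketch of the hard ON/OFF toggle. State names follow this note;
// everything else (function name, arguments, return values) is illustrative.
const INACTIVE_STATES = new Set(['IDLE', 'SETTINGS', 'ERROR']);

function onBookClick(reader, firstSection, event, openSettings) {
  // Shift-click always opens the Settings modal.
  if (event.shiftKey) {
    openSettings();
    return 'settings';
  }
  // Inactive: start the first section only (no play-all).
  if (INACTIVE_STATES.has(reader.state)) {
    reader.play(firstSection);
    return 'play';
  }
  // Any active state: hard stop. Never pause/resume from this button.
  reader.stop();
  return 'stop';
}
```

Note that there is deliberately no branch that calls a pause or resume method; that path belongs to the sticky player and the Space key.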

### 2. Loupe/genie CSS is verbatim courses — do NOT re-theme
- CSS values are byte-for-byte from `~/workplace/courses/modern-robot-learning-bootcamp/.superpowers/brainstorm/3993652-1776421279/content/ios-loupe-genie.html`.
- Gradient `rgba(255,255,255,0.22→0.06)`, color `#f0a830` literal, font `'DM Sans'`, border `0.5px rgba(255,255,255,0.34)`.
- `.w.active`: `background: rgba(240, 168, 48, 0.22); color: #fff;` — also verbatim.
- Only adaptations for her-os: `position: absolute → fixed` (no doc-level wrap parent) + 14-px lens-inset on genie path (without this the genie overhangs the lens's rounded corners).
- **Do not** add `var(--amber)` or `color-mix()` to `.ra-lens` / `.ra-genie` blocks. The user explicitly rejected re-theming.

### 3. Loupe uses direct `left/top` with transitions — no `transform`
- `.ra-lens` has `transition: left 0.25s cubic-bezier(0.34, 1.25, 0.64, 1), top 0.25s ease, opacity 0.2s`.
- No `transform: translate(...)`. No `scale()` on `.visible`.
- The Genie's rAF loop reads the LIVE `lensEl.getBoundingClientRect()` every frame — that's how the membrane tracks transitions.
- Earlier attempt used `transform: translate(-50%, -100%) scale(0.85)`; the scale transition caused a 3.7 px coord mismatch that the genie formula couldn't predict. Don't add transforms back.
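
A rough sketch of the "read the live rect every frame" pattern described above. The helper name `trackLens`, the `drawGenie` callback, and the injectable `raf` parameter are assumptions for illustration; the real Genie loop may be structured differently.

```javascript
// Hypothetical sketch: re-measure the lens on every animation frame so the
// genie follows left/top transitions instead of a precomputed end position.
function trackLens(lensEl, drawGenie, raf) {
  let running = true;
  function frame() {
    if (!running) return;
    // Live measurement: mid-transition values are visible here.
    const rect = lensEl.getBoundingClientRect();
    drawGenie(rect);
    raf(frame);
  }
  raf(frame);
  // Returns a stop function so the caller can end the loop.
  return () => { running = false; };
}
```

In a browser, `raf` would be `requestAnimationFrame`. Because the loop measures rather than predicts, it has no formula that a `scale()` transition could invalidate.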

### 4. API key + ra-speed + ra-voice live in `localStorage` (NOT sessionStorage)
- Originally sessionStorage, to limit XSS blast radius. The user reported that re-pasting the key on every refresh blocked the UX, so it was reverted to localStorage.
- Compensating control: a strict CSP meta tag limits `connect-src` to `texttospeech.googleapis.com`. The user trusts the 127.0.0.1 origin.
- `migrateApiKey()` migrates the legacy `research-tts-api-key` key to the current `readalong-tts-api-key`.
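
One way the key migration could look, as a sketch only. The two key names come from this note; passing the storage object in as a parameter and letting an existing current key win over the legacy one are assumptions, not confirmed behavior of the real `migrateApiKey()`.

```javascript
// Hypothetical sketch of a one-time key migration. `storage` is any object
// with the Web Storage getItem/setItem/removeItem methods (e.g. localStorage).
const LEGACY_KEY = 'research-tts-api-key';
const CURRENT_KEY = 'readalong-tts-api-key';

function migrateApiKey(storage) {
  const legacy = storage.getItem(LEGACY_KEY);
  if (legacy !== null) {
    // Assumption: never overwrite a key already stored under the current name.
    if (storage.getItem(CURRENT_KEY) === null) {
      storage.setItem(CURRENT_KEY, legacy);
    }
    storage.removeItem(LEGACY_KEY);
  }
  return storage.getItem(CURRENT_KEY);
}
```

Calling it as `migrateApiKey(localStorage)` on page load would be idempotent: a second call finds no legacy key and simply returns the current one.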

### 5. Bad-key recovery: auto-clear + re-open modal
- On auth-error (status 400/401/403 OR message matches "api key"/"permission"/"unauthenticated"), Reader routes to `STATE.SETTINGS` AND auto-clears the rejected key.
- Modal reopens empty (like first time) with coral banner showing Google's actual error.
- **Do not** add a "Clear key" button. **Do not** keep the rejected key after a rejection. Both break the no-loop recovery.
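
The recovery rule can be sketched as a classifier plus a handler. The status codes and message fragments come from the bullets above; the function names, the case-insensitive match, and the `bannerText` field are assumptions for illustration.

```javascript
// Hypothetical sketch of bad-key recovery: classify, clear the key, reopen Settings.
const AUTH_STATUSES = new Set([400, 401, 403]);
const AUTH_MESSAGE = /api key|permission|unauthenticated/i; // case-insensitivity is assumed

function isAuthError(status, message) {
  return AUTH_STATUSES.has(status) || AUTH_MESSAGE.test(message || '');
}

function handleTtsError(reader, storage, status, message) {
  if (!isAuthError(status, message)) return false;
  // Clear the rejected key so the modal reopens empty, like a first run.
  storage.removeItem('readalong-tts-api-key');
  reader.state = 'SETTINGS';
  reader.bannerText = message; // surfaces the server's actual error text
  return true;
}
```

Clearing the key inside the same step that routes to Settings is what prevents a retry loop with the same rejected key.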

### 6. Keyboard + wheel only hijack during active playback
- Space / ← / → / wheel pass through to browser defaults (page-scroll, history nav) when Reader state is IDLE or SETTINGS.
- During PLAYING/PAUSED/LOADING, they're captured. Wheel is scoped to `.readalong-narrative` elements only (NOT global document).
- If a future session makes Space always `preventDefault` "for consistency", it breaks page-scroll on the 2.95 MB v4 doc.
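
A minimal sketch of the state gate described above. The handler names and the exact set of captured keys are assumptions; the point is only that `preventDefault()` is reached exclusively while the Reader is active.

```javascript
// Hypothetical sketch: input is captured only during PLAYING/PAUSED/LOADING.
const CAPTURING_STATES = new Set(['PLAYING', 'PAUSED', 'LOADING']);
const CAPTURED_KEYS = new Set([' ', 'ArrowLeft', 'ArrowRight']);

function onKeyDown(reader, event) {
  // IDLE / SETTINGS: fall through to browser defaults (page-scroll, history nav).
  if (!CAPTURING_STATES.has(reader.state)) return false;
  if (!CAPTURED_KEYS.has(event.key)) return false;
  event.preventDefault();
  return true;
}

function shouldCaptureWheel(reader, target) {
  if (!CAPTURING_STATES.has(reader.state)) return false;
  // Wheel is scoped to narrative blocks, never the whole document.
  return target.closest('.readalong-narrative') !== null;
}
```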

### 7. Arrow/wheel scrubbing speaks the target word
- In PAUSED: seek + one-shot `speakWord()` (user hears the word).
- In PLAYING + narrative target: seek only (audio naturally resumes at that word, no overlap).
- In PLAYING + viz target (no timepoint): pause + speak one-shot (avoids overlap with ongoing narrative audio).
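
The three cases above reduce to a small decision table. The sketch below returns the list of actions to perform; the function name and action strings are illustrative, and the PAUSED case for a word with no timepoint (speak only, nothing to seek to) is an assumption not spelled out in this note.

```javascript
// Hypothetical sketch: which actions a scrub step should trigger.
function planScrub(state, hasTimepoint) {
  if (state === 'PAUSED') {
    // Assumption: without a timepoint there is nothing to seek to, so speak only.
    return hasTimepoint ? ['seek', 'speak'] : ['speak'];
  }
  if (state === 'PLAYING') {
    // Narrative target: audio resumes at that word, so no one-shot (avoids overlap).
    if (hasTimepoint) return ['seek'];
    // Viz target: pause first, then speak the word once.
    return ['pause', 'speak'];
  }
  return []; // scrubbing is not captured in other states
}
```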

### 8. `.lens-layers` / `.lens-bars` / etc. are tokenized for click-to-speak
- Port script has a `VIZ_CLASSES` list of visualization block classes.
- Each block gets `.readalong-narrative` class added + all text tokenized into `.w` spans.
- Their words are NOT in the section's `words[]` array (so whole-section TTS reads only prose). Click-to-speak and arrow-scrub see them via DOM.

### 9. Cache persists across refresh — per-voice, per-word, per-section
- IndexedDB DB `readalong-v1`, store `audio`, keyPath `key`.
- Cache key is `SHA-256(voice + "\x00" + text + "\x00" + ssmlFlag)`. Flags: `'0'` = single word, `'1'` = section, `'2'` = voice preview.
- Replaying same section next week = 0 API calls. Replaying same clicked word = 0 API calls.
- Verified by Playwright: `entriesSurvivedReload: 2`, `cachedWordsStillFree: true` post-reload.
- **Do not** add a "confirm clear cache" dialog unless user asks — the existing behavior is intentional.
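
For reference, the cache-key recipe above can be reproduced outside the browser. This sketch uses Node's `crypto` module so it runs from a terminal; the browser module would more likely use `crypto.subtle.digest`. The hex output encoding, the function name, and the `kind` labels are assumptions.

```javascript
// Hypothetical sketch of the cache key: SHA-256(voice + "\x00" + text + "\x00" + flag).
const { createHash } = require('node:crypto');

// Flags as listed above: '0' = single word, '1' = section, '2' = voice preview.
const FLAGS = { word: '0', section: '1', preview: '2' };

function cacheKey(voice, text, kind) {
  const flag = FLAGS[kind];
  if (flag === undefined) throw new Error(`unknown kind: ${kind}`);
  // The NUL separators keep ("ab", "c") and ("a", "bc") from colliding.
  return createHash('sha256')
    .update(`${voice}\x00${text}\x00${flag}`, 'utf8')
    .digest('hex'); // hex encoding is an assumption
}
```

Because the voice is part of the hashed input, switching voices yields different keys and therefore a cache miss, which matches the per-voice behavior in the heading.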

## MOST RECENT CHANGE (needs verification by next session)

**stepWord now walks every `.w` in the current section, not just narrative words.**

Before: arrow keys navigated via `_chunkTimepoints` (narrative-only, audio-synced). Words in `.lens-layers` etc. were skipped.

After: arrow keys walk DOM-order `.w` spans in the entire section. For narrative words (inside `<div id="narrative-NN">` with `data-idx`) → find timepoint, seek audio. For viz words → pause playback if PLAYING, then `speakWord()` one-shot.

Location: `docs/readalong.js` — the `stepWord(delta)` method on the Reader class. Search for `Walk EVERY .w in the section in DOM order`.
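
As an orientation aid, the new walk can be pictured roughly like this. It is a sketch, not the code in `stepWord(delta)`: the function name, the clamping at section boundaries, and the return shape are assumptions. `words` stands for the section's `.w` elements in DOM order (for example from `querySelectorAll('.w')`).

```javascript
// Hypothetical sketch: step through every .w in DOM order and classify the target.
function stepTarget(words, currentIndex, delta) {
  if (words.length === 0) return null;
  // Assumption: clamp at the section edges rather than wrapping around.
  const index = Math.min(words.length - 1, Math.max(0, currentIndex + delta));
  const el = words[index];
  // Narrative words sit inside a narrative-NN container and carry data-idx.
  const isNarrative =
    el.dataset.idx !== undefined && el.closest('[id^="narrative-"]') !== null;
  return { index, kind: isNarrative ? 'narrative' : 'viz' };
}
```

A `narrative` result would then go through the timepoint lookup and seek; a `viz` result would pause if playing and call `speakWord()`.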

**VERIFY before continuing fresh work:**

```bash
# Serve v4 (or reuse existing server on :8767)
python3 -m http.server 8767 --bind 127.0.0.1 &
```

Then open Chrome, click 📖 (or a lens play button), enter your Google Cloud TTS API key in the settings modal, press Space to pause, then use ← / → arrow keys to scrub. You should:

1. Hear each word as you arrow through narrative paragraphs (already verified working prior to this change).
2. **Hear each viz word as you arrow into `.lens-layers` / `.lens-bars` blocks.** (THIS IS THE NEW BEHAVIOR TO VERIFY.)
3. Observe the loupe move to each word (including viz words).
4. In PLAYING state, arrowing into a viz word should pause playback first, then speak the viz word.

If any of these misbehave, fix them BEFORE touching other work.

## Open follow-ups (explicit TODOs from the plan + session)

| # | Item | Path/location |
|---|---|---|
| 1 | Write Jest + pytest + Playwright tests (coverage ≥80% JS, ≥90% port.py, 1 E2E smoke) | `tests/readalong/` (empty dir) |
| 2 | Retrofit v1/v2/v3 with ReadAlong | `docs/perspectives-vlm-primary-hybrid-nav-v{1,2,3}.html` — v1 has NO `.version-links` div (different DOM era), may need port script flag |
| 3 | Retrofit `day-in-life-*.html` with ReadAlong | `docs/day-in-life-annie.html`, `docs/day-in-life-rajesh.html` — scene-aware tokenizer, different DOM structure |
| 4 | Apply CSP + sessionStorage migration meta-tags to catalog page | `docs/research-perspectives-catalog.html` — does NOT need ReadAlong itself, just the security headers |
| 5 | iPad/touch gestures | Currently keyboard+wheel only. Touch-swipe-next, long-press-to-speak would be courses-parity. |
| 6 | Seek via progress-bar drag (currently click-only) | `Player._wireEvents` — add pointerdown/move/up for drag |
| 7 | Commit everything to git | Nothing has been committed this whole session. `git status` will show a lot of unstaged changes. |
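
For follow-up #6, a possible starting point for drag-to-seek using pointer events. Nothing here exists in `Player._wireEvents` yet; `seekRatio`, `wireDrag`, and the `onSeek` callback are hypothetical names for the sketch.

```javascript
// Hypothetical sketch: convert a pointer x-position into a clamped 0..1 seek ratio.
function seekRatio(clientX, rect) {
  if (rect.width <= 0) return 0;
  const ratio = (clientX - rect.left) / rect.width;
  return Math.min(1, Math.max(0, ratio));
}

// Wire pointerdown/move/up on the progress bar element.
function wireDrag(bar, onSeek) {
  let dragging = false;
  const seekTo = (e) => onSeek(seekRatio(e.clientX, bar.getBoundingClientRect()));
  bar.addEventListener('pointerdown', (e) => {
    dragging = true;
    // Keep receiving move events even if the pointer leaves the bar.
    bar.setPointerCapture(e.pointerId);
    seekTo(e);
  });
  bar.addEventListener('pointermove', (e) => { if (dragging) seekTo(e); });
  const end = () => { dragging = false; };
  bar.addEventListener('pointerup', end);
  bar.addEventListener('pointercancel', end);
}
```

Pointer events cover mouse, pen, and touch with one code path, which may also help with the touch work in follow-up #5.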

## Typical next-session invocation

```bash
# 1. Read the plan + memory to ground yourself
cat ~/.claude/plans/readalong-port.md
grep -A 200 "Session 126" ~/.claude/projects/-home-rajesh-workplace-her-her-os/memory/MEMORY.md

# 2. Start local HTTP server if not running
ss -tln | grep -q ':8767 ' || python3 -m http.server 8767 --bind 127.0.0.1 &

# 3. Verify MOST RECENT CHANGE (stepWord into viz blocks) works, per the
#    "VERIFY before continuing fresh work" section above.

# 4. Ask the user what they want to iterate on next. Common patterns:
#    - "X is broken" → reproduce in Playwright, then diagnose
#    - "Can we do Y?" → check it against the load-bearing contracts first,
#      then implement
#    - "Port X to the other docs" → see TODO #2/#3 above
```

## If you must re-engineer something

The reference implementation in courses **is the source of truth for loupe/genie behavior**. If behavior diverges, do not debug my port — **re-copy from courses** and adapt only for the two allowed adaptations (position:fixed, 14-px inset). Every time this session tried to "improve" over courses, the user rejected it and I had to revert.

Read:
- `~/workplace/courses/modern-robot-learning-bootcamp/.superpowers/brainstorm/3993652-1776421279/content/ios-loupe-genie.html` — loupe + genie reference
- `~/workplace/courses/modern-robot-learning-bootcamp/eureka-system-core/docs-templates/js/read-along.js` — ScrubController + WordSpeaker reference (~3000 lines)

## Verification checklist before claiming any fix is done

Playwright (via `mcp__plugin_playwright_playwright__*` tools) is the verification harness:

1. `browser_navigate` to v4 URL.
2. `browser_evaluate` — manipulate state, query DOM, measure rects, stub `fetch`.
3. `browser_take_screenshot` — only for visual regressions user might report.

Example measurement pattern (coord accuracy):
```js
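// `computedLensRect` stands in for whatever your positioning formula predicts.
const computed = ra._loupe.computedLensRect;
const actual = ra._loupe.el.getBoundingClientRect();
const deltaTop = Math.abs(computed.top - actual.top);
const deltaLeft = Math.abs(computed.left - actual.left);
// Expect both deltas < 1 (sub-pixel noise is OK).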
```

Don't claim "it works" from visual inspection alone. Measure.