# Next Session: ReadAlong Port — Implementation

## What
Port the ReadAlong feature (loupe + genie + reader + TTS cache + click-to-speak) from `~/workplace/courses/modern-robot-learning-bootcamp` into the her-os `/research_perspectives` skill. Ship as shared external assets (`docs/readalong.{css,js}`). Produce v4 of the VLM-nav perspectives doc as the first consumer.

## Plan
**Read the plan FIRST — it's load-bearing:** `~/.claude/plans/readalong-port.md`.

It contains:
- Stage 0 Known Gotchas (prior session lessons)
- Stage 1 file deliverables + architecture
- Stage 1B revised state machine (v2 — with LOADING_PAUSED)
- Stage 1C pre-mortem (41 scenarios with mitigations)
- Stage 1D scope coherence (sibling surfaces classified)
- Stage 6 review feedback response table (54 findings, all HIGH+MED addressed)
- Stage 7 final plan V2

**Do not re-plan.** All adversarial findings (3 reviewers, 24 HIGH + 23 MED + 6 LOW) are already incorporated in plan v2. Implement the plan as written.

## Current State (as of session interrupt)

| File | State |
|---|---|
| `docs/readalong.css` | ✅ written + all 12 CSS fixes from review applied |
| `docs/readalong.js` | ⏳ NOT YET WRITTEN (this is the main work) |
| `scripts/port-readalong-v4.py` | ⏳ NOT YET WRITTEN |
| `.claude/commands/research_perspectives.md` TTS section | ⏳ NOT YET REWRITTEN |
| `docs/perspectives-vlm-primary-hybrid-nav-v4.html` | ⏳ NOT YET GENERATED |
| `docs/perspectives-vlm-primary-hybrid-nav-v4-sections.json` | ⏳ NOT YET GENERATED |
| v1/v2/v3 version-links patches | ⏳ NOT YET APPLIED |
| Jest + pytest + Playwright tests | ⏳ NOT YET WRITTEN |
| Chrome verification | ⏳ NOT YET RUN |

## Key Design Decisions (from adversarial review — DO NOT REVERT)

1. **WebAudio API for chunk chaining** (`AudioBufferSourceNode`), NOT sequential `<audio>` elements. Solves gapless playback, provides a reliable chunk-duration source, and removes playbackRate drift in one stroke.
2. **Chunk SSML by ASSEMBLED-BODY byte count** (incl. `<mark>` overhead), NOT by source-text bytes. ~180 words/chunk practical under 4500-byte target.
3. **`sessionStorage` + strict CSP meta tag** for the API key, NOT `localStorage`. Migration path: copy from `localStorage` on first run.
4. **Generation counter `_playGen`** on EVERY async callback, including `audio.onended` and chunk-chain. Prevents cross-section audio bleeding.
5. **LOADING_PAUSED state**: a pause pressed during load neither nullrefs nor gets silently dropped; synthesis continues, and the audio enters PAUSED on completion.
6. **Port script strips v3's legacy TTS infrastructure** (inline `<script>` globals, `.tts-modal`, `.tts-bar`, `onclick="ttsPlay"` attrs) before injecting new assets. Without this, two TTS systems fight over the same buttons.
7. **BS4 `.children` tree-walk tokenization**, NOT `.string` or `.get_text()`. Elements on the skip-list `{"code", "a", "img", "br"}` are emitted whole.
8. **Lens-17 has no `.tts-btn` in v3** — port script iterates ALL `<section class="lens">` elements and injects buttons; never scrapes from pre-existing buttons.
9. **Code spans (`<code>`) not tokenized** — emitted as unsplit XML-escaped runs in SSML. Word-highlight advances past them as a group.
10. **Font-weight NOT changed on `.w.active`** — uses `text-shadow` instead to avoid per-word paragraph reflow.
11. **aria-hidden NOT used** on narrative root — breaks keyboard click-to-speak. Use `aria-live="off"` + `tabindex="-1"` default + Settings toggle for focus-mode.
12. **Shift-modifier keyboard shortcuts only** (Shift+Space, Shift+←/→), plus Esc to stop. Plain Space MUST remain browser page-scroll.
13. **WCAG AA contrast** required for `.w.active` in BOTH themes — verified via axe-core in Playwright.
14. **Play-all mode + voice audition + loading indicator + onboarding toast** are all IN-SCOPE, not nice-to-haves.
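
Decision 2 (chunking by assembled-body byte count) can be sketched as below. This is a minimal illustration, not the plan's implementation: the function name `chunk_ssml`, the `w{index}` mark naming, and the one-mark-per-word layout are assumptions made for the example. Only the 4500-byte target and the rule that `<mark>` overhead counts toward the budget come from the decision itself.

```python
from xml.sax.saxutils import escape

SSML_OPEN, SSML_CLOSE = "<speak>", "</speak>"


def chunk_ssml(words, max_bytes=4500):
    """Greedily pack words into SSML chunks, measured by assembled UTF-8 size.

    Hypothetical helper: the mark naming scheme (w0, w1, ...) is an assumption.
    """
    overhead = len((SSML_OPEN + SSML_CLOSE).encode("utf-8"))
    chunks, parts, size = [], [], overhead
    for index, word in enumerate(words):
        # The <mark> tag and XML escaping are part of what gets sent,
        # so they are measured, not just the source word.
        piece = f'<mark name="w{index}"/>{escape(word)} '
        piece_bytes = len(piece.encode("utf-8"))
        if overhead + piece_bytes > max_bytes:
            raise ValueError(f"word {index} alone exceeds the byte budget")
        if size + piece_bytes > max_bytes:
            # Close the current chunk before it would overflow.
            chunks.append(SSML_OPEN + "".join(parts) + SSML_CLOSE)
            parts, size = [], overhead
        parts.append(piece)
        size += piece_bytes
    if parts:
        chunks.append(SSML_OPEN + "".join(parts) + SSML_CLOSE)
    return chunks


sample = ("naïve robots navigate hallways & doors " * 200).split()
result = chunk_ssml(sample)
print(len(result), "chunks; largest is",
      max(len(c.encode("utf-8")) for c in result), "bytes")
```

Measuring the encoded, assembled string rather than the source text matters because each word's mark adds a fixed cost that can exceed the word itself, and multi-byte characters and escaped entities (`&amp;`) inflate the body further.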

## Files to Modify

In order (each depends on the prior being stable):

1. **`docs/readalong.js`** (NEW, ~1500 lines) — per Plan §1.2-v2. All classes: TTSCache, TTSClient, Reader, Loupe, Genie, Player, SettingsModal, Toast, NavDotSync, KeyboardController. Top-level `window.ReadAlong` with bootstrap on `DOMContentLoaded`.
2. **`scripts/port-readalong-v4.py`** (NEW): per Plan §1.3-v2 port contract. BS4 tokenizer, v3-stripping, lens-17 injection, CSP meta tag insertion, `*-sections.json` emission. CLI flags `--input`, `--output`, `--sections-json`, `--section-selector`, `--narrative-selector`.
3. **`.claude/commands/research_perspectives.md`** lines 591–623 — rewrite TTS section to mandate ReadAlong assets in every future doc.
4. **Tests** (`tests/readalong/js/*.test.js`, `tests/readalong/python/test_port_script.py`, `tests/readalong/e2e/e2e-readalong.spec.ts`).
5. **Run port script** against `docs/perspectives-vlm-primary-hybrid-nav-v3.html` → generates v4.
6. **Patch v1/v2/v3** `.version-links` divs — add `v4` link; flip v3's `.version-current` span to a plain `<a>`.
7. **Chrome verification** per Plan §Verification steps 1–17.
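
As a starting point for step 2, the port script's CLI surface might look like the sketch below. The five flag names come from the list above; everything else is assumed for illustration. In particular, the default `section.lens` is inferred from the `<section class="lens">` elements mentioned in the design decisions, and `.narrative` is a purely hypothetical placeholder. The plan's port contract is authoritative on required flags and defaults.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Argument parser sketch for scripts/port-readalong-v4.py (defaults are assumptions)."""
    parser = argparse.ArgumentParser(
        prog="port-readalong-v4.py",
        description="Port a perspectives HTML doc to the shared ReadAlong assets.",
    )
    parser.add_argument("--input", type=Path, required=True,
                        help="source HTML (e.g. the v3 doc)")
    parser.add_argument("--output", type=Path, required=True,
                        help="destination HTML (e.g. the v4 doc)")
    parser.add_argument("--sections-json", type=Path, required=True,
                        help="where to write the sections JSON")
    # Assumed default, inferred from <section class="lens">.
    parser.add_argument("--section-selector", default="section.lens",
                        help="CSS selector matching each section")
    # Hypothetical default; replace with the selector the plan specifies.
    parser.add_argument("--narrative-selector", default=".narrative",
                        help="CSS selector for the narrative root inside a section")
    return parser


args = build_parser().parse_args(
    ["--input", "v3.html", "--output", "v4.html", "--sections-json", "v4-sections.json"]
)
print(args.input, args.output, args.sections_json, args.section_selector)
```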

## Start Command

```bash
# 1. Read the plan
cat ~/.claude/plans/readalong-port.md

# 2. Read the already-written CSS to ground the JS contract
cat ~/workplace/her/her-os/docs/readalong.css

# 3. Read the v3 HTML to understand the DOM the JS + port script target
head -c 8000 ~/workplace/her/her-os/docs/perspectives-vlm-primary-hybrid-nav-v3.html
```

Then implement files in the order above. All adversarial findings are already folded into the plan — trust the plan, don't re-litigate decisions.

## Verification

1. Jest suite: `npm run test:readalong` — ≥80% line coverage on readalong.js.
2. pytest: `pytest tests/readalong/python/ -v --cov=scripts.port_readalong_v4 --cov-report=term` — ≥90% coverage.
3. Playwright: `npx playwright test tests/readalong/e2e/` — mocked TTS, light+dark themes, keyboard flow.
4. Port script idempotency: run the script twice with identical arguments and diff the outputs: `python3 scripts/port-readalong-v4.py <args>; mv v4.html v4a.html; python3 scripts/port-readalong-v4.py <args>; diff v4a.html v4.html` → empty diff.
5. Start local HTTP server on port 8767 (or reuse if still running): `python3 -m http.server 8767 --bind 127.0.0.1 &`.
6. Open `http://127.0.0.1:8767/docs/perspectives-vlm-primary-hybrid-nav-v4.html` in Chrome via MCP.
7. Per Plan §Verification: click lens-01 play → settings modal opens → user enters API key → audio plays → words highlight in sync → loupe follows → genie renders.
8. Replay lens-01 → cache hit (zero new `texttospeech.googleapis.com` requests in devtools Network panel).
9. Click any `.w` span → that word speaks.
10. Toggle theme (dark↔light) → `.w.active` still meets WCAG AA (verify with axe-core extension or `npx @axe-core/cli`).
11. Resize viewport to 375 px → loupe+genie hide, player reflows, voice/speed dropdowns disappear from player (visible only in Settings).
12. Keyboard: Shift+Space toggles play/pause; Esc stops; Shift+←/→ moves between sections; Tab lands on a `.w` span only if "Focus word mode" is enabled in Settings.
13. DevTools Performance recording during playback → CLS <0.01 (no per-word layout shift from font-weight change).
14. Lens-17 plays correctly (verifies port script's missing-button injection).
15. Version links: browser-verify that each of `v1.html`, `v2.html`, and `v3.html` now has a clickable `v4 (Apr ??)` entry in its `.version-links`.
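
The idempotency check in step 4 could also be automated, for example inside the pytest suite. The sketch below is one possible shape, assuming nothing about the real script: `rerun_is_stable` and `sha256_of` are hypothetical helper names, and the demo runs a stand-in command rather than the port script itself.

```python
import hashlib
import subprocess
import sys
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def rerun_is_stable(command: list[str], output_path: Path) -> bool:
    """Run `command` twice; True if `output_path` is byte-identical after both runs."""
    digests = []
    for _ in range(2):
        subprocess.run(command, check=True)  # raises if the command fails
        digests.append(sha256_of(output_path))
    return digests[0] == digests[1]


if __name__ == "__main__":
    # Stand-in for the port script: writes a fixed document to the given path.
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "v4.html"
        fake_port = [
            sys.executable, "-c",
            "import sys; open(sys.argv[1], 'w').write('<html></html>')",
            str(out),
        ]
        print("stable:", rerun_is_stable(fake_port, out))
```

Comparing digests instead of shelling out to `diff` keeps the check portable; typical sources of instability worth watching for are embedded timestamps and unordered set or dict iteration in the emitted JSON.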

## Deploy / Merge

- After Chrome verification passes, commit `docs/readalong.{css,js}`, `scripts/port-readalong-v4.py`, new v4 files, patched v1/v2/v3, skill command update, tests.
- Update MEMORY.md last-session entry.
- Update `docs/PROJECT.md` with ReadAlong entry under a new "Shared UI Assets" section.
- Follow-up docs to write:
  - `docs/NEXT-SESSION-READALONG-RETROFIT.md` — retrofit ReadAlong into v1/v2/v3 (verify DOM variance first per plan arch MED-10).
  - `docs/NEXT-SESSION-READALONG-NARRATIVE-PAGES.md` — retrofit into day-in-life narrative pages with scene-aware tokenizer.
  - `docs/NEXT-SESSION-CATALOG-CSP.md` — retrofit security-only (CSP meta + sessionStorage migration) into `docs/research-perspectives-catalog.html` and other dashboards.
