# Next Session: Chatterbox Turbo English TTS + Language-Aware Dual-TTS Routing

## What

Deploy Chatterbox Turbo as Annie's English TTS (MIT license, voice cloning and emotion control, reportedly beats ElevenLabs) alongside the existing IndicF5 for Kannada. Build a language-aware `AutoTTSService` that routes each utterance to the correct TTS backend based on the detected language. Keep Kokoro as the English fallback and rollback path. This gives Annie Samantha's voice in both English and Kannada.

## Plan

`~/.claude/plans/functional-wobbling-bentley.md`

**Read the plan first** — it has the full implementation, all 20 adversarial review findings (6 CRITICAL, 8 HIGH, 6 MEDIUM — all addressed), state machine diagram, pre-mortem failure analysis, AutoTTSService class design, TTSBackend protocol design, and VRAM budget calculations.

## Key Design Decisions (from adversarial review)

1. **VRAM hard gate in Phase 0** — Measure Chatterbox actual VRAM on Panda. If >5 GB, ABORT entire plan and keep Kokoro. Budget must leave >3.6 GB free on Panda (16 GB total).

2. **Perth watermark phone test in Phase 1, NOT Phase 8** — Test Chatterbox audio through BT HFP codec (16kHz) BEFORE building any server infrastructure. If Perth watermark produces audible artifacts on phone calls, abort Chatterbox.

3. **AutoTTSService wrapper, NOT pipeline reconstruction** — Pipecat pipeline has a single TTS slot. `AutoTTSService(TTSService)` holds both `ChatterboxTTSService` and `IndicF5TTSService` internally, delegates `run_tts()` per-utterance based on `detect_tts_backend(text)`. Proxies `_handle_interruption` to whichever sub-service is currently active via `_active` field.

4. **Kannada fallback = ErrorFrame, NOT English TTS** — If IndicF5 is unreachable, Kannada text yields `ErrorFrame`. Do NOT pipe Kannada through Chatterbox (produces garbage/silence). Phone path: pre-recorded "having trouble" message.

5. **TTSBackend protocol for phone_audio.py** — Refactors the God Object `PhoneTTS` to use a `TTSBackend` protocol. `KokoroBackend`, `IndicF5Backend`, `ChatterboxBackend`, `AutoRoutingBackend` are separate classes. Factory function validates backend name.

6. **Eager model pre-load in chatterbox_server.py** — `@app.on_event("startup")` MUST call `_get_model()` eagerly (same as IndicF5's `_validate_startup`). Prevents concurrent model load OOM when both servers start on Panda.

7. **BF16 autocast for Chatterbox IF Vocos vocoder** — Chatterbox may use Vocos (same as IndicF5). Verify in Phase 0.7. If yes, MUST use `torch.autocast("cuda", dtype=torch.bfloat16)`. FP16 breaks Vocos (ComplexHalf noise).

8. **Hindi routes to Chatterbox (English), NOT IndicF5** — Language router only checks Kannada Unicode range (U+0C80-U+0CFF). Hindi/Devanagari → Chatterbox. Revisit when Hindi reference clip is available.

9. **Kokoro stays runnable** as rollback. `TTS_BACKEND=kokoro` remains valid. Don't remove Kokoro from start.sh options.

10. **Fish Speech S2 bakeoff deferred** to a separate session. Fabricated API, unknown Kannada quality, VRAM cliff, and CC-BY-NC-SA license make it too risky to bundle.

11. **Sequential server startup** in start.sh: Chatterbox first → wait for `model_loaded: true` → then IndicF5. Prevents concurrent model load OOM.

12. **phone_audio.py uses synchronous `requests`** (NOT async httpx). ChatterboxBackend must follow this pattern.
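
Decisions 4 and 8 above pin down the routing rule closely enough to sketch. The block below is a minimal illustration, not the plan's implementation. The backend labels (`"indicf5"`, `"chatterbox"`, `"kokoro"`), the `route()` helper, and the `BackendUnavailable` exception (a stand-in for yielding `ErrorFrame`) are assumptions. It also assumes that any single Kannada character sends the whole utterance to IndicF5, which the plan may refine.

```python
from typing import Collection

KANNADA_START = 0x0C80  # first code point of the Unicode Kannada block
KANNADA_END = 0x0CFF    # last code point of the Unicode Kannada block


class BackendUnavailable(Exception):
    """Hypothetical stand-in for yielding an ErrorFrame in the pipeline."""


def detect_tts_backend(text: str) -> str:
    """Return "indicf5" if the text contains Kannada script, else "chatterbox".

    Only the Kannada block is checked, so Hindi (Devanagari) and every other
    script route to Chatterbox, as decision 8 requires.
    """
    for ch in text:
        if KANNADA_START <= ord(ch) <= KANNADA_END:
            return "indicf5"
    return "chatterbox"


def route(text: str, available: Collection[str]) -> str:
    """Pick a backend for one utterance, applying the fallback rules.

    English may fall back to Kokoro. Kannada never falls back to an
    English model: it raises instead (decision 4).
    """
    backend = detect_tts_backend(text)
    if backend in available:
        return backend
    if backend == "chatterbox" and "kokoro" in available:
        return "kokoro"
    raise BackendUnavailable(f"no usable backend for {backend!r} text")
```

Keeping detection a pure function of the text makes the roughly ten router tests cheap to write, and the same function can serve both the Pipecat wrapper and the phone path.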

## Files to Create

1. `services/annie-voice/chatterbox_server.py` — FastAPI TTS server (port 8772, auth, semaphore, eager pre-load)
2. `services/annie-voice/chatterbox_tts.py` — Pipecat TTSService HTTP bridge (barge-in, chunked audio)
3. `services/annie-voice/language_router.py` — `detect_tts_backend(text)` Unicode-based routing
4. `services/annie-voice/auto_tts.py` — `AutoTTSService` multiplexing wrapper
5. `services/annie-voice/tts_backends.py` — `TTSBackend` protocol + `KokoroBackend`, `IndicF5Backend`, `ChatterboxBackend`, `AutoRoutingBackend`, `create_backend()` factory
6. `tests/test_chatterbox_server.py` (~12 tests)
7. `tests/test_chatterbox_tts.py` (~8 tests)
8. `tests/test_language_router.py` (~10 tests)
9. `tests/test_auto_tts.py` (~8 tests)
10. `tests/test_tts_backends.py` (~10 tests)
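
For `tts_backends.py` (item 5 above), the protocol-plus-factory shape could look roughly like the sketch below. The class names and `create_backend()` come from the list above. Everything else is an assumption for illustration: the `synthesize()` method name and signature, the backend name strings, and the stub bodies, which return tagged bytes instead of calling a real server. The method is synchronous to match decision 12.

```python
from typing import Callable, Dict, Protocol


class TTSBackend(Protocol):
    """Assumed interface: synchronous text in, audio bytes out."""

    def synthesize(self, text: str) -> bytes: ...


class _StubBackend:
    """Placeholder body; a real backend would POST to its TTS server."""

    label = "stub"

    def synthesize(self, text: str) -> bytes:
        return f"{self.label}:{text}".encode("utf-8")


class KokoroBackend(_StubBackend):
    label = "kokoro"


class IndicF5Backend(_StubBackend):
    label = "indicf5"


class ChatterboxBackend(_StubBackend):
    label = "chatterbox"


class AutoRoutingBackend:
    """Routes each utterance by script: Kannada to one backend, rest to the other."""

    def __init__(self, english: TTSBackend, kannada: TTSBackend) -> None:
        self._english = english
        self._kannada = kannada

    def synthesize(self, text: str) -> bytes:
        is_kannada = any(0x0C80 <= ord(ch) <= 0x0CFF for ch in text)
        backend = self._kannada if is_kannada else self._english
        return backend.synthesize(text)


# Backend name strings are assumptions; only "kokoro" appears in the notes.
_BACKENDS: Dict[str, Callable[[], TTSBackend]] = {
    "kokoro": KokoroBackend,
    "indicf5": IndicF5Backend,
    "chatterbox": ChatterboxBackend,
    "auto": lambda: AutoRoutingBackend(ChatterboxBackend(), IndicF5Backend()),
}


def create_backend(name: str) -> TTSBackend:
    """Validate the backend name and build the matching backend."""
    key = name.strip().lower()
    if key not in _BACKENDS:
        raise ValueError(
            f"unknown TTS backend {name!r}; expected one of {sorted(_BACKENDS)}"
        )
    return _BACKENDS[key]()
```

Failing fast on an unknown name means a typo in `TTS_BACKEND` surfaces at startup rather than on the first phone call.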

## Files to Modify

1. `services/annie-voice/bot.py` — Add chatterbox/auto backends, import AutoTTSService
2. `services/annie-voice/phone_audio.py` — Refactor to TTSBackend protocol
3. `services/annie-voice/phone_loop.py` — Use create_backend factory
4. `start.sh` — Add start/stop_chatterbox_server, sequential startup
5. `docs/RESOURCE-REGISTRY.md` — Panda VRAM section update
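
The sequential startup that `start.sh` needs (Chatterbox first, wait for `model_loaded: true`, then IndicF5) comes down to a readiness poll. The sketch below shows that logic in Python purely for illustration, since the real change lives in shell. The function names, timeout defaults, and injected callables are assumptions. `fetch_status` stands for whatever returns the server's status payload as a dict.

```python
import time
from typing import Any, Callable, Mapping


def wait_until_loaded(
    fetch_status: Callable[[], Mapping[str, Any]],
    timeout_s: float = 120.0,
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the status payload reports model_loaded == True.

    Returns False if the deadline passes first. Errors from fetch_status
    are treated as "not ready yet", since the server may not be listening.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if fetch_status().get("model_loaded") is True:
                return True
        except Exception:
            pass  # connection refused or bad payload: keep polling
        if clock() >= deadline:
            return False
        sleep(poll_s)


def start_sequentially(
    start_chatterbox: Callable[[], None],
    chatterbox_status: Callable[[], Mapping[str, Any]],
    start_indicf5: Callable[[], None],
    **wait_kwargs: Any,
) -> None:
    """Start IndicF5 only after Chatterbox has finished loading its model."""
    start_chatterbox()
    if not wait_until_loaded(chatterbox_status, **wait_kwargs):
        raise RuntimeError("Chatterbox did not report model_loaded in time")
    start_indicf5()
```

Refusing to start IndicF5 after a timeout is deliberate: launching the second model while the first is still loading is the concurrent-load OOM the plan is trying to avoid.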

## Start Command

```bash
cat ~/.claude/plans/functional-wobbling-bentley.md
```

Then implement the plan. **Start with Phase 0 (VRAM verification) — it's a hard gate.** If Chatterbox VRAM > 5 GB on Panda, abort and keep Kokoro.
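
The Phase 0 arithmetic is simple enough to write down as a check. This is a sketch under stated assumptions: the thresholds (5 GB limit, more than 3.6 GB free, 16 GB total) come from the notes, while the function names, the `other_resident_gb` parameter, and the idea of diffing `nvidia-smi` readings before and after model load are illustrative. It treats GB as GiB (MiB / 1024), which the notes do not specify.

```python
from typing import Tuple

CHATTERBOX_LIMIT_GB = 5.0  # abort threshold from the plan
MIN_FREE_GB = 3.6          # headroom that must remain free on Panda
PANDA_TOTAL_GB = 16.0      # total VRAM on Panda


def parse_nvidia_smi_mib(line: str) -> Tuple[int, int]:
    """Parse one "used, total" line (values in MiB) as printed by:

    nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits
    """
    used, total = (int(part.strip()) for part in line.split(","))
    return used, total


def vram_gate(
    chatterbox_gb: float,
    other_resident_gb: float,
    total_gb: float = PANDA_TOTAL_GB,
) -> Tuple[bool, str]:
    """Return (passed, reason) for the Phase 0 hard gate."""
    if chatterbox_gb > CHATTERBOX_LIMIT_GB:
        return False, f"Chatterbox uses {chatterbox_gb:.1f} GB (> {CHATTERBOX_LIMIT_GB} GB)"
    free_gb = total_gb - other_resident_gb - chatterbox_gb
    if free_gb <= MIN_FREE_GB:
        return False, f"only {free_gb:.1f} GB free (need > {MIN_FREE_GB} GB)"
    return True, f"ok: {free_gb:.1f} GB free"


# Hypothetical readings, not measurements: used MiB before and after loading.
before_used, _ = parse_nvidia_smi_mib("7168, 16384")
after_used, _ = parse_nvidia_smi_mib("11264, 16384")
chatterbox_gb = (after_used - before_used) / 1024
passed, reason = vram_gate(chatterbox_gb, other_resident_gb=before_used / 1024)
```

Returning a reason string alongside the boolean gives the abort decision something concrete to record in `docs/RESOURCE-REGISTRY.md`.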

All 20 adversarial findings are already addressed in the plan.

## Verification

1. Phase 0 gate: Chatterbox installs, VRAM ≤ 5 GB with > 3.6 GB left free on Panda, output format compatible
2. Phase 1 gate: Samantha identity preserved, RTF < 0.5, no Perth+HFP artifacts
3. Tests: `cd services/annie-voice && python -m pytest tests/ -x -q` — all pass
4. Phone E2E: Call Annie in English → Chatterbox Samantha voice
5. Phone E2E: Call Annie in Kannada → IndicF5 Samantha voice
6. Barge-in: Interrupt mid-sentence both languages → clean cancel
7. Security: Unauthenticated POST to Panda:8772 → 401
8. Fallback: Stop Chatterbox → Kokoro fallback. Stop IndicF5 → ErrorFrame for Kannada.
9. Rollback: TTS_BACKEND=kokoro → pre-Chatterbox state
10. RESOURCE-REGISTRY.md matches nvidia-smi on Panda
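
The RTF threshold in the Phase 1 gate is the ratio of synthesis time to the duration of the audio produced, so values below 1.0 are faster than real time. A minimal sketch of the calculation follows. The function names and the sample-count interface are assumptions, and the numbers in the usage lines are made up for illustration.

```python
RTF_LIMIT = 0.5  # Phase 1 gate: RTF must be below this value


def real_time_factor(
    synthesis_seconds: float, num_samples: int, sample_rate_hz: int
) -> float:
    """RTF = time spent synthesizing / duration of the generated audio."""
    if num_samples <= 0 or sample_rate_hz <= 0:
        raise ValueError("need a positive sample count and sample rate")
    audio_seconds = num_samples / sample_rate_hz
    return synthesis_seconds / audio_seconds


def passes_rtf_gate(rtf: float, limit: float = RTF_LIMIT) -> bool:
    """True when synthesis is fast enough for the Phase 1 gate."""
    return rtf < limit


# Illustrative numbers only: 0.75 s to synthesize 72000 samples at 24000 Hz.
rtf = real_time_factor(0.75, num_samples=72000, sample_rate_hz=24000)
ok = passes_rtf_gate(rtf)
```

In practice the synthesis time would be measured with `time.perf_counter()` around the TTS call, and averaged over several utterances so a single warm-up run does not decide the gate.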
