# Next Session: Phone Thinking Sounds

## What
Add "thinking sounds" to Annie's phone calls. While Annie processes a turn (STT → LLM → tools → TTS), the caller currently hears 2-5 seconds of dead silence. This change plays a short "Hmm..." verbal cue during that gap so the caller knows Annie is still there.

## Plan
Read the approved plan at `~/.claude/plans/precious-drifting-wirth.md`.
It has the full implementation, all 15 adversarial review findings (2 CRITICAL, 7 HIGH, 4 MEDIUM), and design decisions.

## Key Design Decisions (from adversarial review)

1. **Only "Hmm..." variants** — NOT sentence fillers like "Let me think..." which break when cut mid-word on fast LLM responses (~300ms TTFT). "Hmm..." sounds natural at any truncation point.
2. **ThinkingCueHandle dataclass** with `stop` + `done` Events — proper synchronization. The original 50ms `asyncio.sleep` was a race condition, not a barrier. `done.wait()` guarantees pw-play is dead before sentence 1 plays.
3. **try/finally cleanup** in conversation loop — prevents orphaned pw-play zombies on cancel_event or pipeline exception.
4. **Echo drain extended** — `(tts_duration_s + cue_duration) * _POST_ECHO_FACTOR` instead of just `tts_duration_s`. Prevents ghost turns from cue echo.
5. **thinking_cue threaded to ALL _playback_worker rounds** — prevents compaction-retry from losing the handle. `Event.set()` is idempotent, so round 2+ workers see it already set and skip.
6. **No barge-in during thinking cue** — cues are <1s, existing bargein_queue drain at line 668 handles the transition. Adding barge-in to a 1s cue adds complexity for negligible UX benefit.
7. **THINKING_CUES_ENABLED kill-switch** — env var in project .env on Titan.
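
The synchronization in decisions 2 to 5 can be sketched roughly as follows. This is an illustration only, not the code from the approved plan: it assumes `asyncio.Event` objects, simulates `pw-play` with a timed wait, and uses hypothetical names (`play_thinking_cue`, `run_turn`, `echo_drain_s`, and the `post_echo_factor` parameter standing in for `_POST_ECHO_FACTOR`).

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ThinkingCueHandle:
    # stop: set by the pipeline to ask the cue to end early.
    # done: set by the cue player once playback has actually ended.
    stop: asyncio.Event = field(default_factory=asyncio.Event)
    done: asyncio.Event = field(default_factory=asyncio.Event)


async def play_thinking_cue(handle: ThinkingCueHandle, cue_duration_s: float) -> None:
    """Stand-in for the pw-play subprocess: ends on stop or after the cue length."""
    try:
        await asyncio.wait_for(handle.stop.wait(), timeout=cue_duration_s)
    except asyncio.TimeoutError:
        pass  # the cue played to its natural end
    finally:
        # Signalled only after playback is over, so waiters get a real barrier.
        handle.done.set()


def echo_drain_s(tts_duration_s: float, cue_duration_s: float, post_echo_factor: float) -> float:
    """Echo drain window that also covers the cue (decision 4)."""
    return (tts_duration_s + cue_duration_s) * post_echo_factor


async def run_turn(cue_duration_s: float, pipeline_latency_s: float) -> list:
    """One simulated turn; returns the order in which playback events occurred."""
    order = []
    handle = ThinkingCueHandle()
    cue_task = asyncio.create_task(play_thinking_cue(handle, cue_duration_s))
    try:
        await asyncio.sleep(pipeline_latency_s)  # stand-in for STT -> LLM -> TTS
        handle.stop.set()         # idempotent: later rounds may call it again
        await handle.done.wait()  # barrier replacing the 50ms sleep
        order.append("cue_done")
        order.append("sentence_1")
    finally:
        # Runs on cancellation or pipeline errors too, so no cue player is orphaned.
        handle.stop.set()
        await cue_task
    return order
```

Calling `asyncio.run(run_turn(0.5, 0.01))` models a fast response that cuts the cue short; swapping the arguments models a cue that finishes before the response is ready. In both cases the first sentence is queued only after `done` is set.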

## Files to Modify (in order)

1. `services/annie-voice/phone_loop.py` — Add `ThinkingCueHandle`, `_play_thinking_cue()`, `_load_thinking_cues()`, modify `_playback_worker` + `_run_streaming_pipeline`, wire into conversation loop (~80 lines new, ~20 modified)
2. `services/annie-voice/generate_thinking_cues.py` (NEW) — One-time generation script (~50 lines)
3. `services/annie-voice/tests/test_thinking_cues.py` (NEW) — Unit + integration tests (~120 lines)
4. `services/annie-voice/.env.example` — Add THINKING_CUES_ENABLED, THINKING_CUES_DIR
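
One possible shape for reading the two new settings is sketched below. The function names, the default of "enabled", and the set of accepted "off" values are assumptions made for illustration; the default directory is taken from the path used in the verification steps. The real `_load_thinking_cues()` in the plan may differ.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional

# Assumption: these spellings all disable the cue; the plan may accept fewer.
_OFF_VALUES = {"0", "false", "no", "off"}


def thinking_cues_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Kill-switch check. Assumes cues are enabled when the variable is unset."""
    env = os.environ if env is None else env
    return env.get("THINKING_CUES_ENABLED", "true").strip().lower() not in _OFF_VALUES


def load_thinking_cues(env: Optional[Mapping[str, str]] = None) -> List[Path]:
    """Return the available cue WAVs, or an empty list if disabled or missing."""
    env = os.environ if env is None else env
    if not thinking_cues_enabled(env):
        return []
    cues_dir = Path(env.get("THINKING_CUES_DIR", "~/.her-os/annie/thinking-cues")).expanduser()
    if not cues_dir.is_dir():
        return []  # graceful fallback: the call proceeds without a cue
    return sorted(cues_dir.glob("*.wav"))
```

Returning an empty list in both the disabled and the missing-directory case lets the caller treat "no cues" as a single condition and skip playback.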

## Start Command
```
cat ~/.claude/plans/precious-drifting-wirth.md
```
Then implement the plan. All adversarial findings are already addressed in it. Follow the 10 implementation steps in order.

## Verification
1. `python generate_thinking_cues.py` → 3 WAVs created
2. `file ~/.her-os/annie/thinking-cues/*.wav` → "16000 Hz, Mono"
3. `python -m pytest tests/test_thinking_cues.py -v` → all green
4. `python -m pytest tests/ -q` → existing tests pass
5. Call Annie, ask "What's the weather?" → hear "Hmm..." then response
6. Call Annie, say "Hi" → fast response, cue cleanly cut or skipped
7. Barge-in during response → still works
8. `THINKING_CUES_ENABLED=false` → no cue (kill-switch)
9. Delete cues dir → no crash (graceful fallback)
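
Step 2 can also be scripted with the standard-library `wave` module, which is handy inside a test. The helper names below are hypothetical, and the 1-second ceiling is an assumption derived from the "cues are <1s" design decision.

```python
import wave
from pathlib import Path
from typing import Tuple


def describe_wav(path: Path) -> Tuple[int, int, float]:
    """Return (sample_rate_hz, channels, duration_s) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wf:
        rate = wf.getframerate()
        return rate, wf.getnchannels(), wf.getnframes() / rate


def cue_is_valid(path: Path, max_duration_s: float = 1.0) -> bool:
    """True if the file is 16000 Hz mono and shorter than max_duration_s."""
    rate, channels, duration_s = describe_wav(path)
    return rate == 16000 and channels == 1 and 0 < duration_s < max_duration_s
```

Running `cue_is_valid` over every file in the cues directory gives the same signal as the `file` check, plus a guard against an overlong cue.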