# Next Session: Speaker-Aware Barge-In — Layer 2 (Resume on Reject)

## What

When an intruder voice (a podcast, a TV, another person) triggers a barge-in, Annie currently goes silent for no reason: the speaker gate rejects the transcription, but the TTS has already been cancelled. This session implements "Resume on Reject": when the gate rejects an intruder, Annie resumes speaking from where she was interrupted. The enrolled speaker's happy path incurs no added latency.

## Plan

`~/.claude/plans/velvety-percolating-plum.md`

**Read the plan first** — it has the full architecture, all adversarial review findings (13 issues, all addressed), state machine, pre-mortem, and implementation order.

## Key Design Decisions (from adversarial review)

1. **Side channel, NOT pipeline frames.** Both reviewers caught that `SpeakerRejectedFrame` flowing through the pipeline would be lost — Pipecat's `_start_interruption()` drains all process queues after interruption. The fix: `ResumeChannel` (asyncio.Queue) shared directly between SpeakerGateProcessor and ResponseTracker. Bypass the frame pipeline entirely.

2. **Fresh `threading.Event` per `run_tts` call.** The GPU synthesis thread checks `cancel.is_set()` at chunk boundaries. If a new `run_tts` call clears the SAME event, the old, still-running thread gets "un-cancelled." Fix: create a new Event per call. The old thread keeps its reference to the old (set) event.

3. **Sequential offset matching for spoken sentences.** `str.find()` fails on repeated phrases ("OK. OK. OK."). Fix: `startswith()` at the current offset position — each sentence is matched at the START of remaining text, not anywhere in full_text.

4. **Dual rate limit: max 3 resumes per response + min 2s between.** A per-response count alone misses the cadence — 3 resumes in 0.5s is a broken echo loop. A time-based floor prevents rapid-fire resume storms from TV audio.

5. **ResponseTracker background task for verdict wait.** `_handle_interruption()` starts `asyncio.create_task(self._wait_for_gate_verdict())`. The task awaits `channel.wait_for_verdict(timeout=10.0)` (a 10-second timeout). On rejection → resume. On acceptance → clear state. On timeout → clear state.

6. **Pipecat base `process_frame()` does NOT push frames.** Verified at `frame_processor.py:670-688`. It only does observer callbacks and handles system frames (InterruptionFrame, StartFrame, etc.). Safe to call `super().process_frame()` then push manually — same pattern as `speaker_gate.py`.
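
As a rough illustration of decisions 1 and 5, the side channel and the verdict wait might look like the sketch below. The `GateVerdict` member names, the `signal()` method, the `None`-on-timeout convention, and the `wait_and_decide()` helper are all assumptions made for this example; the plan holds the actual definitions.

```python
import asyncio
from enum import Enum
from typing import Optional


class GateVerdict(Enum):
    # Hypothetical member names; the plan defines the real ones.
    ACCEPTED = "accepted"
    REJECTED = "rejected"


class ResumeChannel:
    """Side channel from the speaker gate to the response tracker.

    Verdicts travel through an asyncio.Queue instead of pipeline frames,
    so an interruption that drains the pipeline queues cannot drop them.
    """

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()

    def signal(self, verdict: GateVerdict) -> None:
        # Called by the gate on accept/reject (assumed method name).
        self._queue.put_nowait(verdict)

    async def wait_for_verdict(self, timeout: float = 10.0) -> Optional[GateVerdict]:
        # Returns None when no verdict arrives within `timeout` seconds.
        try:
            return await asyncio.wait_for(self._queue.get(), timeout)
        except asyncio.TimeoutError:
            return None


async def wait_and_decide(channel: ResumeChannel, timeout: float = 10.0) -> str:
    """Stand-in for the tracker's background task: map a verdict to an action."""
    verdict = await channel.wait_for_verdict(timeout)
    if verdict is GateVerdict.REJECTED:
        return "resume"
    # Acceptance and timeout both clear the saved resume state.
    return "clear"
```

In the tracker, the equivalent of `wait_and_decide()` would run inside the task created by `asyncio.create_task(...)`, so the interruption handler itself never blocks on the gate.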
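
Decision 2 can be illustrated with a small stand-in for the TTS service. `TTSRunner`, its `cancel()` method, and the list-based output are hypothetical and exist only to show the per-call event; the real `run_tts` in `kokoro_tts.py` will differ.

```python
import threading
import time


class TTSRunner:
    """Hypothetical stand-in for the TTS service; only cancel handling matters."""

    def __init__(self) -> None:
        self._cancel = threading.Event()

    def cancel(self) -> None:
        # Sets the event owned by the most recent run_tts call.
        self._cancel.set()

    def run_tts(self, chunks, out):
        # New Event per call: an older thread keeps its own (already set)
        # event, so starting a new call can never "un-cancel" it.
        cancel = threading.Event()
        self._cancel = cancel

        def synthesize():
            for chunk in chunks:
                if cancel.is_set():  # checked at chunk boundaries
                    return
                out.append(chunk)
                time.sleep(0.01)  # stands in for GPU synthesis time

        thread = threading.Thread(target=synthesize)
        thread.start()
        return thread
```

Had `run_tts` called `self._cancel.clear()` on a shared event instead, a cancelled thread still inside its loop would see `is_set()` return `False` at the next chunk boundary and keep synthesizing.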
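
The sequential matching in decision 3 can be sketched as follows. The function names are illustrative, and the sketch assumes the spoken sentences arrive stripped of surrounding whitespace and in the order they were spoken.

```python
from typing import List


def spoken_offset(full_text: str, spoken_sentences: List[str]) -> int:
    """Return the offset in full_text just past the sentences already spoken."""
    offset = 0
    for sentence in spoken_sentences:
        # Skip the whitespace separating sentences.
        while offset < len(full_text) and full_text[offset].isspace():
            offset += 1
        # Match only at the START of the remaining text, never elsewhere.
        if not full_text.startswith(sentence, offset):
            break  # mismatch: stop here rather than guess
        offset += len(sentence)
    return offset


def remaining_text(full_text: str, spoken_sentences: List[str]) -> str:
    """Text still to be spoken when the response resumes."""
    return full_text[spoken_offset(full_text, spoken_sentences):].lstrip()
```

For `"OK. OK. OK. Let's go."` with two sentences spoken, this leaves `"OK. Let's go."`, whereas `str.find("OK.")` would return offset 0 for every sentence.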
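
A minimal sketch of the dual rate limit in decision 4 is shown below. `ResumeLimiter` and its method names are hypothetical, and keeping the time floor across responses in `reset()` is an assumption that the plan may settle differently. The injectable clock is there so tests do not have to sleep.

```python
import time
from typing import Callable, Optional


class ResumeLimiter:
    """Allow a resume only if both the per-response cap and the time floor permit it."""

    def __init__(
        self,
        max_resumes: int = 3,
        min_interval_s: float = 2.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_resumes = max_resumes
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._count = 0
        self._last_resume: Optional[float] = None

    def reset(self) -> None:
        # Call when a new response starts. Only the count is reset; the time
        # floor is kept across responses (an assumption of this sketch).
        self._count = 0

    def allow(self) -> bool:
        now = self._clock()
        if self._count >= self._max_resumes:
            return False  # per-response cap reached
        if self._last_resume is not None and now - self._last_resume < self._min_interval_s:
            return False  # too soon after the previous resume
        self._count += 1
        self._last_resume = now
        return True
```

A denied attempt does not count toward the cap, so a burst of rejections inside the 2-second window uses up none of the three allowed resumes.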

## Files to Modify

1. **`services/annie-voice/resume_channel.py`** (NEW) — GateVerdict, ResumeChannel, ResumeState
2. **`services/annie-voice/speaker_gate.py`** (MODIFY) — Add resume_channel param, signal on accept/reject
3. **`services/annie-voice/response_tracker.py`** (NEW) — ResponseTracker FrameProcessor
4. **`services/annie-voice/kokoro_tts.py`** (MODIFY) — Add SentenceTrackingKokoroTTS subclass
5. **`services/annie-voice/bot.py`** (MODIFY) — Wire ResumeState + ResumeChannel + components
6. **`services/annie-voice/tests/test_resume_channel.py`** (NEW) — Channel + state tests
7. **`services/annie-voice/tests/test_response_tracker.py`** (NEW) — Tracker + resume tests
8. **`services/annie-voice/tests/test_speaker_gate.py`** (MODIFY) — Channel signaling tests
9. **`services/annie-voice/tests/test_sentence_tracking_tts.py`** (NEW) — TTS wrapper tests

## Start Command

```
cat ~/.claude/plans/velvety-percolating-plum.md
```

Then implement the plan. All adversarial findings are already addressed in it.

## Verification

1. `cd services/annie-voice && python -m pytest tests/ -x -q` — all pass, 0 failures
2. Deploy: commit + push + `ssh titan 'cd ~/workplace/her/her-os && git pull'`
3. Restart: `./stop.sh annie && ./start.sh annie`
4. **E2E Intruder Resume**: `https://voice.her-os.app` → Ask multi-sentence question → play podcast while Annie speaks → verify Annie pauses briefly then resumes
5. **E2E Enrolled Barge-in**: Speak while Annie talks → verify normal barge-in (no resume)
6. **E2E Cascade**: Continuous podcast → verify max 3 resumes then Annie stays silent
7. All 2750+ existing tests pass
