# Next Session: Streaming Nav — Close-Range Tuning + Telegram E2E + Systemd + Pipeline VLM

## What

Implement 4 fixes for the streaming vision navigation system deployed in session 79:
1. **Close-range over-rotation fix** (HIGH) — cap consecutive centering turns + forward/backward nudge
2. **Telegram E2E** (MEDIUM) — small Annie-side fix + full chain verification
3. **Panda systemd unit** (MEDIUM) — replace nohup with proper systemd user service
4. **Pipeline VLM with driving** (LOW) — overlap Pi execution with the next VLM call for forward commands

Session 79 E2E results: 3/4 runs found the red ball (14 and 65 cycles), but 1 run over-rotated 1200°+ at close range when the VLM confused reddish furniture with the ball. In addition, the Telegram E2E path is untested, panda-nav runs as a fragile nohup process, and forward cycle time is bottlenecked by blocking on Pi execution.

## Plan

**Path:** `~/.claude/plans/effervescent-foraging-beaver.md`

Read the plan first. It contains the full implementation, with every adversarial review finding already addressed (4 CRITICAL, 7 HIGH, 5 MEDIUM).

## Key Design Decisions (from adversarial review)

1. **Backward nudge fallback** — when the centering counter triggers but lidar blocks forward motion (a ball at 15-20 cm sets `safe_forward=False`), nudge backward for 0.3 s instead. Requires adding "backward" to `_execute_on_pi()`.
2. **Reset counter on ALL non-centering** — not just approaching/goal_reached. Prevents false nudge after a search turn interrupts a centering sequence.
3. **Update `_last_target_cycle` after nudge** — so hysteresis doesn't immediately trigger a search after the nudge fires.
4. **systemd user scope** — `WantedBy=default.target` not `multi-user.target`. Consistent `--user` in start.sh/stop.sh. Requires `loginctl enable-linger rajesh`.
5. **No ExecStartPre** — removed the `curl llama-server/health` pre-check. Server self-heals (returns 503, Annie falls back). Prevents restart loop during 30-120s model loading.
6. **Pipeline: cancel `_pending_pi_task` in `_stop_internal()` BEFORE closing `_pi_client`** — prevents a race where shutdown closes the httpx client while a fire-and-forget forward task is still using it.
7. **Stall detection grace period** — Annie's `_navigate_streaming()` skips stall detection until first cycle completes (was falsely declaring stall at startup).
8. **Static NAV_TOKEN** — dynamic per-session generation replaced with static token in EnvironmentFile. Annie must be restarted after token change (singleton client caches token at import time).

## Files to Modify

### Item 1 (over-rotation fix):
1. `services/panda_nav/server.py` — add `FWD_NUDGE`, `CENTERING_NUDGE_THRESHOLD` constants; add `_consecutive_centering` to `__init__`/`start()`; add backward to `_execute_on_pi`; add nudge logic in `_run_loop` else branch
2. `services/panda_nav/tests/test_nav_controller.py` — 9 new tests
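
The nudge behaviour from design decisions 1 to 3 can be sketched roughly as below. This is a minimal illustration, not the code in the plan: `NudgeState`, `apply_nudge`, `BACK_NUDGE`, the decision-kind strings, and all numeric values except the 0.3 s backward duration are assumptions made for the example, as is resetting the counter once a nudge fires.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Illustrative values only; the real constants live in the plan.
CENTERING_NUDGE_THRESHOLD = 3  # assumed: centering turns allowed in a row
FWD_NUDGE = 0.3                # assumed: forward nudge duration in seconds
BACK_NUDGE = 0.3               # backward fallback duration (0.3 s per the notes)


@dataclass
class NudgeState:
    """Hypothetical holder for the two fields the controller tracks."""
    consecutive_centering: int = 0
    last_target_cycle: int = 0


def apply_nudge(
    state: NudgeState, kind: str, cycle: int, safe_forward: bool
) -> Optional[Tuple[str, float]]:
    """Return a (command, duration_s) nudge, or None to run the decision unchanged."""
    if kind != "centering":
        # Reset on ALL non-centering decisions, including search turns.
        state.consecutive_centering = 0
        return None

    state.consecutive_centering += 1
    if state.consecutive_centering < CENTERING_NUDGE_THRESHOLD:
        return None

    # Threshold reached: break the rotation streak (assumed reset after firing).
    state.consecutive_centering = 0
    # Treat the nudge as a target sighting so hysteresis does not
    # immediately fall through to a search turn.
    state.last_target_cycle = cycle

    if safe_forward:
        return ("forward", FWD_NUDGE)
    # Lidar blocks forward motion at very close range, so back off instead.
    return ("backward", BACK_NUDGE)
```

In a loop shaped like this, the nudge replaces the centering turn for that cycle, so the robot changes its viewpoint instead of continuing to rotate on a misdetection.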

### Item 2 (Telegram E2E):
3. `services/annie-voice/robot_tools.py` — stall detection grace period in `_navigate_streaming()`
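
One way to picture the grace period is a detector that refuses to report a stall until it has seen at least one completed cycle. The sketch below is hypothetical: `StallDetector`, `STALL_TIMEOUT_S`, and the idea that the caller passes in a cycle counter and a timestamp are assumptions, not the actual structure of `_navigate_streaming()`.

```python
from typing import Optional

STALL_TIMEOUT_S = 10.0  # assumed value, for illustration only


class StallDetector:
    """Polling-style stall check with a startup grace period (hypothetical helper)."""

    def __init__(self, timeout_s: float = STALL_TIMEOUT_S) -> None:
        self.timeout_s = timeout_s
        self._last_cycle = 0
        # Stays None until the first cycle completes; that is the grace period.
        self._last_progress_at: Optional[float] = None

    def update(self, cycle: int, now: float) -> bool:
        """Record the latest cycle count and return True if navigation looks stalled."""
        if cycle > self._last_cycle:
            self._last_cycle = cycle
            self._last_progress_at = now
            return False
        if self._last_progress_at is None:
            # No cycle has completed yet, so skip stall detection entirely.
            return False
        return (now - self._last_progress_at) > self.timeout_s
```

A grace period written this way has no upper bound, so a run that never completes its first cycle would need a separate startup timeout to be caught.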

### Item 3 (systemd unit):
4. `config/systemd/panda-nav.service` — NEW file (user service)
5. `stop.sh` — add panda-nav to `stop_panda_services()`
6. `start.sh` — update `start_panda_nav()` to prefer systemd

### Item 4 (pipeline VLM):
7. `services/panda_nav/server.py` — add `_pending_pi_task`; modify `_stop_internal()`, `_run_loop` execution section, `finally` block
8. `services/panda_nav/tests/test_nav_controller.py` — 6 pipeline tests
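
The pipelining idea and the shutdown ordering from design decision 6 can be illustrated with plain `asyncio`. This is a sketch under assumptions: `FakePiClient` stands in for the real httpx client, and `Controller`, `execute`, and `stop` are hypothetical names rather than the methods in `server.py`. Only `_pending_pi_task` and `_pi_client` come from the notes above.

```python
import asyncio
from typing import List, Optional


class FakePiClient:
    """Stand-in for the Pi HTTP client (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.closed = False
        self.sent: List[str] = []

    async def send(self, command: str, duration_s: float) -> None:
        if self.closed:
            raise RuntimeError("client already closed")
        await asyncio.sleep(duration_s)  # simulates the Pi blocking while it drives
        self.sent.append(command)

    async def aclose(self) -> None:
        self.closed = True


class Controller:
    def __init__(self, pi_client: FakePiClient) -> None:
        self._pi_client = pi_client
        self._pending_pi_task: Optional[asyncio.Task] = None

    async def _await_pending(self) -> None:
        task, self._pending_pi_task = self._pending_pi_task, None
        if task is not None:
            await task

    async def execute(self, command: str, duration_s: float) -> None:
        # Never let two Pi commands overlap: drain the previous one first.
        await self._await_pending()
        if command == "forward":
            # Fire and forget, so the next VLM call can start while the Pi drives.
            self._pending_pi_task = asyncio.create_task(
                self._pi_client.send(command, duration_s)
            )
        else:
            await self._pi_client.send(command, duration_s)

    async def stop(self) -> None:
        # Cancel the in-flight task BEFORE closing the client it depends on.
        task, self._pending_pi_task = self._pending_pi_task, None
        if task is not None:
            task.cancel()
            await asyncio.gather(task, return_exceptions=True)
        await self._pi_client.aclose()
```

Only forward commands are backgrounded here, matching the scope of Item 4; turns still block, so the next frame is captured after the rotation has finished.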

## Start Command

```
cat ~/.claude/plans/effervescent-foraging-beaver.md
```

Then implement in order: Item 1 → Item 2 → Item 3 → Item 4. All adversarial findings are already addressed in the plan.

## Verification

1. Run `cd services/panda_nav && python -m pytest tests/ -v` — expect 61+ passing once Items 1 and 4 are both in (46 existing + 9 Item 1 + 6 Item 4; 55+ after Item 1 alone)
2. Run `cd services/annie-voice && python -m pytest tests/ -v` — expect 593+ passing
3. **Item 1 deploy:** commit → push → ssh panda → git pull → restart panda-nav → E2E with ball at 20cm
4. **Item 2 E2E:** Telegram "find the red ball" → verify goal_reached response
5. **Item 3 deploy:** follow 8-step deployment checklist in plan (generate token, install service, sync to Annie, verify auth)
6. **Item 4 deploy:** commit → push → restart → compare forward cycle times (expect ~3× faster)

## Known Deferred Items

- **NAV_TOKEN singleton caching in Annie** — `_nav_decide_client` caches token at creation. Requires Annie restart after token change. Future: add client invalidation.
- **Stall detection cycle-0 edge case** — mitigated by the grace period, but the underlying design is still polling-based. Future: refactor to event-driven stall detection.
