# Next Session: Fix All Sunday Demos (Empirical Tuning) — V2

> **Supersedes** `docs/NEXT-SESSION-SUNDAY-DEMO-FIX.md`. The V1 handoff was correct in
> spirit but had two structural problems that adversarial review caught:
> (1) it recommended a "WB test before calibration" step that would have invisibly
> contaminated the calibration phase via V4L2 device-state persistence, and
> (2) its "drop demos from menu" approach would have broken hard-coupled tests and
> blocked the commit. Both are fixed here.

## What
Tune the 8 TurboPi Sunday demos so **6/8 work reliably** before guests arrive
**Sunday April 12, 2026**. The infrastructure (DemoRunner, telegram inline
keyboard, headless runner, TrafficCop.py) is already deployed (session 50,
commits `1c8d2f5` + `47fb0fe`). What remains is empirical: calibrate LAB
ranges for red/blue/green balls, smoke-test each demo, and drop broken demos
from the Sunday menu via a runtime filter.

This is **NOT a build session.** It's a tuning + testing session with a
**human-in-the-loop holding balls in front of the car.** Time budget: **3–4 hours**.

## Plan
**Read this first:** [`~/.claude/plans/polymorphic-riding-turtle.md`](file:///home/rajesh/.claude/plans/polymorphic-riding-turtle.md)

The plan has the full implementation, the adversarial-review findings, and
all the design decisions. **Don't skip it.** The most important part is the
phase ordering, which was changed from the V1 handoff after a critical
V4L2 driver-state issue was caught in review (`Stage 2 — feature-dev:code-architect`).

## Key Design Decisions (from adversarial review — DO NOT REVERT)

1. **Calibration first, NEVER touch white balance.** The V1 handoff recommended a
   "WB test before calibration" step. **DO NOT DO THIS.** On UVC cameras, the V4L2
   white-balance controls (set through OpenCV's `CAP_PROP_AUTO_WB` and
   `CAP_PROP_WB_TEMPERATURE`) persist on the device across `cap.release()` and
   `cap.open()` until explicitly reset or until power cycle. A single read-only WB
   test pollutes the device for the rest of the
   session, and `lab_calibrate_pi.py` (verified at lines 48-65) and Hiwonder's
   `Camera.py` both rely on the default device state and do NOT reset WB. See
   plan §11 for the full driver-level explanation. Doing the WB fix safely
   requires a coordinated three-way change to `lab_calibrate_pi.py`, FrameGrabber,
   AND Hiwonder's `Camera.py`. That is not feasible in 3-4 hours, so it is deferred
   to a post-Sunday session.

2. **Drop broken demos via `MENU_DISABLED_DEMOS` runtime filter, NOT by editing
   `DEMO_BUTTONS`.** The naïve approach (commenting out buttons in
   `services/telegram-bot/car_demo_handler.py`) breaks **two strict tests** in
   `services/telegram-bot/tests/test_car_demo_handler.py:118` and `:128`
   (`test_buttons_cover_all_demos` uses strict equality between
   `DEMO_BUTTONS` keys and `DEMO_TO_SCRIPT.keys() | {"stop"}`;
   `test_chase_ball_has_three_color_variants` asserts exactly 3 chase_ball
   callbacks). Pre-commit pytest would block the deploy. The plan adds a single
   new constant `MENU_DISABLED_DEMOS: frozenset[str] = frozenset()` that filters
   at keyboard-build time. **3-line edit, zero impact on existing tests.** See
   plan §3d for the exact change and the new defensive test.

3. **Hailo safety daemon is OFF during demos.** `services/turbopi-server/main.py:349`
   short-circuits `_safety_estop_callback` when `_demo_runner.mode` is True.
   This is **deliberate** (UART contention — pyserial 3.5 TIOCEXCL means the
   demo subprocess and Hailo can't both write to `/dev/ttyAMA0`) but means there
   is **NO automatic hand-protection** during Chase Ball / Obstacle Dodge / Traffic
   Cop. **Brief guests verbally** before letting them touch the keyboard. See
   plan §10 for the verbal briefing script and operator checklist.

4. **Wall-clock self-check.** Pre-mortem #10 (HIGH likelihood) says "the model
   rabbit-holes on a single broken demo." The plan defends against this with
   explicit elapsed-time checks at every phase boundary:
   - Hard-stop session at 3.5 h elapsed (`elapsed > 12600`)
   - Drop any demo that burns >20 min (`single-demo elapsed > 1200`)
   - Phase caps: 60 min Phase 1, 30 min Phase 2, 90 min Phase 3
   See plan Phase 0 wall-clock protocol.

5. **Pre-flight 0.6 unconditionally rsyncs** the calibration script. The Pi
   copy at `/tmp/lab_calibrate_pi.py` may go stale relative to the repo. Always
   overwrite, never "scp if missing."

6. **Pre-flight 0.11 is end-to-end.** Drop a menu file → keyboard renders →
   tap Look Around → observe servo movement on `/health`. This proves the
   entire pipeline (Annie → telegram bot env vars → Pi auth → motor) is
   healthy before any motor work. The V1 "verify telegram bot process exists"
   check only gave a false sense of security.

7. **State-machine guard polls 20 iterations** (not 8). Video reclaim can take
   10–15 s, so the V1 handoff's 8 s window was too short by its own numbers.

8. **PID flip verification uses `grep -nF`** (literal-string match), not regex
   `.*`. Loose patterns hide zero-match seds where the file is unchanged but
   the verification appears to succeed.

9. **Sunday-morning recalibration includes an explicit `systemctl stop`** before
   each color's calibration run. The calibration script needs exclusive access to
   `/dev/video0`, which means turbopi-server must be stopped first. V1's Phase 5.4
   was missing this step.
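
The wall-clock limits in decision 4 can be sketched as a small checker. This is only an illustration, not the plan's actual Phase 0 protocol: the helper name `check_budget` and the returned action strings are hypothetical, while the thresholds are the ones listed above.

```python
SESSION_HARD_STOP_S = 12600   # 3.5 h hard stop, from decision 4
SINGLE_DEMO_CAP_S = 1200      # 20 min per demo, from decision 4
PHASE_CAPS_S = {1: 60 * 60, 2: 30 * 60, 3: 90 * 60}  # Phase 1/2/3 caps


def check_budget(now, session_t0, phase, phase_t0, demo_t0=None):
    """Return the action a phase-boundary check should take (hypothetical helper).

    All arguments are epoch seconds except `phase`, which is the phase number.
    """
    if now - session_t0 > SESSION_HARD_STOP_S:
        return "hard-stop session"
    if demo_t0 is not None and now - demo_t0 > SINGLE_DEMO_CAP_S:
        return "drop demo"
    cap = PHASE_CAPS_S.get(phase)
    if cap is not None and now - phase_t0 > cap:
        return "phase over budget"
    return "continue"
```

Calling it with `time.time()` at every phase boundary, and again before each retry of a failing demo, keeps the decision mechanical rather than a judgment call made mid-debugging.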

## Files Already Modified by Session 50 (DO NOT EDIT in this session)
- `services/turbopi-server/demo_runner.py` — `asyncio.current_task()` self-cancel guard fixed
- `services/turbopi-server/main.py` — _demo_mode guards, /health short-circuit, deadman skip
- `services/turbopi-server/pi-files/_headless_runner.py` — cv2 monkey-patch + board reuse
- `services/turbopi-server/pi-files/Functions/TrafficCop.py` — new Avoidance + LineFollower graft
- `services/turbopi-server/frame_grabber.py` — **DO NOT PATCH** for WB (see plan §11)

## Files to Modify in This Session (only if needed)
- `/home/pi/TurboPi/lab_config.yaml` — via `scripts/lab_calibrate_pi.py` (Phase 1, mandatory)
- `/home/pi/TurboPi/Functions/ColorTracking.py` — only if PID sign flip needed (§3c, fallback)
- `services/telegram-bot/car_demo_handler.py` — only if reduced menu needed, ONLY via the new `MENU_DISABLED_DEMOS` constant (§3d, fallback)
- `services/telegram-bot/tests/test_car_demo_handler.py` — only if §3d is triggered, add the new `test_menu_disabled_demos_filter` defensive test
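
For orientation, the runtime filter could look roughly like the sketch below. Plan §3d has the exact change; everything here apart from the `MENU_DISABLED_DEMOS` declaration is an assumption. The real `DEMO_BUTTONS` structure is not shown in this handoff, so a simple callback-id to label mapping is assumed, the entries are illustrative, and `build_menu` is a hypothetical name.

```python
from __future__ import annotations

# Assumed shape: callback id -> button label. The real DEMO_BUTTONS in
# car_demo_handler.py may differ; these entries are illustrative only.
DEMO_BUTTONS: dict[str, str] = {
    "look_around": "Look Around",
    "obstacle_dodge": "Obstacle Dodge",
    "traffic_cop": "Traffic Cop",
    "stop": "Stop",
}

# Empty by default, so the full menu renders and existing tests are unaffected.
MENU_DISABLED_DEMOS: frozenset[str] = frozenset()


def build_menu(disabled: frozenset[str] | None = None) -> list[tuple[str, str]]:
    """Hypothetical keyboard builder: skip disabled demos at build time only."""
    if disabled is None:
        disabled = MENU_DISABLED_DEMOS
    return [(key, label) for key, label in DEMO_BUTTONS.items() if key not in disabled]
```

Because `DEMO_BUTTONS` itself is never edited, a strict-equality test over its keys keeps passing while the rendered keyboard shrinks.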

## Pre-Session Environment Check (CRITICAL — V1 missed this)

**The Pi was offline at the start of the planning session** because Rajesh had switched it
off to save battery between sessions. **Always re-check Pi reachability** before the
session: power it on at least 5 minutes beforehand so it can boot and rejoin the WiFi.

**Before starting the next session, verify from the target host:**
```bash
ssh pi hostname                       # expect: pi-car
ssh pi "systemctl is-active turbopi-server"   # expect: active
curl -s http://192.168.68.61:8080/health | jq '.status,.demo_mode,.demo_phase,.throttled'
```

If any of these fail, fix connectivity FIRST. The plan's Phase 0 will refuse
to proceed and you'll waste a session restart on a 2-minute network check.

## Phase 0 Pre-Flight Already Completed (Session 51 at 07:32 IST + Session 52 at 07:39 IST)

**Session 51 ran 0.1–0.9** (07:32 IST, all PASS).
**Session 52 re-verified 0.1–0.3 + 0.6 + 0.8, and newly ran 0.10 + 0.11a** (07:39–07:43 IST).

Current status:
- **0.1–0.10: ALL PASS** (0.1–0.3, 0.6, 0.8, and 0.10 fresh from session 52 at 07:39 IST; 0.4, 0.5, 0.7, and 0.9 from session 51 at 07:32 IST).
- **0.11a** (drop menu trigger file on Titan): PASS — `car-demo-menu-effd9b8b.json` dropped
  into `~/.her-os/annie/task_results/`, `pending_handler.task_result_notification_loop`
  consumed it within its 60 s poll interval, `sendMessage` logged at 07:41:20 in
  `/tmp/telegram-bot.log`, file deleted.
- **0.11b** (Rajesh taps 🎯 Look Around, observes servo movement): **NOT YET VERIFIED —
  blocking gate for Phase 1.**

**New facts vs session 51:**
- `throttled=0` (was `0x50000` in session 50 — under-voltage condition resolved).
- `uptime_s=576` at 07:39 → Pi cold-booted ~10 min before the session.
- `sonar_healthy=true`, `distance_mm=772`.
- 43-char bearer token extracted cleanly with a sed-based parse. The raw `grep | cut`
  chain included the `ROBOT_API_TOKEN=` prefix and returned 59 chars → Unauthorized. Use
  `sed -n 's/.*ROBOT_API_TOKEN=\([^"[:space:]]*\).*/\1/p'` instead.
- Pre-flight 0.6 unconditional rsync sent 82 bytes (no content change; session 51's
  Pi copy was already in sync). Plan §ARCH-9 defense worked as designed.
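
The same token parse can be expressed in Python when the check is scripted rather than typed. This is a sketch under assumptions: `extract_token` is a hypothetical helper, and the sample text only guesses at the `override.conf` layout (an `Environment="ROBOT_API_TOKEN=..."` line). One difference from the sed form: `re.search` takes the first occurrence, whereas sed's greedy leading `.*` takes the last one on a line.

```python
import re

# Capture everything after the prefix up to a quote or whitespace,
# mirroring the character class used in the sed expression.
TOKEN_RE = re.compile(r'ROBOT_API_TOKEN=([^"\s]*)')


def extract_token(conf_text: str) -> str:
    """Return the value after ROBOT_API_TOKEN=, without prefix or quotes."""
    match = TOKEN_RE.search(conf_text)
    if match is None or not match.group(1):
        raise ValueError("ROBOT_API_TOKEN not found")
    return match.group(1)


# Placeholder content; the real override.conf layout is an assumption.
sample = '[Service]\nEnvironment="ROBOT_API_TOKEN=' + "x" * 43 + '"\n'
token = extract_token(sample)
print(len(token))
```

Checking the length before using the token would have caught the 59-char result immediately, before the `Unauthorized` response.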

**Plan bugs discovered during session 52 execution — MUST fix before Phase 3:**

1. **Plan §5 state-machine guard loop (lines 400-411) has an incorrect `jq -e`
   predicate.** It asks for `.phase=="idle"` and `.frame_grabber_healthy==true`, but
   the real `/health` contract has no `frame_grabber_healthy` field at all and uses
   `demo_phase` not `phase`. As written, the loop would NEVER succeed → "STATE MACHINE
   STUCK" would fire every iteration → spurious `systemctl restart` between every
   Phase 3 demo test. **Correct form:**
   ```bash
   if curl -s http://192.168.68.61:8080/health | jq -e '.demo_mode==false and .demo_phase=="idle" and .safety_daemon_healthy==true' > /dev/null 2>&1; then
     ready=1; break
   fi
   ```
   The plan file itself (`~/.claude/plans/polymorphic-riding-turtle.md`) was not
   edited — this handoff is the correction source of truth.

2. **API contract drift `/health` vs `/demo/status`.** `/health` returns
   `demo_phase: "idle"`, `demo_mode: false`. `/demo/status` returns `phase: "idle"`,
   `running: false`. Same semantic values, but different key names for both the
   phase and the running flag. Not urgent for Sunday, but post-Sunday cleanup:
   pick one contract and enforce it.
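
For a scripted guard, the corrected predicate from item 1 translates to Python as sketched below. The field names, the URL, and the 20-iteration limit come from this document; the helper names (`is_idle`, `wait_until_idle`, `fetch_health`) and the 1-second delay between polls are assumptions.

```python
import json
import time
import urllib.request

HEALTH_URL = "http://192.168.68.61:8080/health"


def is_idle(health: dict) -> bool:
    """Mirror the corrected jq predicate against a parsed /health payload."""
    return (
        health.get("demo_mode") is False
        and health.get("demo_phase") == "idle"
        and health.get("safety_daemon_healthy") is True
    )


def wait_until_idle(fetch, attempts: int = 20, delay_s: float = 1.0) -> bool:
    """Poll up to `attempts` times; `fetch` returns the parsed /health dict."""
    for _ in range(attempts):
        try:
            if is_idle(fetch()):
                return True
        except (OSError, ValueError):
            pass  # treat network or JSON errors as "not ready yet"
        time.sleep(delay_s)
    return False


def fetch_health() -> dict:
    """Fetch and parse /health from the Pi (assumes the URL above is reachable)."""
    with urllib.request.urlopen(HEALTH_URL, timeout=2) as resp:
        return json.load(resp)
```

A payload that only carries `phase` or `frame_grabber_healthy` fails `is_idle`, which is the behavior the original plan predicate produced by accident on every iteration.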

---

### Original Phase 0 table (session 51 results kept for history)

**Steps 0.1 through 0.9 already PASSED.** The next session can skip them and start
at **0.10 (token check + curl /demo/status)** and **0.11 (end-to-end telegram tap)**,
then go directly to Phase 1 calibration.

**Recorded results — verify they still apply by running 0.1/0.2/0.3 first:**

| Check | Result | Notes |
|---|---|---|
| 0.1 SSH reachable | ✅ `pi-car` | Pi 5 hostname |
| 0.2 turbopi-server active | ✅ `active` | systemd unit healthy |
| 0.3 `/health` clean | ✅ `status=ok, demo_mode=false, phase=idle, throttled=0, safety_daemon_healthy=true, lidar_healthy=true, sonar_healthy=true, distance_mm=766, pan=1500, tilt=1500, battery_v=0` | `battery_v=0` is **expected** — SDK's `get_battery()` returns None on this TurboPi variant; watch `throttled` instead |
| 0.4 lab_config.yaml exists | ✅ | Plus full content captured below |
| 0.5 yaml backup | ✅ | Saved to `/home/pi/TurboPi/lab_config.yaml.1775872932.bak` — restore via `cp` if calibration goes wrong |
| 0.6 calibration script rsync'd | ✅ | `/tmp/lab_calibrate_pi.py` (176 lines, executable). Re-rsync at start of next session in case of repo edits. |
| 0.7 video devices | ✅ | **19 `/dev/video*` nodes** (`video0,1,19-35`). This is normal for Pi 5 + UVC cameras + Hailo HAT — each UVC device exposes multiple v4l2 nodes. `cv2.VideoCapture(0)` still picks `/dev/video0` correctly. |
| 0.8 orphan demo pid | ✅ `CLEAN` | No leftover subprocess |
| 0.9 ColorTracking.py | ✅ 327 lines | At `/home/pi/TurboPi/Functions/ColorTracking.py` |
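| 0.10 token + auth curl | NOT YET RUN in session 51; PASS in session 52 | The `grep \| cut` chain originally recorded here returns the `ROBOT_API_TOKEN=` prefix along with the token (59 chars → Unauthorized). Use `TOKEN=$(ssh pi 'sudo cat /etc/systemd/system/turbopi-server.service.d/override.conf' \| sed -n 's/.*ROBOT_API_TOKEN=\([^"[:space:]]*\).*/\1/p')` then `curl -sH "Authorization: Bearer $TOKEN" http://192.168.68.61:8080/demo/status` |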
| 0.11 end-to-end telegram tap | NOT YET RUN | Drop menu file, watch keyboard render, tap Look Around, observe servo move |

### Current `/home/pi/TurboPi/lab_config.yaml` content (BEFORE recalibration)

```yaml
black:        # left untouched — calibration script only handles red/blue/green
  max: [115, 135, 135]
  min: [0, 0, 0]
blue:         # BROKEN: a.max=185 matches red/magenta, not blue
  max: [130, 185, 120]
  min: [0, 126, 0]
green:        # BROKEN: b.max=207 matches yellow-green, not pure green
  max: [252, 109, 207]
  min: [0, 48, 94]
red:          # WIDE: a.min=173 is permissive; never verified against real ball
  max: [255, 255, 255]
  min: [0, 173, 130]
white:        # left untouched
  max: [255, 255, 255]
  min: [193, 0, 0]
```

This confirms the V1 handoff's diagnosis: blue and green LAB ranges are clearly wrong
(blue.max[a]=185 is in the red/magenta region; green.max[b]=207 is yellow-green, not pure green).
Red.min[a]=173 is technically valid but wide — under a red-tinted camera it likely matches
walls/skin/shadows. **All three need recalibration in Phase 1.**

The `black` and `white` entries are not used by Chase Ball (the calibration script's
argparse only accepts `red|blue|green`). Leave them alone.

### Updated next-session start command

```bash
# 1. Re-verify Pi is still reachable (Rajesh may have switched it off again to save battery)
ssh pi hostname && ssh pi "systemctl is-active turbopi-server" && curl -s http://192.168.68.61:8080/health | jq '.demo_mode,.demo_phase,.throttled'

# 2. Re-rsync calibration script (defense against stale Pi copy)
rsync -av /home/rajesh/workplace/her/her-os/scripts/lab_calibrate_pi.py pi:/tmp/lab_calibrate_pi.py

# 3. Set wall-clock baseline for execution session
T0=$(date +%s)
echo "Execution T0=$T0 ($(date '+%H:%M:%S %Z'))"

# 4. Run pre-flight 0.10 (token + auth) and 0.11 (telegram tap), then go to Phase 1.
#    Phase 0 steps 0.4–0.9 are already verified and recorded above.
```

## Start Command
```bash
# 1. Read the plan in full (it's the source of truth)
cat ~/.claude/plans/polymorphic-riding-turtle.md

# 2. Set the wall-clock baseline
T0=$(date +%s)
echo "Session start: $(date), T0=$T0"

# 3. Run Phase 0 pre-flight checks (read-only, must all pass before destructive work)
#    See plan §4 Phase 0 for the 11-check list
```

**Stop and ask Rajesh** if any pre-flight check fails. Do not attempt to fix
without confirmation. The plan is structured around "fail loud, ask early"
because every prior session that burned time started with one silent
assumption violation that nobody caught until 90 min in.

## Verification (end-of-session exit criteria)
1. `curl -s http://192.168.68.61:8080/health | jq '.'` reports all daemons
   healthy, `demo_mode=false`, `demo_phase=idle`, `throttled=0` at rest.
2. `/home/pi/TurboPi/lab_config.yaml` has narrow ranges (spread ≤ 50 per
   channel on a/b axes) for **all three** colors, OR any wide-range color
   is in `MENU_DISABLED_DEMOS`.
3. Telegram keyboard contains only verified-working demos.
4. Each verified demo passes start → observe → stop → idle cycle.
5. ≥6 of 8 original demos work, OR menu is reduced to working subset with
   documentation of which were dropped and why.
6. Sunday-morning handoff (plan §8) is written to
   `docs/NEXT-SESSION-SUNDAY-DEMO-MORNING.md`.
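
Exit criterion 2 can be checked mechanically. The sketch below applies the "spread ≤ 50 on the a/b axes" rule to the pre-calibration values recorded earlier in this document; the helper names `ab_spread` and `wide_colors` are hypothetical, and the values are inlined rather than loaded from the YAML file to keep the example dependency-free.

```python
# Values copied from the pre-calibration lab_config.yaml (order: L, a, b).
LAB_CONFIG = {
    "red":   {"min": [0, 173, 130], "max": [255, 255, 255]},
    "blue":  {"min": [0, 126, 0],   "max": [130, 185, 120]},
    "green": {"min": [0, 48, 94],   "max": [252, 109, 207]},
}
MAX_SPREAD = 50  # exit criterion 2: spread <= 50 on the a and b axes


def ab_spread(entry: dict) -> tuple[int, int]:
    """Return (a spread, b spread) for one color entry."""
    lo, hi = entry["min"], entry["max"]
    return hi[1] - lo[1], hi[2] - lo[2]


def wide_colors(config: dict, limit: int = MAX_SPREAD) -> list[str]:
    """Colors whose a or b spread exceeds the limit."""
    return [color for color, entry in config.items() if max(ab_spread(entry)) > limit]
```

On the pre-calibration values all three colors exceed the limit (for example, blue spans 59 on a and 120 on b), which matches the diagnosis that every color needs recalibration. After Phase 1, any color still returned by `wide_colors` is a candidate for `MENU_DISABLED_DEMOS`.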

## Sunday-Morning Phase (separate ~20-min session)
Plan Phase 4 runs Sunday morning before guests arrive. It includes:
- Pi power + WiFi check
- Battery + throttle check
- **Recalibrate all three colors in actual demo-room lighting** (lighting
  can shift dramatically from Saturday evening to Sunday morning)
- 3-demo smoke test (Red Ball, Look Around, Obstacle Dodge)
- Menu re-spawn

Allocate 20 minutes Sunday morning. If anything fails on the smoke test,
drop the affected demo via `MENU_DISABLED_DEMOS` and re-deploy.

## Adversarial Review Trail
This session's plan went through `planning-with-review` Stages 0-8:
- **Stage 0** — Prior lessons from sessions 38, 41, 46, 47, 50 + V1 handoff
- **Stage 1** — Initial draft with 5 phases
- **Stage 1C** — Pre-mortem (12 failure scenarios)
- **Stage 2** — Architecture destruction review (`feature-dev:code-architect`):
  10 findings — 2 CRITICAL, 4 HIGH, 4 MEDIUM. All implemented.
- **Stage 3** — Code-quality destruction review (`feature-dev:code-reviewer`):
  7 findings — 3 CRITICAL, 3 IMPORTANT, 1 LOW. All implemented.
- **Stage 6** — Feedback resolution (plan §12 has the full table)
- **Stage 8** — This document

The plan went from 6 phases to 5 phases (one entire WB phase removed due to
ARCH-2/ARCH-5), added 4 new sections (Sunday safety briefing, deferred work,
review resolution, this handoff), and added 3 enforcement mechanisms
(wall-clock self-check, runtime menu filter, end-to-end pre-flight tap).

## Post-Session
After implementation + deploy + verification, run the post-implementation
retrospective from `planning-with-review` Stage Post: tag each adversarial
finding as HIT / MISS / PARTIAL / N/A and update memory if patterns emerge.
Sessions 38, 41, 46, 47, 50 have shown a clear pattern (V4L2 device-state,
pyserial TIOCEXCL, asyncio lock binding, env var loss across hosts) that
deserves promotion to CLAUDE.md once 3+ HITs accumulate on the same gotcha.