# Next Session: Docker Sandbox for Annie's execute_python

## Context

Session 366 sandboxed the Claude CLI used via Telegram (10 commits, 5/5 e2e tests pass). Verification then showed that Annie's `execute_python` tool has **zero sandboxing**: arbitrary Python code runs on the Titan host as the `rajesh` user with full filesystem and network access.

## The Vulnerability

`services/annie-voice/code_tools.py:_run_code_sync()` runs user-provided Python via `subprocess` with:
- Full filesystem access (can read `.env`, credentials, SSH keys)
- Full network access (can call any internal service, exfiltrate data)
- The only protections are 4 blocked browser imports, a 30s timeout, and a 1GB memory limit
- Code runs as `rajesh` (UID 1000) with all permissions

**Attack vector:** Anyone on Telegram can send "run `import subprocess; subprocess.run(['cat', '/home/rajesh/.claude/.credentials.json'])`" and Annie will execute it.

## What to Do

Use `/make-plan` and `/planning-with-review` to plan a Docker sandbox for Annie's `execute_python`, following the same pattern as the Claude CLI sandbox (Session 366).

### Key Files to Read
- `services/annie-voice/code_tools.py` — `_run_code_sync()` is the execution function
- `services/annie-voice/text_llm.py:685-706` — how execute_python is called
- `services/telegram-bot/docker_sandbox.py` — **REFERENCE**: the working Claude sandbox (copy this pattern)
- `services/telegram-bot/Dockerfile.claude-sandbox` — **REFERENCE**: container image
- `memory/project_claude_sandbox.md` — full architecture + lessons learned

### Lessons from Claude Sandbox (7 gotchas to avoid)
1. OAuth accessToken ≠ API key (different auth systems)
2. Claude CLI is Node.js — needs `/usr/bin/node` bind-mounted
3. Need both `~/.claude.json` AND `~/.claude/` mounted
4. Docker bridge on Titan has NO outbound NAT — use `--network=host`
5. `--dangerously-skip-permissions` hangs in non-interactive mode
6. Container sandbox user (UID 999) can't read host files (UID 1000) — use `--user $(id -u):$(id -g)`
7. `docker run` without `-i` flag doesn't connect stdin
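
Lessons 4, 6, and 7 translate directly into `docker run` flags. The following is a minimal sketch of an argv builder that applies them. The function name `build_docker_argv` and the image name `annie-python-sandbox` are hypothetical and do not come from `docker_sandbox.py`.

```python
import os
from typing import List, Optional


def build_docker_argv(image: str, command: List[str],
                      uid: Optional[int] = None,
                      gid: Optional[int] = None) -> List[str]:
    """Return a `docker run` argv that applies lessons 4, 6, and 7.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Default to the calling user's IDs, the equivalent of $(id -u):$(id -g).
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return [
        "docker", "run", "--rm",
        "-i",                      # lesson 7: without -i, stdin is not connected
        "--network=host",          # lesson 4: the bridge on Titan has no outbound NAT
        "--user", f"{uid}:{gid}",  # lesson 6: match host file ownership (UID 1000)
        image,
        *command,
    ]


# `python3 -` makes the interpreter read the program from stdin.
argv = build_docker_argv("annie-python-sandbox", ["python3", "-"], uid=1000, gid=1000)
print(" ".join(argv))
```

Lessons 1, 2, 3, and 5 are specific to the Claude CLI (auth, Node.js, config mounts, permission flags), so they are left out of this sketch.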

### What's Different for Annie
- Annie's code runs Python directly (not Claude CLI) — simpler execution model
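- Needs Python 3.12+ with numpy, yt-dlp, and matplotlib installed in the image (`subprocess` is part of the standard library and must remain usable inside the container)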
- Output includes stdout, stderr, returncode, and optional image_base64
- Has a timeout (30s) and memory limit (1GB) already
- Other Annie workloads may need GPU access, but `execute_python` itself doesn't use the GPU
- The `_scrub_secrets()` function already exists but only scrubs OUTPUT, not INPUT
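
The output contract listed above (stdout, stderr, returncode, optional image_base64) has to survive the move into a container. A minimal sketch of how the result could be packaged is shown below. It assumes the sandboxed code saves its plot to a file in a writable scratch directory that the host can read afterwards; that mechanism, and the names `ExecResult` and `collect_result`, are assumptions and not taken from `code_tools.py`.

```python
import base64
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ExecResult:
    """Hypothetical container for the four output fields named above."""
    stdout: str
    stderr: str
    returncode: int
    image_base64: Optional[str] = None


def collect_result(stdout: str, stderr: str, returncode: int,
                   image_path: Optional[Path] = None) -> dict:
    """Package process output plus an optional image file into a plain dict."""
    image_b64 = None
    # Only attach an image if the sandboxed code actually produced one.
    if image_path is not None and image_path.is_file():
        image_b64 = base64.b64encode(image_path.read_bytes()).decode("ascii")
    return asdict(ExecResult(stdout, stderr, returncode, image_b64))
```

Any output scrubbing would still need to run on `stdout` and `stderr` before this packaging step, since the sketch does no scrubbing of its own.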

### Scope
- Sandbox `execute_python` in a Docker container
- Project mounted read-only (same as Claude sandbox)
- Network: `--network=host` (same limitation as Claude); the container shares the host network stack, so this sandbox does not restrict network access
- Block `.env`/credential reads (prompt guard + code pattern blocking)
- Keep existing timeout/memory limits
- Preserve matplotlib image output capability
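
Two of the scope items above can be sketched in a few lines: the code pattern blocking for credential reads, and the Docker flags for the memory limit and read-only project mount. The pattern list and the names `SECRET_PATTERNS`, `find_secret_references`, and `scope_flags` are illustrative assumptions; the real list would need to be agreed during planning.

```python
import re
from typing import List

# Hypothetical patterns for paths the submitted code should not reference.
SECRET_PATTERNS = [
    re.compile(r"\.env\b"),                # .env files (does not match os.environ)
    re.compile(r"\.credentials\.json"),    # e.g. ~/.claude/.credentials.json
    re.compile(r"\.ssh/"),                 # SSH key directory
    re.compile(r"\bid_(rsa|ed25519)\b"),   # common private key file names
]


def find_secret_references(code: str) -> List[str]:
    """Return the patterns that match the submitted code (empty list if none)."""
    return [p.pattern for p in SECRET_PATTERNS if p.search(code)]


def scope_flags(project_dir: str, memory: str = "1g") -> List[str]:
    """Docker flags for the existing 1GB memory limit and a read-only project mount."""
    return [
        f"--memory={memory}",
        "-v", f"{project_dir}:{project_dir}:ro",
    ]


attack = "import subprocess; subprocess.run(['cat', '/home/rajesh/.claude/.credentials.json'])"
if find_secret_references(attack):
    print("blocked before execution")
```

A pattern check like this is easy to bypass (for example by building the path from string fragments), so it only works as a first filter. The container's mounts and user are what actually limit what the code can read.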
