# Research: Annie Browser Agent — "Order Me Coffee"

**Session 355, March 22 2026**
**Status:** Plan ready, not yet implemented

## Origin

Rajesh asked: "Tell Annie — order me coffee." This triggered the question: how does Annie autonomously interact with websites on Rajesh's behalf?

## The Task: Blue Tokai Coffee Reorder

Rajesh has a Blue Tokai Coffee subscription (Shopify-based):
- **Account:** Rajesh Anneswamy, logged in at bluetokaicoffee.com
- **Subscription:** Silver Oak Cafe Blend, 24 × 1kg
- **Active order:** #596215 (Jul 4, 2025, ₹34,944)
- **Workflow:** Subscriptions page → click #596215 → Edit installments → find next empty slot → Update → pick earliest ship date → Save
- **Confirmation:** Annie reports "Silver Oak ordered for <date>" via Telegram

## Architecture Decision: LLM-Driven, Not Scripted

**Rejected approach:** Deterministic Playwright script with hardcoded selectors.
- Breaks when Blue Tokai updates UI
- Only works for one website
- Doesn't scale to "book a cab", "check my Swiggy order", etc.

**Chosen approach:** Give Annie browser tools, let Beast (120B) reason through the UI.
- Same pattern as Claude Code's `mcp__claude-in-chrome__*` tools
- Beast reads accessibility trees, decides what to click
- Should tolerate UI changes, because Beast re-reads the page on every step instead of relying on fixed selectors
- General-purpose: the same tools apply to other websites

## Infrastructure (Already Available on Titan)

| Component | Version / Status | Details |
|-----------|--------|---------|
| Firefox | `/usr/bin/firefox` v148 (snap) | Fallback option |
| Playwright (Node) | v1.58.2 via `npx` | Already installed |
| Chromium (Playwright) | Cached v1200, v1208 | In `~/.cache/ms-playwright/` |
| Playwright (Python) | **NOT installed** | Needs `pip install playwright` in Annie's venv |
| Beast (120B) | Running on DGX Spark | Full thinking, 131K context |
| `execute_python` tool | Available in text chat | Can run arbitrary Python on Titan |

## Implementation Paths

### Path A: Quick Test via execute_python (30 min)

Annie already has `execute_python` on Beast, so the only missing piece is Playwright for Python. The test runs as follows:

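1. `pip install playwright` in Annie's venv on Titan. The cached Chromium builds may make `playwright install` unnecessary, but only if the installed Python package expects one of those builds; otherwise run `playwright install chromium`.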
2. Send Annie: "Order me coffee from Blue Tokai"
3. Beast reasons → writes Playwright Python code → runs via `execute_python`
4. Gets page content back → reasons → writes next Playwright step → runs again
5. Each iteration: **think → write code → execute → read result → think again**
6. When she finds the date, asks Rajesh via Telegram for confirmation
7. On "Yes" → runs final Save step

**Pros:** Zero new tool code. Uses existing execute_python infrastructure.
**Cons:** Each step requires Beast to write Python code (slower). There is no persistent browser session between tool calls: each `execute_python` call launches a new browser unless state is managed explicitly, for example through an on-disk browser profile.
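
The think → write code → execute → read result cycle above can be sketched as a small driver loop. This is a minimal sketch under stated assumptions, not Annie's actual implementation: `propose_code` stands in for Beast producing the next Playwright snippet, `execute_python` is the existing tool passed in as a callable, and the `DONE` sentinel and `max_steps` cap are hypothetical.

```python
from typing import Callable, List, Tuple

DONE = "DONE"  # hypothetical sentinel: the model signals it has nothing left to run


def run_path_a(
    goal: str,
    propose_code: Callable[[str, List[Tuple[str, str]]], str],
    execute_python: Callable[[str], str],
    max_steps: int = 20,
) -> List[Tuple[str, str]]:
    """Alternate between asking the model for code and running it.

    Returns the (code, result) history so the caller can inspect each step.
    """
    history: List[Tuple[str, str]] = []
    for _ in range(max_steps):
        code = propose_code(goal, history)   # think + write code
        if code.strip() == DONE:
            break
        result = execute_python(code)        # execute
        history.append((code, result))       # result feeds the next iteration
    return history
```

The step cap is there so a confused model cannot loop forever; the approval pause from step 6 would sit inside `propose_code` or as a separate check before the final snippet runs.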

### Path B: Proper Browser Tools (2-3 hours)

Build 5 real tools backed by a persistent Playwright session:

| Tool | Function | What Beast sees |
|------|----------|-----------------|
| `browser_navigate` | Opens a URL | "Page loaded: Blue Tokai Account" |
| `browser_read` | Returns accessibility tree | Interactive elements with ref_ids |
| `browser_click` | Clicks an element by ref_id | "Clicked: Edit installments" |
| `browser_fill` | Types into a field | "Filled: ship date = March 28" |
| `browser_screenshot` | Returns screenshot | Visual verification |
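
To illustrate what `browser_read` might hand to Beast, the sketch below walks a nested accessibility-style tree and assigns a `ref_id` to each interactive element. The tree shape (`role`, `name`, `children`) and the set of interactive roles are assumptions for illustration; the real snapshot format depends on how the tool obtains the tree.

```python
from typing import Dict, List, Tuple

# Roles treated as interactive; an assumed subset, not an exhaustive list.
INTERACTIVE_ROLES = {"button", "link", "textbox", "checkbox", "combobox"}


def assign_ref_ids(tree: dict) -> Tuple[List[str], Dict[str, dict]]:
    """Walk a nested {role, name, children} tree depth-first.

    Returns the lines shown to the model and a ref_id -> node lookup that
    browser_click / browser_fill could use to resolve the element later.
    """
    lines: List[str] = []
    refs: Dict[str, dict] = {}

    def visit(node: dict) -> None:
        if node.get("role") in INTERACTIVE_ROLES:
            ref_id = f"ref_{len(refs) + 1}"
            refs[ref_id] = node
            lines.append(f'[{ref_id}] {node["role"]} "{node.get("name", "")}"')
        for child in node.get("children", []):
            visit(child)

    visit(tree)
    return lines, refs
```

Dropping non-interactive nodes keeps the text Beast has to read short, at the cost of hiding headings and static text that can help with orientation; a real tool may want to keep some of those.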

**Persistent session:** Chromium profile at `~/.her-os/annie/browser/` — cookies survive restarts.

**Pros:** Clean tool interface. Persistent browser session. Beast reasons naturally ("I see a button labeled 'Update', let me click ref_42"). Reusable for any website.
**Cons:** More code to write. Need to manage browser lifecycle (startup, cleanup, crashes).
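
The lifecycle concern above (startup and cleanup) could be handled with a lazily started session object. The sketch below uses Playwright's `launch_persistent_context` with the profile path named earlier; the class name `BrowserSession` and its methods are hypothetical, and crash recovery is deliberately left out.

```python
from pathlib import Path

PROFILE_DIR = Path.home() / ".her-os" / "annie" / "browser"


class BrowserSession:
    """Lazily started Chromium session backed by an on-disk profile."""

    def __init__(self, profile_dir: Path = PROFILE_DIR, headless: bool = True):
        self.profile_dir = profile_dir
        self.headless = headless
        self._playwright = None
        self._context = None

    def page(self):
        """Return the active page, launching the browser on first use."""
        if self._context is None:
            # Imported here so the module loads even where Playwright is missing.
            from playwright.sync_api import sync_playwright

            self.profile_dir.mkdir(parents=True, exist_ok=True)
            self._playwright = sync_playwright().start()
            self._context = self._playwright.chromium.launch_persistent_context(
                user_data_dir=str(self.profile_dir), headless=self.headless
            )
        pages = self._context.pages
        return pages[0] if pages else self._context.new_page()

    def close(self) -> None:
        """Shut down the browser; safe to call when nothing was started."""
        if self._context is not None:
            self._context.close()
            self._context = None
        if self._playwright is not None:
            self._playwright.stop()
            self._playwright = None
```

Each of the five tools would call `session.page()` and operate on the returned page, so the first tool call pays the startup cost and later calls reuse the same browser and cookies.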

### Recommendation

**Path A first** as a live capability test — verify Beast can reason through Blue Tokai's UI. If it works, **Path B** becomes a polish step (faster, cleaner, persistent sessions).

## Login Flow (One-Time)

Blue Tokai uses Shopify auth. First-time login requires human interaction:

1. Annie opens login page on Titan's Chromium
2. Reads form → sends Telegram: "What's your email for Blue Tokai?"
3. Rajesh replies → Annie enters it
4. If OTP: "They sent a code to your phone/email. What is it?"
5. Rajesh replies → Annie enters it
6. Session established → **cookies persist in browser profile**
7. Next time: no login needed, goes straight to subscription page

The Telegram back-and-forth for OTP uses the existing bidirectional Telegram channel (same pattern as pulse inline buttons, but with free-text replies).
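
Before starting the Telegram exchange, Annie has to notice that she is looking at a login wall at all. One possible heuristic is sketched below; the URL path hints and label hints are assumptions that would need checking against the real Blue Tokai pages, and the function name is hypothetical.

```python
from typing import Iterable
from urllib.parse import urlparse

# Assumed markers; adjust after inspecting the real pages.
LOGIN_PATH_HINTS = ("/account/login", "/login")
LOGIN_LABEL_HINTS = ("log in", "sign in", "one-time password", "otp")


def looks_like_login_page(url: str, element_labels: Iterable[str]) -> bool:
    """Heuristic: True when the URL path or visible labels suggest a login wall."""
    path = urlparse(url).path.lower()
    if any(path.startswith(hint) or path.endswith(hint) for hint in LOGIN_PATH_HINTS):
        return True
    labels = [label.lower() for label in element_labels]
    return any(hint in label for label in labels for hint in LOGIN_LABEL_HINTS)
```

In practice Beast can make this judgment itself from the page contents; a cheap check like this is mainly useful as a guard that runs before any order step.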

## Approval Gate (Mandatory)

Financial transactions are **T2 — requires explicit confirmation**:

```
Annie → Telegram: "I can order Silver Oak Cafe Blend 24×1kg,
                    ship date March 28 (₹34,944). Should I confirm?"
                    [Confirm Order] [Not now]

Rajesh → taps "Confirm Order"

Annie → clicks Save → Telegram: "Done. Silver Oak ordered for March 28."
```

Annie can navigate, read prices, and fill forms, but the final committing click (the "Save" button in this flow) always requires Rajesh's explicit confirmation.
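
The approval message with its two buttons maps onto a Telegram Bot API `sendMessage` payload with an inline keyboard. The sketch below builds that payload and interprets the reply; the `callback_data` strings and function names are hypothetical, and how `pending_handler.py` actually sends and receives messages is not shown here.

```python
from typing import Any, Dict

CONFIRM = "order:confirm"   # hypothetical callback_data values
DECLINE = "order:decline"


def build_approval_request(chat_id: int, item: str, ship_date: str, price: str) -> Dict[str, Any]:
    """Build a Telegram Bot API sendMessage payload with two inline buttons."""
    return {
        "chat_id": chat_id,
        "text": f"I can order {item}, ship date {ship_date} ({price}). Should I confirm?",
        "reply_markup": {
            "inline_keyboard": [[
                {"text": "Confirm Order", "callback_data": CONFIRM},
                {"text": "Not now", "callback_data": DECLINE},
            ]]
        },
    }


def is_approved(callback_data: str) -> bool:
    """Only an exact confirm callback counts; anything else is a refusal."""
    return callback_data == CONFIRM
```

Treating every reply other than the exact confirm callback as a refusal keeps the gate fail-closed, which suits a T2 action.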

## Full Flow: "Order Me Coffee"

```
Rajesh (voice): "Order me coffee"
    ↓
Annie (Nano, voice): "On it — let me check your Blue Tokai subscription."
    ↓
Spawns Beast agent (proactive lane, browser tools)
    ↓
Beast:
  1. browser_navigate("https://bluetokaicoffee.com/pages/order-subscription")
  2. browser_read() → finds subscription #596215
  3. browser_click(<ref_id of "#596215" link>)
  4. browser_read() → finds "Edit installments" button
  5. browser_click(<ref_id of "Edit installments">)
  6. browser_read() → finds next installment without a date
  7. browser_click(<ref_id of "Update" on that row>)
  8. browser_read() → sees date picker, finds earliest date = March 28
  9. STOP → Telegram approval:
     "Silver Oak Cafe Blend 24×1kg, ship date March 28. Confirm?"
  10. Waits for Rajesh reply...
  11. On "Confirm" → browser_fill(<ref_id of date field>, "March 28") → browser_click(<ref_id of "Save">)
  12. Telegram: "Done. Silver Oak ordered for March 28."
```
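
The STOP in the flow above relies on Beast choosing to pause. A complementary safeguard is to enforce the pause in the tool layer, so a committing click cannot go through without approval even if the model skips the step. The sketch below shows one way to do that; the label list, the exception, and the `guarded_click` wrapper are all hypothetical.

```python
from typing import Callable

# Labels that commit a purchase; an assumed list for illustration.
COMMIT_LABELS = {"save", "submit order", "place order", "pay"}


class ApprovalRequired(Exception):
    """Raised when a committing click is attempted without approval."""


def guarded_click(label: str, click: Callable[[str], str], approved: bool) -> str:
    """Forward the click unless it would commit an order without approval."""
    if label.strip().lower() in COMMIT_LABELS and not approved:
        raise ApprovalRequired(f"'{label}' needs Rajesh's confirmation first")
    return click(label)
```

Matching on button labels is crude and site-specific; it is meant as a backstop behind the Telegram approval step, not a replacement for it.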

## Broader Vision: General-Purpose Browser Agent

Once the 5 browser tools exist, the same approach should extend to other website tasks:

| Command | What Annie does |
|---------|----------------|
| "Order me coffee" | Blue Tokai subscription reorder |
| "Check my Amazon delivery" | Navigate to order tracking |
| "Renew my domain" | GoDaddy/Namecheap renewal flow |
| "Cancel that Zomato order" | Navigate to active orders → cancel |
| "Book a cab to the airport" | Ola/Uber booking (with approval) |
| "What's my electricity bill?" | BESCOM portal check |

This is **Dimension 2 (Voice OS)** from VISION.md — Annie as the operating system for Rajesh's digital life.

## Files to Create/Modify

| File | Action | What |
|------|--------|------|
| `services/annie-voice/browser_tools.py` | **CREATE** | 5 Playwright tools: navigate, read, click, fill, screenshot |
| `services/annie-voice/text_llm.py` | **MODIFY** | Register browser tools in CLAUDE_TOOLS |
| `services/annie-voice/bot.py` | **MODIFY** | Register browser tools for voice (if voice-triggered) |
| `services/telegram-bot/pending_handler.py` | **MODIFY** | Approval dialog for financial actions |
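
As a rough idea of what registering the tools might involve, the sketch below declares the five tools in a `name` / `description` / `input_schema` shape. That shape is an assumption based on the `CLAUDE_TOOLS` name; the actual structure in `text_llm.py` may differ, and the parameter names (`url`, `ref_id`, `text`) are hypothetical.

```python
from typing import Any, Dict, List


def _tool(name: str, description: str, properties: Dict[str, Any], required: List[str]) -> Dict[str, Any]:
    """Build one tool definition in the name/description/input_schema shape."""
    return {
        "name": name,
        "description": description,
        "input_schema": {"type": "object", "properties": properties, "required": required},
    }


BROWSER_TOOLS: List[Dict[str, Any]] = [
    _tool("browser_navigate", "Open a URL.", {"url": {"type": "string"}}, ["url"]),
    _tool("browser_read", "Return interactive elements with ref_ids.", {}, []),
    _tool("browser_click", "Click an element by ref_id.", {"ref_id": {"type": "string"}}, ["ref_id"]),
    _tool(
        "browser_fill",
        "Type text into a field identified by ref_id.",
        {"ref_id": {"type": "string"}, "text": {"type": "string"}},
        ["ref_id", "text"],
    ),
    _tool("browser_screenshot", "Return a screenshot of the current page.", {}, []),
]
```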

## Key Risks

1. **Playwright on aarch64** — Python playwright may have ARM64 issues. Node version works. Fallback: use subprocess to call Node Playwright.
2. **execute_python sandbox limits** — Resource limits in code_tools.py may block browser launch. May need exemption for browser tools.
3. **Shopify session expiry** — Sessions typically last weeks but can expire. Annie needs to detect "login required" and initiate the OTP flow.
4. **Blue Tokai UI complexity** — Shopify themes vary. The accessibility tree may be noisy. Beast needs enough context window to parse it (131K should be plenty).
5. **Beast availability** — If Beast is down, coffee ordering fails. Nano can't do this task. Need graceful fallback: "I can't reach my browser tools right now. I'll try again in 15 minutes."
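
The graceful fallback in risk 5 can be sketched as a small wrapper. The three callables (`beast_is_up`, `start_order`, `schedule_retry`) are hypothetical hooks standing in for whatever health check and scheduler Annie already has.

```python
from typing import Callable

RETRY_DELAY_MINUTES = 15  # from the fallback message in risk 5


def order_or_defer(
    beast_is_up: Callable[[], bool],
    start_order: Callable[[], str],
    schedule_retry: Callable[[int], None],
) -> str:
    """Run the order if Beast responds; otherwise queue a retry and say so."""
    if beast_is_up():
        return start_order()
    schedule_retry(RETRY_DELAY_MINUTES)
    return (
        "I can't reach my browser tools right now. "
        f"I'll try again in {RETRY_DELAY_MINUTES} minutes."
    )
```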

## Next Steps

1. `pip install playwright` in Annie's venv on Titan
2. Test Path A: send Annie "order me coffee" via text chat, watch Beast reason through it
3. If Beast succeeds → build Path B (proper tools) for production use
4. Add to Annie's memory: "Rajesh orders Silver Oak Cafe Blend from Blue Tokai, subscription #596215"
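
Before step 2, a quick preflight run through `execute_python` could confirm whether the Python package is importable and whether the Node fallback from the risks section is reachable. This is a minimal sketch; the function names and the backend labels are hypothetical, and it does not verify that the cached Chromium build matches the installed Playwright version.

```python
import importlib.util
import shutil
from typing import Dict


def preflight() -> Dict[str, bool]:
    """Report which Playwright entry points are visible from this interpreter."""
    return {
        # True once `pip install playwright` has run in the active venv.
        "python_playwright": importlib.util.find_spec("playwright") is not None,
        # True when npx is on PATH, i.e. the Node fallback is callable.
        "npx": shutil.which("npx") is not None,
    }


def choose_backend(report: Dict[str, bool]) -> str:
    """Prefer the Python package; fall back to Node via subprocess; else none."""
    if report["python_playwright"]:
        return "python"
    if report["npx"]:
        return "node-subprocess"
    return "unavailable"
```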
