# Next Session: Plan & Implement WhatsApp Automation Robustness

**Priority:** MEDIUM. The current system works but is fragile (7 selectors broke in session 11)
**Parallel session:** Can run independently of Audio Pipeline and Google Maps sessions
**Estimated effort:** Discussion (30 min) + Implementation (2-4 hours depending on approach)

## Background

Session 13 researched 6 WhatsApp automation approaches. Full research: `docs/RESEARCH-WHATSAPP-ROBUST-AUTOMATION.md`.

**Current pain:** Playwright DOM selectors for SENDING break when WhatsApp Web updates. Session 11 required fixing 7 selectors during deployment. Every WhatsApp Web update risks breaking Annie's WhatsApp channel.

**Current architecture (what's fragile):**
```
wa_browser.py  — Playwright singleton, persistent Chromium, anti-detection
wa_web.py      — 18 DOM selectors (primary + fallback), message polling
wa_sender.py   — clipboard paste for text, expect_file_chooser for images ← FRAGILE
agent.py       — poll loop, trigger pipeline, responder
```

**Key insight from research:** Every successful WhatsApp automation project covered in the research avoids DOM selectors by hooking WhatsApp's internal webpack modules. Our approach (raw Playwright selectors) is the most fragile tier.

## Selector Stability Ranking (from community consensus)

1. **Internal webpack modules** (most stable) — whatsapp-web.js, WPPConnect
2. **`data-icon` attributes** — `span[data-icon="send"]`, change infrequently
3. **Structural IDs** — `#main`, `#pane-side`, reasonably stable
4. **`aria-label`** — changes with i18n
5. **`data-testid`** — **STRIPPED** as of 2026-04-07
6. **Class names** — most fragile, obfuscated

Our reading uses tier 2-3 (stable). Our sending uses tier 2-5 (fragile).
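
One way to make the ranking actionable is to tag each selector with its tier and try candidates from most stable to least stable. The following is a minimal sketch, not the contents of `wa_web.py`: the `Candidate` record and helper names are hypothetical, the tier numbers are taken from the list above, and the `aria-label` selector is an illustrative placeholder.

```python
from dataclasses import dataclass

# Tier numbers follow the ranking above (lower means more stable).
TIER_DATA_ICON = 2
TIER_STRUCTURAL_ID = 3
TIER_ARIA_LABEL = 4
TIER_CLASS_NAME = 6


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record pairing a CSS selector with its stability tier."""
    selector: str
    tier: int


def by_stability(candidates):
    """Return candidates ordered most stable first (ties keep input order)."""
    return sorted(candidates, key=lambda c: c.tier)


def worst_tier(candidates):
    """Least stable tier a flow depends on, usable as a rough fragility score."""
    return max(c.tier for c in candidates)


# Illustrative send-button candidates; only the data-icon selector
# appears in the ranking above, the aria-label one is a placeholder.
send_button = [
    Candidate('button[aria-label="Send"]', TIER_ARIA_LABEL),
    Candidate('span[data-icon="send"]', TIER_DATA_ICON),
]
ordered = [c.selector for c in by_stability(send_button)]
```

Scoring each flow by its worst tier gives a quick way to see which parts of the agent are most exposed to the next WhatsApp Web update.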

## Decision: Choose an Approach

### Option A: wa-js Injection (RECOMMENDED)

Inject WPPConnect's `wa-js` library into the existing Playwright page. Call WhatsApp's internal JavaScript APIs instead of clicking DOM elements.

| Aspect | Detail |
|--------|--------|
| **Ban risk** | Same as current (low) — still a real browser |
| **New dependencies** | None (inject via CDN URL or bundled JS) |
| **Migration scope** | Only `wa_sender.py` changes |
| **Selector elimination** | Removes 5 fragile selectors (attach_button, attach_photos, media_preview, media_caption_box, media_send_button) |
| **Reading** | Unchanged (Playwright DOM polling stays) |

**Implementation plan:**
1. **Research wa-js API** — Document key functions:
   - `WPP.chat.sendTextMessage(chatId, text)` — send text
   - `WPP.chat.sendFileMessage(chatId, content, {type: 'image', caption: '...'})` — send image
   - `WPP.chat.find(chatId)` — find chat
   - `WPP.conn.isAuthenticated()` — session check
   - CDN: `https://cdn.jsdelivr.net/npm/@wppconnect/wa-js/dist/wppconnect-wa.js` (verify the path and pin a version before relying on it)

2. **Proof of concept on Panda:**
   ```python
   # In wa_browser.py, after WhatsApp Web loads:
   await page.add_script_tag(url="https://cdn.jsdelivr.net/npm/@wppconnect/wa-js/dist/wppconnect-wa.js")
   await page.wait_for_function("typeof WPP !== 'undefined' && WPP.isReady")

   # In wa_sender.py, send text. Playwright's Python page.evaluate() accepts
   # only one argument, so pass a list and destructure it on the JS side.
   await page.evaluate("""
     async ([chatId, text]) => {
       await WPP.chat.find(chatId);  // look up the chat before sending
       return await WPP.chat.sendTextMessage(chatId, text);
     }
   """, [chat_jid, message_text])
   ```

3. **Migrate `wa_sender.send_message()`** — Replace clipboard paste + send button click with `WPP.chat.sendTextMessage()`

4. **Migrate `wa_sender.send_image()`** — Replace file-chooser flow with:
   ```python
   # page.evaluate() takes a single argument: pack the values into a list.
   await page.evaluate("""
     async ([chatId, base64Data, caption]) => {
       return await WPP.chat.sendFileMessage(chatId,
         `data:image/jpeg;base64,${base64Data}`,
         {type: 'image', caption: caption || ''}
       );
     }
   """, [chat_jid, b64_image, caption])
   ```

5. **Remove fragile selectors** from `wa_web.py` SELECTORS dict:
   - `attach_button`, `attach_photos`, `media_preview`, `media_caption_box`, `media_send_button`
   - Keep: `compose_box`, `send_button` (still needed as fallback?)

6. **Keep reading selectors** — `.message-in`, `data-pre-plain-text`, `#pane-side`, `#main` stay

7. **Update tests** — Mock `page.evaluate` calls in `tests/test_wa_sender.py`

8. **Deploy to Panda + E2E test:**
   - Send text via WhatsApp
   - Send image via WhatsApp
   - Verify poll loop still reads messages
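
Steps 3 and 4 could be wrapped in a small module along these lines. This is a sketch under assumptions, not the final `wa_sender.py`: the function names and signatures are hypothetical, and `page` is expected to be a Playwright `Page` on which wa-js has already been injected. Playwright's Python `page.evaluate()` takes an expression plus at most one argument, so the values are packed into a list and destructured in JavaScript.

```python
import base64

# JavaScript executed inside the WhatsApp Web page. Each function receives
# one list argument and destructures it.
SEND_TEXT_JS = """
async ([chatId, text]) => WPP.chat.sendTextMessage(chatId, text)
"""
SEND_IMAGE_JS = """
async ([chatId, dataUrl, caption]) =>
  WPP.chat.sendFileMessage(chatId, dataUrl, {type: 'image', caption: caption})
"""


def to_data_url(image_bytes, mime="image/jpeg"):
    """Encode raw image bytes as a base64 data URL string."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


async def send_text(page, chat_jid, text):
    """Send a text message through wa-js (hypothetical helper)."""
    return await page.evaluate(SEND_TEXT_JS, [chat_jid, text])


async def send_image(page, chat_jid, image_bytes, caption=""):
    """Send an image through wa-js as a data URL (hypothetical helper)."""
    payload = [chat_jid, to_data_url(image_bytes), caption]
    return await page.evaluate(SEND_IMAGE_JS, payload)
```

Keeping the JavaScript snippets as module-level constants makes them easy to assert against in tests and keeps the only WhatsApp-specific surface in one place.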

### Option B: WAHA NOWEB (Fallback)

Deploy WAHA Docker container on Panda with NOWEB engine (Baileys underneath, WebSocket protocol, no browser for sending).

| Aspect | Detail |
|--------|--------|
| **Ban risk** | Medium (protocol fingerprinting) — higher than current |
| **New dependencies** | Docker container (~200 MB, ARM64 community image) |
| **Migration scope** | `wa_sender.py` → httpx REST calls |
| **Selector elimination** | ALL selectors removed for sending |
| **Reading** | Keep Playwright OR switch to WAHA webhooks |

**Implementation plan:**
1. Deploy: `docker run -d -p 3000:3000 -e WHATSAPP_DEFAULT_ENGINE=NOWEB wecareapps/waha-arm64`
2. Scan QR: `GET http://localhost:3000/api/screenshot`
3. Send text: `POST http://localhost:3000/api/sendText` with `{chatId, text, session}`
4. Send image: `POST http://localhost:3000/api/sendImage` with `{chatId, file (base64), session}`
5. Replace `wa_sender.py` with httpx calls
6. Optionally: switch reading to WAHA webhooks (incoming message events)
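
Steps 3 to 5 might look roughly like the sketch below. Several details are assumptions to check against the WAHA API documentation: the session name `default`, the nested shape of the `file` field, and the helper names, which are hypothetical. The `client` parameter is any object exposing `post(url, json=...)`, for example an `httpx.Client`.

```python
import base64

WAHA_BASE_URL = "http://localhost:3000"  # port from the docker run command above


def send_text_payload(chat_id, text, session="default"):
    """Body for POST /api/sendText."""
    return {"chatId": chat_id, "text": text, "session": session}


def send_image_payload(chat_id, image_bytes, session="default"):
    """Body for POST /api/sendImage. The nested `file` shape is an assumption."""
    return {
        "chatId": chat_id,
        "file": {
            "mimetype": "image/jpeg",
            "filename": "image.jpg",
            "data": base64.b64encode(image_bytes).decode("ascii"),
        },
        "session": session,
    }


def post_json(client, path, payload):
    """POST JSON via any client with post(url, json=...), such as httpx.Client."""
    response = client.post(f"{WAHA_BASE_URL}{path}", json=payload)
    response.raise_for_status()  # surface HTTP errors instead of ignoring them
    return response
```

Separating payload construction from the HTTP call keeps the request bodies unit-testable without a running WAHA container.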

### Option C: ADB Intent Hybrid (Zero-ban fallback)

Use Android ADB intents to share images, plus one uiautomator2 (u2) tap on Send.

| Aspect | Detail |
|--------|--------|
| **Ban risk** | Zero (real app, real UI) |
| **New dependencies** | None (already have u2 on Panda) |
| **Migration scope** | `wa_sender.py` → adb commands |
| **Limitation** | Still needs ONE u2 tap (Send button after intent) |

Only consider this if ban risk is the top concern.
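
For reference, the intent half of this option could be driven by a command built like the one below. This is an unverified sketch: the helper name is hypothetical, the `am start` flags are standard Android ones, but whether WhatsApp accepts the URI depends on the Android version and on the file being readable by the app (a `content://` URI is probably required on recent releases). The u2 tap on Send is not shown.

```python
WHATSAPP_PACKAGE = "com.whatsapp"


def share_image_command(image_uri, mime="image/jpeg"):
    """Build an `adb shell am start` argv that opens WhatsApp's share flow.

    image_uri is assumed to be a URI WhatsApp can read on the device.
    """
    return [
        "adb", "shell", "am", "start",
        "-a", "android.intent.action.SEND",               # share action
        "-t", mime,                                       # MIME type of the payload
        "--eu", "android.intent.extra.STREAM", image_uri,  # URI extra with the image
        "-p", WHATSAPP_PACKAGE,                           # restrict to WhatsApp
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` on a machine with the phone attached.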

## Discussion Questions for Rajesh

1. **Which option?** A (wa-js), B (WAHA), or C (ADB intent)?
2. **Is the reading side stable enough?** `.message-in` + `data-pre-plain-text` haven't broken in 6 sessions. Keep as-is?
3. **Timeline urgency?** WhatsApp Web updates ~every 2-4 weeks. Next update could break sending again.
4. **CDN dependency acceptable?** wa-js loaded from jsdelivr CDN means internet dependency at startup. Alternative: bundle the JS file locally.
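
On question 4, the two loading strategies can coexist. The sketch below prefers a locally bundled copy of wa-js and falls back to the CDN only when the file is missing. The helper name and the bundle location are hypothetical; `page.add_script_tag()` accepts either `path=` or `url=`, so the helper simply returns the keyword arguments to use.

```python
from pathlib import Path


def script_tag_kwargs(bundle_path, cdn_url):
    """Prefer a local wa-js bundle; fall back to the CDN URL if it is missing."""
    path = Path(bundle_path)
    if path.is_file():
        return {"path": str(path)}
    return {"url": cdn_url}


# Usage inside an async function, with `page` a Playwright Page:
#   await page.add_script_tag(**script_tag_kwargs(bundle_path, cdn_url))
```

Bundling also pins the wa-js version, which avoids picking up an untested release at startup.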

## Files to Modify (Option A)

| File | Change |
|------|--------|
| `wa_browser.py` | Inject wa-js script after WhatsApp Web loads |
| `wa_sender.py` | Replace DOM automation with `page.evaluate()` → `WPP.chat.*` calls |
| `wa_web.py` | Remove 5 fragile sending selectors (keep reading selectors) |
| `tests/test_wa_sender.py` | Mock `page.evaluate` instead of DOM clicks |
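
A test for the migrated sender could mock `page.evaluate` roughly as follows. This is a sketch: `send_message` here is a stand-in with a hypothetical signature, since the real function lives in `wa_sender.py`, and the chat ID and return value are made up for the test.

```python
import asyncio
from unittest.mock import AsyncMock


async def send_message(page, chat_jid, text):
    """Stand-in for the migrated wa_sender.send_message (hypothetical signature)."""
    return await page.evaluate(
        "async ([chatId, text]) => WPP.chat.sendTextMessage(chatId, text)",
        [chat_jid, text],
    )


def test_send_message_calls_wajs():
    page = AsyncMock()  # child attributes such as page.evaluate are AsyncMocks too
    page.evaluate.return_value = {"id": "fake-message-id"}

    result = asyncio.run(send_message(page, "12345@g.us", "hello"))

    # Exactly one call into the page, carrying the packed argument list.
    page.evaluate.assert_awaited_once()
    expression, payload = page.evaluate.await_args.args
    assert "sendTextMessage" in expression
    assert payload == ["12345@g.us", "hello"]
    assert result == {"id": "fake-message-id"}
```

Asserting on the packed payload guards against regressing to multiple positional arguments, which `page.evaluate()` does not accept.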

## Verification

```bash
# On Panda
cd ~/workplace/her/her-os/services/whatsapp-agent

# 1. Tests pass
python3 -m pytest tests/ -v

# 2. Send text message
curl -X POST http://localhost:8780/v1/send \
  -H "Content-Type: application/json" \
  -d '{"chat": "Holliday", "text": "Testing wa-js migration"}'

# 3. Send image
curl -X POST http://localhost:8780/v1/send-image \
  -H "Content-Type: application/json" \
  -d '{"chat": "Holliday", "image_path": "/tmp/test.jpg", "caption": "wa-js test"}'

# 4. Poll loop still reads messages
# Send a message from phone to Holliday group → check agent logs
tail -f /tmp/whatsapp-agent.log | grep "new message"
```
