# Research: Annie Phone — Expanded Vision

**Date:** 2026-03-31
**Status:** Ideas explored, not yet planned

---

## 1. Photo Editing (Remove Objects from iPhone Photos)

### Approaches

| Approach | How | Feasibility |
|----------|-----|:-----------:|
| **Pixel Magic Eraser** | Transfer photo to Pixel → Google Photos Magic Eraser via ADB | HIGH |
| **macOS + AI models** | SSH to MacBook → LaMa/SD inpainting in Python | HIGH |
| **iCloud access** | Playwright on iCloud web, or shared iCloud album | MEDIUM |
| **Telegram workflow** | Rajesh sends photo → Annie processes → sends back | HIGH (easiest) |

### Recommended Flow
1. Rajesh sends a photo via Telegram (or Annie pulls it from a shared iCloud album).
2. Annie removes the object, either with the Pixel's Magic Eraser via ADB automation or with an inpainting model on the Titan GPU.
3. Annie sends the edited result back via Telegram.

### iCloud Integration Options
- iCloud web (icloud.com) — automatable via Playwright
- Shared iCloud album — Annie gets notified of new photos
- AirDrop to Mac → Annie picks up from Mac filesystem via SSH

---

## 2. Annie Speaking Kannada (Daily Mom Check-in)

### Kannada TTS Options
- **Google Cloud TTS**: `kn-IN` voice, high quality, API cost
- **AI4Bharat IndicTTS**: Open-source, can run on Titan GPU
- **ElevenLabs**: Multilingual voice cloning (if Annie's voice is cloned, she could speak Kannada in her own voice)

### Kannada STT Options
- **Google Cloud Speech-to-Text**: `kn-IN`, best accuracy for Kannada
- **AI4Bharat IndicASR**: Open-source, self-hosted on Titan
- **Whisper**: Limited Kannada support, not recommended as primary

### Phone Call Mechanism
- Annie makes call via Pixel SIM: `adb shell am start -a android.intent.action.CALL -d tel:+91XXXXXXXXXX`
- Audio routing: Pixel speakerphone → captured by on-device app → streamed to Titan → Annie's pipeline → TTS audio streamed back → played on call
- Alternative: VoIP/SIP service (Twilio India, Exotel) for better audio quality and no ADB dependency
- Annie's scheduler triggers daily call at configured time (e.g., 9 AM)
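
The dialing and scheduling steps above can be sketched as follows. This is a minimal sketch, assuming `adb` is on the PATH and exactly one device is attached. `build_call_command`, `next_call_time`, and the loose E.164-style number check are hypothetical helpers, not existing Annie code, and the 9 AM default simply mirrors the example time above.

```python
import re
from datetime import datetime, time, timedelta

# Loose E.164-style check: "+" followed by 8 to 15 digits, first digit non-zero.
E164 = re.compile(r"^\+[1-9]\d{7,14}$")


def build_call_command(number: str) -> list:
    """Return the adb argv that places a call through the CALL intent."""
    if not E164.match(number):
        raise ValueError(f"not an E.164-style number: {number!r}")
    return [
        "adb", "shell", "am", "start",
        "-a", "android.intent.action.CALL",
        "-d", f"tel:{number}",
    ]


def next_call_time(now: datetime, call_at: time = time(9, 0)) -> datetime:
    """Return the next occurrence of call_at strictly after now.

    Assumes naive datetimes in the local timezone (hypothetical choice).
    """
    candidate = datetime.combine(now.date(), call_at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate
```

Passing the returned list to `subprocess.run(cmd, check=True)` keeps the number out of any local shell string, and validating the number first avoids forwarding unexpected characters to the device shell.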

### Conversation Flow
1. Annie calls mom at scheduled time
2. Greets in Kannada
3. Asks about health, meals, medications, activities
4. Reports summary to Rajesh via Telegram
5. Mom can ask Annie to do things (transfer money, order groceries, etc.)
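
Step 4 (the Telegram report) could look roughly like this. It is a sketch only: `format_summary` and `build_send_message` are hypothetical names, the topic list is taken from step 3, and the token and chat ID are placeholders. The request targets the Telegram Bot API `sendMessage` method and is built here without being sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Topics Annie asks about during the check-in (from step 3).
TOPICS = ("health", "meals", "medications", "activities")


def format_summary(answers: dict) -> str:
    """Turn per-topic notes into a short plain-text report."""
    lines = ["Daily check-in summary:"]
    for topic in TOPICS:
        note = answers.get(topic, "not discussed")
        lines.append(f"- {topic.capitalize()}: {note}")
    return "\n".join(lines)


def build_send_message(token: str, chat_id: str, text: str) -> Request:
    """Build (but do not send) a Bot API sendMessage request."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    body = urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return Request(url, data=body, method="POST")
```

Sending is then a single `urllib.request.urlopen(request)` call. Topics that never came up are reported explicitly as "not discussed" so that a missing answer is visible to Rajesh rather than silently dropped.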

---

## 3. Mom Calling Annie Directly

### How It Works
- Mom calls Annie's phone number (Airtel SIM on Pixel 9a)
- Annie auto-answers, either with `adb shell input keyevent KEYCODE_CALL` or through a custom Android app
- Audio routed to Annie's pipeline on Titan
- Annie speaks Kannada, processes requests

### Identity Verification (5-Gate Security)

```
Mom calls Annie's Pixel
  → Gate 1: Caller ID matches mom's registered number
  → Gate 2: Voice biometric match (ECAPA-TDNN embeddings, same tech as Rajesh's voice ID)
  → Gate 3: Amount under daily limit (e.g., ₹5,000)
  → Gate 4: Transfer ONLY to pre-registered bank accounts (mom's account hardcoded)
  → Execute UPI transfer
  → Gate 5: Telegram notification to Rajesh (post-transfer audit trail)
```
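
The pre-transfer checks (Gates 1 to 4) can be expressed as an ordered chain that stops at the first failure. The sketch below is illustrative only: `CallerProfile`, `TransferRequest`, and `first_failed_gate` are hypothetical names, the threshold and account labels used with it are placeholders, and it assumes the daily limit applies to the cumulative amount sent that day.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class CallerProfile:
    phone: str                        # registered caller ID (Gate 1)
    voice_threshold: float            # minimum cosine similarity (Gate 2)
    daily_limit_inr: int              # cap on the day's total (Gate 3)
    allowed_accounts: FrozenSet[str]  # destination whitelist (Gate 4)


@dataclass(frozen=True)
class TransferRequest:
    caller_id: str
    voice_score: float      # similarity reported by the speaker-ID model
    amount_inr: int
    spent_today_inr: int    # already transferred today
    destination: str


def first_failed_gate(req: TransferRequest, profile: CallerProfile) -> Optional[int]:
    """Return the number of the first gate that rejects, or None if Gates 1-4 pass."""
    if req.caller_id != profile.phone:
        return 1
    if req.voice_score < profile.voice_threshold:
        return 2
    total = req.spent_today_inr + req.amount_inr
    if req.amount_inr <= 0 or total > profile.daily_limit_inr:
        return 3
    if req.destination not in profile.allowed_accounts:
        return 4
    return None
```

Returning the gate number rather than a bare boolean lets the Gate 5 Telegram notification report why a request was refused as well as when one succeeded.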

### Security Analysis

| Attack | Likelihood | Defense |
|--------|:----------:|---------|
| Caller ID spoofing | **High**: easy to do, cheap services exist | Gate 2 (voice biometrics) catches this |
| Voice deepfake | **Medium**: possible with voice samples, but real-time synthesis has latency artifacts | Gate 4 (destination whitelist) limits the damage even if Gate 2 is bypassed |
| Social engineering | **Low**: Annie follows strict rules | Gates 3 and 4 (amount cap + destination lock) |
| Combined spoof+deepfake | **Low**: hard to do in real time | Even if Gates 1 and 2 are both bypassed, Gates 3 and 4 still hold: money goes ONLY to mom's account, up to the daily limit |

### Voice Enrollment for Mom
- Same ECAPA-TDNN system already used for Rajesh
- Enroll mom's voice during first few calls (automatic enrollment with confirmation)
- Cosine similarity threshold — tune separately from Rajesh's (different voice characteristics)
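
The enrollment and matching steps above reduce to averaging a few embeddings and comparing a new one by cosine similarity. The sketch below uses plain Python lists; producing the ECAPA-TDNN embeddings themselves is out of scope, and `mean_embedding`, `cosine_similarity`, and `is_same_speaker` are hypothetical helper names. The threshold is a parameter precisely because it has to be tuned per person.

```python
import math
from typing import List, Sequence


def mean_embedding(samples: Sequence[Sequence[float]]) -> List[float]:
    """Average several enrollment embeddings into one reference vector."""
    if not samples:
        raise ValueError("need at least one enrollment sample")
    dim = len(samples[0])
    if any(len(s) != dim for s in samples):
        raise ValueError("embeddings must share one dimension")
    return [sum(s[i] for s in samples) / len(samples) for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def is_same_speaker(enrolled: Sequence[float], probe: Sequence[float],
                    threshold: float) -> bool:
    """Accept the probe only if it is at least `threshold` similar."""
    return cosine_similarity(enrolled, probe) >= threshold
```

A reasonable way to pick mom's threshold is to record scores from her genuine calls and from other speakers, then choose a cut-off between the two score distributions.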

---

## 4. Family Member Identity System

### Extending Beyond Mom
- Each family member gets: phone number + voice enrollment + allowed actions + spending limits
- Annie maintains a family registry with per-person permissions
- Could extend to: dad, siblings, spouse, children

### Per-Person Permission Matrix

| Person | Can call Annie | Money transfer | Transfer to | Daily limit | Language |
|--------|:-:|:-:|-------------|:-----------:|----------|
| Mom | Yes | Yes | Mom's account only | ₹5,000 | Kannada |
| Dad | Yes | Yes | Dad's account only | ₹5,000 | Kannada |
| Rajesh | Yes | Yes | Any (with approval) | Unlimited | English |
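
One possible in-memory shape for this matrix is a registry keyed by caller ID. This is a sketch under assumptions: `FamilyMember`, `REGISTRY`, `find_member`, and `needs_approval` are hypothetical names, and every phone number and account label is a placeholder. `None` encodes the "Any (with approval)" and "Unlimited" cells.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class FamilyMember:
    name: str
    can_call: bool
    can_transfer: bool
    # None means "any destination, with approval" (Rajesh's row).
    transfer_to: Optional[FrozenSet[str]]
    # None means unlimited.
    daily_limit_inr: Optional[int]
    language: str


# Keyed by caller ID. All numbers and account labels are placeholders.
REGISTRY: Dict[str, FamilyMember] = {
    "+910000000001": FamilyMember("Mom", True, True, frozenset({"mom-account"}), 5000, "Kannada"),
    "+910000000002": FamilyMember("Dad", True, True, frozenset({"dad-account"}), 5000, "Kannada"),
    "+910000000003": FamilyMember("Rajesh", True, True, None, None, "English"),
}


def find_member(caller_id: str) -> Optional[FamilyMember]:
    """Look up the caller; unknown numbers get no permissions at all."""
    return REGISTRY.get(caller_id)


def needs_approval(member: FamilyMember) -> bool:
    """Open-ended destinations require explicit approval per the matrix."""
    return member.can_transfer and member.transfer_to is None
```

Keeping the records frozen means permissions cannot be changed mid-call by accident; adding a sibling or spouse is one more entry in the registry.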

---

## 5. Open Questions

1. **Audio routing for phone calls**: How to bridge Pixel call audio to Titan's pipeline? Custom Android app vs VoIP service?
2. **Kannada LLM**: Which LLM handles Kannada conversation best? Nemotron? Claude? This needs benchmarking.
3. **Mom's comfort level**: Will mom be comfortable talking to an AI? Gradual introduction strategy?
4. **iCloud authentication**: How to maintain persistent iCloud session for photo access?
5. **Magic Eraser automation**: Can Google Photos' eraser be triggered programmatically, or only via UI taps?

---

## Sources
- Annie's existing voice identification: `memory/reference_voice_identification.md` (ECAPA-TDNN embeddings)
- AI4Bharat IndicTTS/ASR: https://ai4bharat.iitm.ac.in/
- Google Cloud TTS Kannada: https://cloud.google.com/text-to-speech/docs/voices
- LaMa inpainting: https://github.com/advimman/lama
