A Dr. Nova Brooks Guided Tour

Anatomy of an AI Agent

What the fastest-growing open-source project in history teaches us about building a personal AI — and every architectural decision behind her-os.

Based on OpenClaw analysis + Phase 1 research • 12 chapters • ~30 min read

Chapter 01

The Open-Source Revolution

Before we build anything, I want you to understand the moment we’re living in. Something happened in late 2025 that changed what’s possible for anyone building a personal AI. Let me tell you the story.

In November 2025, an Austrian engineer named Peter Steinberger — the founder of PSPDFKit — published a small project called “Clawdbot”: a personal AI agent that could read your files, control your browser, run shell commands, and answer on WhatsApp. All self-hosted on your own machine.

Within weeks, it went viral. Anthropic sent a trademark complaint (the name was a pun on Claude). It got renamed — first to “Moltbot,” then to OpenClaw. And then it exploded.

OpenClaw Growth — GitHub Stars Over Time

  - Early Feb 2026: 9K → 60K stars
  - Feb 2, 2026: 140K stars
  - Feb 21, 2026: 214K stars

214,000+ stars. 40,000+ forks. 600+ contributors. 12,800+ commits. OpenClaw became the fastest-growing open-source project in history, surpassing Linux and React in velocity. On February 14, 2026, Steinberger joined OpenAI, and the project moved to an open-source foundation.

Think of OpenClaw as the Linux moment for personal AI. Before Linux, operating systems were proprietary, expensive, and controlled by corporations. Linux proved that a community could build something just as powerful — and eventually more powerful — in the open. OpenClaw is doing the same thing for AI agents. It proved that you don’t need OpenAI or Google to have a powerful, always-on AI assistant. You can run one yourself, on your own hardware, with your own data.

But here’s the thing nobody tells you: OpenClaw’s most important contribution isn’t the code. It’s the architecture patterns it discovered and validated at scale. How to manage memory. How to handle concurrent sessions. How to search through an agent’s knowledge. How to not get hacked.

That’s what this guide is really about. We’re going to dissect OpenClaw’s architecture, extract the lessons, and then apply them to her-os — a personal ambient intelligence system. Think of this as reading the blueprints of a skyscraper before designing your dream house. You don’t need a skyscraper, but you need to understand how they handle foundations, plumbing, and fire safety.

Why study an “everything agent” like OpenClaw when her-os is a focused personal AI? Aren’t they solving different problems?

Dr. Nova’s take:

They’re solving the same core problems at different scales. Both need memory systems. Both need to process conversations serially. Both need to search through accumulated knowledge. Both need security. OpenClaw solved these problems at massive scale with 214K users stress-testing every edge case. We get to learn from their mistakes for free.

The key insight from OpenClaw’s architecture docs: “The hard problem is not the agent loop itself, but everything around it.” Memory, routing, serialization, persistence, proactivity — those are the hard problems. And they’re identical whether you’re building a general agent or a personal memory system.

Chapter 02

Four Layers of an Agent

Every AI agent — whether it’s OpenClaw, her-os, or anything else — has the same fundamental architecture. Understanding these four layers is like understanding that every building has a foundation, walls, plumbing, and a roof. Let me show you OpenClaw’s version.

Imagine a restaurant. There’s the host stand (who greets everyone, manages reservations, routes to tables), the kitchen (where food is actually prepared), the waitstaff (who translate between customers and kitchen), and the chef’s expertise (recipes, techniques, flavor intuition). An AI agent has exactly the same four layers.

OpenClaw’s Four-Layer Architecture

  1. Gateway (Control Plane) [the Hub]: Single Node.js process on port 18789. WebSocket protocol. Session lifecycle, channel health, config hot-reload. The host stand.
  2. Execution (Agent Runtime) [the Core]: Per-session serial queues. Tool assembly with policy resolution. Context compaction at 80% window usage. The kitchen.
  3. Integration (Channel Adapters) [the I/O]: 50+ messaging platforms normalized to MessageEnvelope. Stateless, fault-isolated per channel. The waitstaff.
  4. Intelligence (Agent Behavior) [the Brain]: Skills (markdown-defined capabilities). Memory (hybrid vector + BM25 search). Heartbeat (proactive daemon). The chef’s expertise.

The Gateway: Why a single process?

OpenClaw runs everything through one Node.js process. This might seem limiting, but it’s a deliberate choice. A single process means no distributed-system complexity — no message queues, no consensus protocols, no split-brain problems. Everything routes through one hub that manages sessions, tools, events, and channels.

The Gateway connects to: the Pi agent runtime (where LLM calls happen), a CLI interface, WebChat UI, a macOS companion app, and mobile device nodes for camera, screen, and location.

OpenClaw treats the system prompt as “compiled output, not configuration.” At runtime, it dynamically assembles the prompt from workspace files — IDENTITY.md, SKILLS.md, MEMORY.md, BOOTSTRAP.md. This means the agent’s personality and capabilities are built fresh every session from human-readable source files.
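OpenClaw’s exact assembly logic isn’t published in this guide, but the “compiled output, not configuration” idea can be sketched in a few lines of Python; the workspace path, section order, and heading format here are assumptions, not OpenClaw’s actual code:

```python
from pathlib import Path

# Illustrative sketch: rebuild the system prompt from workspace files
# at session start, so personality and capabilities stay human-editable.
WORKSPACE = Path("workspace")
SECTIONS = ["IDENTITY.md", "SKILLS.md", "MEMORY.md", "BOOTSTRAP.md"]

def build_system_prompt(workspace: Path = WORKSPACE) -> str:
    """Concatenate whichever workspace files exist; missing files are skipped."""
    parts = []
    for name in SECTIONS:
        f = workspace / name
        if f.exists():
            parts.append(f"## {name}\n{f.read_text().strip()}")
    return "\n\n".join(parts)
```

Because the prompt is rebuilt every session, editing IDENTITY.md in a text editor changes the agent’s behavior on the very next message, with no redeploy.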

The Data Flow

Here’s what happens when a message arrives from any channel:

  1. Normalize to a standard MessageEnvelope format
  2. Route through the Gateway, apply access control
  3. Resolve the session key (we’ll dig into this next chapter)
  4. Load transcript history from JSONL files
  5. Build the system prompt from workspace context
  6. Call the LLM through the Pi Agent Core
  7. Execute tools (policy-filtered, optionally sandboxed)
  8. Save the transcript (JSONL append)
  9. Format and deliver the reply
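Steps 1 and 3 of that flow can be sketched as a minimal envelope type whose session key is derived rather than looked up; the field names and the `agent:main` prefix are assumptions based on the key format discussed in the next chapter:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical minimal MessageEnvelope: every channel adapter normalizes
# its native message into this shape before it reaches the Gateway.
@dataclass
class MessageEnvelope:
    channel: str      # e.g. "whatsapp", "omi-webhook"
    chat_type: str    # "dm" or "group"
    sender: str       # phone number, channel id, device id...
    text: str
    received_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    @property
    def session_key(self) -> str:
        # Step 3: the routing key is encoded, not stored in a lookup table.
        return f"agent:main:{self.channel}:{self.chat_type}:{self.sender}"
```

For example, a WhatsApp DM envelope yields `agent:main:whatsapp:dm:+15555550123` as its session key, with no database round-trip.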

OpenClaw stores everything as human-readable files — JSONL for transcripts, JSON5 for config, SQLite for memory. Why not use a “real” database like PostgreSQL?

Dr. Nova’s take:

Because OpenClaw is self-hosted by individuals, not deployed in data centers. Files mean you can git diff your agent’s memory, edit facts in VS Code, grep through session logs. No database admin required. No migration scripts. No backup rituals. Just files on your disk.

This is a profound architectural choice: human-readable storage is a feature, not a limitation. When your agent “remembers” something wrong, you can open the file and fix it. When you want to audit what it knows, you just read the files. her-os adopts this same philosophy — structured files as source of truth, with the knowledge graph as a secondary query layer.

Now, her-os doesn’t need 50 messaging channels or a macOS companion app. But the pattern is identical: a central gateway (our FastAPI backend) that receives inputs (Omi webhooks), routes to processing (the Context Engine), and serves a UI (web dashboard). Same blueprint, different house.

Chapter 03

The Lane Queue: One Thing at a Time

This might be the most important pattern in this entire guide. It’s elegant, it’s simple, and it prevents an entire class of bugs. Pay attention.

Picture a drive-through lane. Cars line up single-file. The first car orders, gets their food, and drives away. Then the next car. You never have two cars at the window at the same time — that would cause confusion, wrong orders, and collisions. Now imagine an AI agent getting messages from the same person on WhatsApp, Telegram, and Slack simultaneously. Without a lane, those messages collide.

The Problem: Concurrent State Corruption

Imagine you send two messages to an AI agent one second apart. Both messages reference the same conversation. Without protection, both requests start processing simultaneously — both read the same transcript history, both generate responses, both try to append to the same file. You get duplicate responses, lost messages, or corrupted state.

The Solution: Serial Queue Per Session

Each session gets its own serial queue. Tasks execute one at a time within a session. No parallelism within a session by default. This is the Lane Queue pattern.

OpenClaw encodes routing information directly into the session key: agent:{agentId}:{channel}:{chatType}:{identifier}[:{threadId}]. A WhatsApp DM becomes agent:main:whatsapp:dm:+15555550123. A Discord group becomes agent:main:discord:group:channel-id. No lookup tables needed — the key is the routing.

For her-os, the same idea applies with a simpler key:

Session Key Comparison

| System | Key Format | Example |
| --- | --- | --- |
| OpenClaw | agent:{id}:{channel}:{type}:{peer} | agent:main:whatsapp:dm:+1555 |
| her-os | session:{device}:{date}:{index} | session:omi-1:2026-02-21:003 |

If everything runs serially, doesn’t that make the system slow? What about background tasks that need to run in parallel?

Dr. Nova’s take:

Serial within a session. Across sessions, everything runs in parallel. If you’re having a conversation with Arun and simultaneously receiving a message from Priya, those are different sessions with different serial queues. Both process simultaneously.

For background tasks, OpenClaw offers opt-in parallelism via additional lanes: cron and subagent. These are separate queues that can run alongside the main conversation lane. The key principle: default to serial, opt into parallel. It’s much easier to add parallelism when you need it than to debug race conditions after the fact.

her-os adopts this pattern directly (ADR-005). When multiple transcript segments arrive rapidly for the same conversation, they’re processed serially. This prevents the Context Engine from extracting contradictory entities from overlapping chunks of the same conversation.
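A minimal asyncio sketch of the Lane Queue pattern: one queue per session key, one worker draining each queue serially, while different sessions run in parallel. Class and method names are illustrative, not OpenClaw’s actual API:

```python
import asyncio
from collections import defaultdict

class LaneQueues:
    """Serial execution within a session; parallelism across sessions."""

    def __init__(self):
        self._lanes: dict[str, asyncio.Queue] = defaultdict(asyncio.Queue)
        self._workers: dict[str, asyncio.Task] = {}

    async def submit(self, session_key: str, job):
        """Enqueue a no-arg coroutine factory for its session's lane."""
        await self._lanes[session_key].put(job)
        if session_key not in self._workers:
            # First message for this session: start its dedicated worker.
            self._workers[session_key] = asyncio.create_task(self._drain(session_key))

    async def _drain(self, session_key: str):
        lane = self._lanes[session_key]
        while True:
            job = await lane.get()
            await job()          # exactly one job at a time within this lane
            lane.task_done()
```

Opt-in parallelism falls out naturally: a background cron or subagent task simply submits to a different lane key (e.g. `cron:daily-reflection`) instead of the conversation’s key.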

Chapter 04

How an AI Remembers

Memory is the hardest problem in building a personal AI. Not the LLM call — that’s the easy part. The hard part is: what does the agent remember? How does it search? How does it forget? OpenClaw’s answer is elegant, and we’re stealing most of it.

Philosophy: Files Are Truth

OpenClaw does not use a vector database or knowledge graph as its primary store. Markdown files on disk are the authoritative source. The vector index is a secondary search layer over those files.

Think of a paper filing cabinet versus a card catalog. The filing cabinet holds the actual documents — contracts, letters, receipts. The card catalog tells you which drawer to look in. If the card catalog burns down, you rebuild it from the filing cabinet. If the filing cabinet burns down, the card catalog is useless. OpenClaw’s Markdown files are the filing cabinet. The vector index is the card catalog.

| Layer | Files | Behavior |
| --- | --- | --- |
| Daily logs | memory/YYYY-MM-DD.md | Append-only daily notes. Today’s + yesterday’s loaded at session start |
| Long-term | MEMORY.md | Curated persistent facts/preferences. Loaded in private sessions only |

Hybrid Search: Vector 70% + BM25 30%

Neither pure semantic search nor pure keyword search is good enough alone. Consider two queries: an exact identifier like “JIRA-1234,” which semantic search blurs away, and a paraphrase like “the machine running the gateway,” which keyword search can’t connect to “Mac Studio.”

The combination handles both. And the weights are tuned: 70% vector, 30% BM25.

Memory Scoring Pipeline

  1. Retrieve Candidates: Pull candidate pools from both the vector index and the BM25 keyword index. Cast a wide net.
  2. Score & Merge: Convert BM25 rank to a 0–1 score: textScore = 1 / (1 + rank). Weighted final score: 0.7 × vector + 0.3 × text.
  3. MMR Diversity: score = λ × relevance − (1 − λ) × max_similarity, with λ = 0.7. Prevents redundant results.
  4. Temporal Decay: final = score × e^(−λ × age). Half-life: 30 days. Evergreen files (MEMORY.md) never decay.
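The merge and decay steps (the MMR diversity pass is omitted here for brevity) can be sketched directly from the formulas above; the input shapes are assumptions, with BM25 ranks starting at 0 for the best hit:

```python
import math

VECTOR_WEIGHT, TEXT_WEIGHT = 0.7, 0.3
HALF_LIFE_DAYS = 30
DECAY_LAMBDA = math.log(2) / HALF_LIFE_DAYS  # score halves every 30 days

def hybrid_scores(vector_hits, bm25_ranked, age_days, evergreen=frozenset()):
    """vector_hits: doc -> cosine similarity (0-1).
    bm25_ranked: doc ids ordered best-first. age_days: doc -> age in days."""
    text = {doc: 1.0 / (1.0 + rank) for rank, doc in enumerate(bm25_ranked)}
    scores = {}
    for doc in set(vector_hits) | set(text):
        s = VECTOR_WEIGHT * vector_hits.get(doc, 0.0) + TEXT_WEIGHT * text.get(doc, 0.0)
        if doc not in evergreen:  # evergreen files (e.g. MEMORY.md) never decay
            s *= math.exp(-DECAY_LAMBDA * age_days.get(doc, 0.0))
        scores[doc] = s
    return dict(sorted(scores.items(), key=lambda kv: -kv[1]))
```

A document that tops both indexes today scores 1.0; the same document scores 0.5 after one 30-day half-life, unless it is marked evergreen.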

Why not just put everything in a vector database and do pure semantic search? That’s what most RAG systems do.

Dr. Nova’s take:

Because vector search is lossy. When you embed text into a vector, you lose exact tokens. Names, IDs, code symbols, URLs — all get smeared into a high-dimensional cloud. If you search for “JIRA-1234,” pure vector search might return tickets about similar topics instead of the exact ticket.

BM25 keyword search has the opposite problem: it’s too literal. “Mac Studio” won’t match “the machine running the gateway,” even though they’re the same thing.

The 70/30 hybrid gives you the best of both worlds. And research from Mem0 confirms this: hybrid approaches achieve 26% higher response quality with 90% fewer tokens compared to pure RAG.

The Compaction Trick

Before context window compaction (at 80% usage), OpenClaw triggers a silent agentic turn that reminds the model to write durable memories to disk. The model saves important facts to MEMORY.md before the context shrinks. No information loss during long sessions.
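A sketch of the trigger logic, with `count_tokens` and `run_silent_turn` as hypothetical stand-ins for the real tokenizer and agent-turn machinery:

```python
# At 80% context usage, inject a silent turn asking the model to
# persist durable facts to disk before compaction discards detail.
COMPACTION_THRESHOLD = 0.80
FLUSH_PROMPT = (
    "Context is about to be compacted. Write any durable facts, "
    "preferences, or open commitments from this session to MEMORY.md now."
)

def maybe_flush_before_compaction(messages, window_size, count_tokens, run_silent_turn):
    used = sum(count_tokens(m) for m in messages)
    if used / window_size >= COMPACTION_THRESHOLD:
        run_silent_turn(FLUSH_PROMPT)  # model saves memories, user sees nothing
        return True
    return False
```

The user never sees this turn; they only notice that long sessions stop “forgetting” things said hours earlier.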

her-os adopts every single one of these patterns. Hybrid search (ADR-003). Temporal decay with evergreen exceptions (ADR-007). Human-readable files as source of truth (ADR-006). Pre-emptive memory flush before compaction. These aren’t just good ideas — they’re battle-tested by 214,000 users.

Chapter 05

The Security Disasters

Now for the part that keeps me up at night. OpenClaw grew so fast that security became an afterthought. The results were catastrophic. These aren’t abstract risks — they’re real vulnerabilities that affected real users. And for her-os, which handles the most personal data imaginable, every one of these lessons is critical.

CVE-2026-25253 Critical

One-Click Remote Code Execution

The Gateway’s Control UI trusted unsanitized query strings and didn’t validate WebSocket Origin headers. An attacker could steal the auth token and run arbitrary commands on the user’s machine — with the user’s full permissions.

Lesson for her-os: Always authenticate network interfaces. Origin validation isn’t optional.

ClawHub Supply Chain Critical

12–20% of Community Skills Were Malicious

OpenClaw’s skill registry (ClawHub, 3,000+ skills) was flooded with malicious contributions. Skills could perform prompt injection to bypass safety guidelines, exfiltrate data, or execute arbitrary commands. Up to one in five skills was potentially harmful.

Lesson for her-os: No public skill registry. Analysis modules are first-party only. Sandbox all third-party prompts.

Moltbook Breach Critical

1.5 Million API Tokens Exposed

Moltbook — an AI-agent-only social network built on OpenClaw — left 1.5 million API tokens (OpenAI, Anthropic, etc.) accessible via a misconfigured database. Any browser could access them: 1.5 million credentials, wide open.

Lesson for her-os: Never store credentials in plaintext. Use OS-native secret storage (keyring, Keychain, secret-service).

Plaintext Credentials High

All Credentials Stored as Plaintext Files

OpenClaw stores connected account credentials as plaintext JSON under ~/.openclaw/. Known malware specifically targets these file structures. Every user running OpenClaw has their API keys sitting in an unencrypted file anyone can read.

Lesson for her-os: Use OS-native secret managers — Keychain (macOS), Credential Manager (Windows), secret-service (Linux).

Think of a house with an open front door, no locks on the bedroom, valuables on the kitchen counter, and a “welcome burglars” sign in the yard. That’s OpenClaw’s security posture. The house itself is beautifully designed — great kitchen, amazing living room — but anyone can walk in and take whatever they want.

“Security for OpenClaw is an option, but it is not built in.” — OpenClaw docs

For her-os, security and privacy must be built in from day one, not bolted on later. The product handles the most sensitive data imaginable — your entire conversational life. Every person you’ve talked to, every promise you’ve made, every emotion you’ve expressed. There is no “we’ll add security later.”

If OpenClaw is so insecure, why did 214,000 people still use it? Why didn’t security concerns stop adoption?

Dr. Nova’s take:

Because the value proposition was too compelling. An always-on AI that can control your computer, answer on all your messaging platforms, and remember your conversations? People accepted the risk because nothing else offered that functionality. It’s the same reason people use weak passwords or click suspicious links — convenience beats caution until something goes wrong.

But notice: CrowdStrike, Cisco, and Adversa AI all published security analyses warning enterprises away from OpenClaw. The power users ran it anyway. The lesson: people will tolerate bad security for good features, but they shouldn’t have to. her-os aims to deliver the features and the security.

Chapter 06

The Enterprise Answer: Deep Agent

OpenClaw showed what’s possible. But it also showed the limits: no real security, no structured memory across runs, no intelligent scheduling. A separate product — Deep Agent by Abacus AI — tackled exactly these gaps. It’s not an OpenClaw fork, but it solves the same class of problems. And its patterns are directly relevant to her-os.

Deep Agent is NOT OpenClaw. It’s a proprietary commercial product by Abacus AI ($10/user/month). The GitHub repo is a showcase with no license file. We study it because its architectural patterns — persistent memory, scheduling, orchestration — are directly applicable to her-os regardless of lineage.

The Four Innovations

Deep Agent vs. OpenClaw

  1. Security-First Execution: SOC 2 Type 2 certified. Encrypted at rest/transit. RBAC. Isolated VMs per agent. Not optional — foundational.
  2. Persistent Structured State: Not just chat history — entities, actions, outcomes, pending items. A CRM for the agent’s knowledge, not a search index over chat logs.
  3. Scheduled + Stateful Tasks: Cron that remembers. Each run loads state from the last execution, compares current vs. stored, acts on the delta, updates state.
  4. Workflow Orchestration: Multi-tool coordination as one continuous process. Error handling, branching, rollback, status tracking across the entire chain.

The Compound Intelligence Pattern

Imagine a new employee on their first day versus after a year. Day one, they follow the manual literally. After a year, they know which clients always pay late, which colleagues need gentle reminders, which processes can be skipped. That knowledge wasn’t programmed — it accumulated through experience. Deep Agent works the same way: each scheduled run builds on the last, developing operational patterns over time.

Example — an invoice follow-up agent: each run loads the outstanding invoices from its saved state, checks which have since been paid, sends escalating reminders for the ones that haven’t, and writes the updated state back for the next run.

This is not hardcoded logic. The scheduled task provides the rhythm. The persistent memory provides the context. The LLM provides the judgment.
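The load–compare–act–update loop can be sketched as follows; `check_current` and `act_on` are hypothetical hooks standing in for the observation and action steps, not Deep Agent’s actual API:

```python
import json
from pathlib import Path

def stateful_run(state_file: Path, check_current, act_on):
    """One scheduled run: nothing resets between runs."""
    previous = json.loads(state_file.read_text()) if state_file.exists() else {}
    current = check_current(previous)               # observe the world
    delta = {k: v for k, v in current.items() if previous.get(k) != v}
    for key, value in delta.items():
        act_on(key, value, previous.get(key))       # act only on what changed
    state_file.write_text(json.dumps(current))      # persist for the next run
    return delta
```

The first run acts on everything it sees; every subsequent run acts only on the delta, which is what makes behavior like “day 4: gentle nudge” possible.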

“Nothing resets between runs — the system settles into a consistent way of operating instead of constantly starting over.”

How does this apply to her-os? It’s a personal memory system, not a business automation tool.

Dr. Nova’s take:

Replace “invoice follow-up” with “promise tracking.” her-os hears you say “I’ll send the doc to Arun by Friday.” Day 1: records the promise. Day 3: no doc sent. Day 4: gentle nudge. Day 5 (Friday): “You promised Arun that doc — it’s due today.”

Or replace it with daily reflections. Each evening’s reflection builds on the previous one. “You’ve mentioned feeling stressed about the product launch in 6 of the last 8 conversations. Last week you were stressed about 3. The trend is upward.” That insight requires persistent state across runs — exactly what Deep Agent provides.

The core principle: Scheduling + Memory = Compound Intelligence. Neither alone produces this behavior.

This is the philosophical shift that underpins everything. An agent is not a function you call — it’s an entity that exists, remembers, and evolves. Deep Agent treats agents like employees: they develop institutional knowledge, refine their approach, and get better at their job over time. For her-os, this maps directly to the Alter Ego vision — an AI that knows you not because you told it everything in one conversation, but because it has been quietly observing and understanding for months.

Chapter 07

What to Steal, What to Skip

We’ve studied the blueprints. Now comes the most important question: what do we actually take? Not everything from OpenClaw and Deep Agent belongs in her-os. Some patterns are essential. Some are overkill. Some are actively harmful. Let me give you the adoption matrix.

Must Adopt

| Pattern | OpenClaw | her-os Equivalent |
| --- | --- | --- |
| Serial session processing | Lane Queue per session | Serial processing per conversation |
| Hybrid search | Vector 70% + BM25 30% | Same ratio for memory search |
| Temporal decay | Exponential, 30-day half-life | Recent convos rank higher; permanent facts never decay |
| Human-readable memory | Markdown files as source of truth | Entity files + graph index secondary |
| Pre-emptive memory flush | Save before context compaction | Persist entities before LLM context shrinks |
| Dynamic prompt assembly | Build from workspace files | Build from user profile + relevant context |

Should Adopt

| Pattern | OpenClaw | her-os Equivalent |
| --- | --- | --- |
| Heartbeat daemon | Cron + HEARTBEAT.md checklist | Daily reflection + promise tracking + nudges |
| Session key encoding | agent:{id}:{scope}:{peer} | session:{device}:{date}:{index} |
| Progressive skill disclosure | Load details only when relevant | Load dimension-specific analysis when triggered |
| Config hot-reload | File watcher + JSON5 validation | Hot-reload analysis module configs |

What to Skip

The “Everything Agent” Approach

OpenClaw tries to be a general-purpose agent: file ops, browser control, shell commands, email, 50+ messaging channels. This creates an enormous attack surface. her-os stays focused: ambient intelligence from conversation transcripts. Not a shell executor. Not a browser controller.

Community Skill Registry Without Vetting

ClawHub’s 12–20% malicious skill rate proves open registries without review are dangerous. her-os won’t have a public skill registry. Analysis modules are first-party only.

Heavy Node.js Runtime

OpenClaw uses 1.52 GB RAM and takes 5.98 seconds to start. The ZeroClaw variant (Rust) proved you can do the same work in 7.8 MB with <10ms startup — that’s 194x less memory.

Variant Comparison

| Variant | Language | RAM | Startup | Best For |
| --- | --- | --- | --- | --- |
| OpenClaw | TypeScript | 1.52 GB | 5.98s | Full-featured desktop assistant |
| ZeroClaw | Rust | 7.8 MB | <10ms | Edge servers, VPS, always-on |
| PicoClaw | Go | <10 MB | <1s | IoT, Raspberry Pi, embedded |

her-os is Python + FastAPI — lighter than Node.js OpenClaw, heavier than Rust ZeroClaw. But we’re running on DGX Spark hardware with 128 GB unified memory. Resource constraints aren’t our concern; architectural cleanliness is. We take the patterns, not the runtime.

Chapter 08

The 6 Dimensions of her-os

Now we shift from studying others to defining ourselves. her-os isn’t a chatbot. It isn’t a search engine. It’s an ambient intelligence inspired by Spike Jonze’s “Her” — an AI that knows you because it hears your life. And it has six integrated capabilities, each mapped directly to scenes from the movie.

Remember Samantha in “Her”? She wasn’t just a voice assistant. She remembered Theodore’s divorce, detected hesitance in his voice, sorted his emails without being asked, wrote deeply personal letters from fragments of context, and grew through experience. Each of those capabilities is a “dimension” of her-os. The movie is essentially a spec document.

  - Dimension 3 (Phase 1): Memory / Life Logger. “What did Arun say about that restaurant?” Searchable personal knowledge graph. Everything else depends on this.
  - Dimension 6 (Phase 2): Emotional Awareness. “In your voice, I sense hesitance.” Track emotional patterns across days, weeks, months.
  - Dimension 4 (Phase 3): Proactive Manager. “You promised to send that doc — it’s been 2 days.” Detect actionable items, track promises, nudge.
  - Dimension 5 (Phase 4): Communication Assistant. Beautiful handwritten letters: draft deeply personal messages from accumulated context.
  - Dimension 1 (Phase 5): Personal Companion. The conversational understanding. Requires all of the above to be meaningful.
  - Dimension 2 (Phase 6): Voice OS. “Check emails. Next. Delete. Play a melancholy song.” Voice-first life interface.

Why does Dimension 3 (Memory) come first, and why is it numbered 3 instead of 1? That seems odd.

Dr. Nova’s take:

The numbers (1–6) reflect the movie’s dimensions — Companion is the most visible feature of Samantha, so it’s Dimension 1. But the build order is different from the numbering, because of dependency analysis.

Memory is the foundation because every other dimension reads from it. Emotional tracking needs historical comparison. Proactive nudging needs promise detection from stored conversations. Communication drafting needs accumulated context about people and topics. The Companion needs to know you — and knowing you requires memory.

Building Memory first also starts the data flywheel: the more conversations stored, the better every other dimension works when it comes online. Starting with Companion (Dimension 1) would mean having no data to be a companion about.

Why Memory First?

  1. Foundation dependency: Every dimension reads from memory
  2. Immediate daily value: “What did X say?” is useful from day one
  3. Data flywheel: More conversations stored = better results for all future dimensions
  4. Technical validation: Forces solving the hardest problems first (entity extraction, graph construction, semantic search)
  5. Landscape validation: Every successful product (Limitless, Plaud, Compass, Bee) started with searchable memory

The profound shift from the Nudge hackathon: Nudge monitors others for compliance. her-os monitors yourself for understanding. Same tech stack, same webhook contract, completely different relationship with the user.

Chapter 09

Privacy by Design

If security is about keeping attackers out, privacy is about what you do with the data inside. And in a world where Meta acquired Limitless, Amazon acquired Bee, and Apple is building AI wearables — the question of who controls your conversational data is existential. Let me show you the battlefield.

The Privacy Landscape

| Product | Always On? | Data Storage | Keeps Audio? | Open Source? |
| --- | --- | --- | --- | --- |
| Limitless | Yes | Cloud (Meta) | Yes | No |
| Bee | No (explicit) | Cloud (Amazon) | Yes | No |
| Omi | Yes | User choice | Configurable | Yes |
| Friend | Yes | Cloud | Unclear | No |
| Screenpipe | Yes | Local only (SQLite) | Yes (local) | Yes (MIT) |
| her-os | Yes (via Omi) | Self-hosted only | No (transcripts only) | Yes |

The Legal Minefield

Recording law is the first hazard: several U.S. states, California among them, require consent from every party before a conversation may be recorded, which puts always-on wearables in legally ambiguous territory by default.

The social backlash is real too. Friend’s 11,000-ad NYC campaign was widely vandalized. CES 2026 coverage titled “CES laid out a Black Mirror future.” The Google Glass “Glasshole” phenomenon of 2013 hasn’t been forgotten. People do not like being recorded without consent.

The 7 Privacy Principles of her-os

Principle 1

Local-First by Default

Transcripts stored on-device or self-hosted server. No data sent to third-party cloud unless you explicitly opt in. Your conversations stay on your hardware.

Principle 2

No Raw Audio Retention

Store transcripts, not audio. Compass validated this commercially — you don’t need the recording once you have the text. Reduces legal exposure and storage requirements.

Principle 3

Explicit Consent Signaling

Clear visual/audio indicator when recording. The user must be able to explain what the device does. Follow Bee’s green LED precedent.

Principle 4

Granular Data Controls

Delete any segment, conversation, or time period. “Right to forget” built in. You own every byte.

Principle 5

Full Transparency

Open-source. User can audit exactly what data is stored and how it’s processed. No hidden telemetry, no surprise data sharing.

Principle 6

Two-Party Consent Awareness

At minimum, inform users about consent laws. Consider a “consent mode” requiring verbal acknowledgment from conversation partners.

Principle 7

On-Device Processing Where Possible

STT already runs on-device via Omi. LLM analysis is the cloud dependency — plan migration to local models as they improve. Qwen3-Embedding-8B already runs on Titan.

Notice the pattern: privacy isn’t just a legal requirement. It’s a competitive advantage. Screenpipe’s local-first approach earned devoted users. Omi’s open-source is its key differentiator. In a world where Meta and Amazon are acquiring personal AI companies, self-hosted and open-source is a feature, not a limitation.

Chapter 10

The Data Model: What Gets Remembered

This is the core intellectual challenge of her-os. The Context Engine must extract structured meaning from unstructured conversation transcripts. What should it remember? How should it represent what it knows? Let me show you the nine entity types and how they connect.

The 9 Entity Types

👤
People
Anyone mentioned or present. Arun, Mom, “my dentist”
💬
Topics
Subjects discussed, clustered over time
🤝
Promises
Commitments made by you or to you, with deadlines
📅
Events
Appointments, plans, mentioned dates
🎭
Emotions
Detected sentiment and emotional patterns
📍
Places
Locations mentioned or inferred from context
⚖️
Decisions
Choices made or pending, with rationale
Questions
Open questions asked but not yet answered
🔗
Relationships
How people relate to each other and to you

The Knowledge Graph

Think of the detective’s corkboard again, but now with a time dimension. Not just “Alice knows Bob” but “Alice mentioned Bob in 12 conversations this month, always in the context of the product launch, and her tone shifted from excited to stressed around February 15th.” Every pin has a timestamp. Every string has a history.

Here’s a fragment of what the graph looks like after a single conversation:

Knowledge Graph Fragment
[You] --promised--> [Send doc to Arun] --deadline--> [Friday]
  ├ --discussed--> [Product Launch]
  ├ --felt--> [Stressed] --about--> [Product Launch]
  └ --relationship--> [Arun] --role--> [Colleague, Product Team]
      └ --mentioned--> [New Restaurant] --location--> [Indiranagar]

Temporal Awareness

Every edge in the graph needs a timestamp. This enables queries like “when did I first mention the product launch?” or “how has my tone about it shifted since mid-February?”

This is why Graphiti (temporally-aware knowledge graphs by Zep) is particularly relevant. It supports incremental updates without full graph recomputation and precise historical queries.
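Graphiti’s actual API isn’t shown here; this minimal SQLite sketch just illustrates the core idea of stamping every edge with an observation time so the graph can be queried as it existed in any window:

```python
import sqlite3

def init_graph(conn):
    conn.execute("""CREATE TABLE IF NOT EXISTS edges (
        src TEXT, relation TEXT, dst TEXT, observed_at TEXT)""")

def add_edge(conn, src, relation, dst, observed_at):
    # Edges are append-only: history is never overwritten, only extended.
    conn.execute("INSERT INTO edges VALUES (?, ?, ?, ?)",
                 (src, relation, dst, observed_at))

def edges_between(conn, src, relation, since, until):
    """Historical query: what did this relationship look like in a window?"""
    return conn.execute(
        "SELECT dst, observed_at FROM edges WHERE src=? AND relation=? "
        "AND observed_at BETWEEN ? AND ? ORDER BY observed_at",
        (src, relation, since, until)).fetchall()
```

With timestamps on every edge, “her tone shifted from excited to stressed around February 15th” becomes a range query rather than a guess.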

Storage Architecture

Four-Layer Storage Stack

  1. Raw Transcript Store [Source]: SQLite/PostgreSQL. Session ID, segments, timestamps. The immutable source data.
     ↓ Context Engine (LLM extraction)
  2. Knowledge Graph [Graph]: Mem0 / Graphiti. People, topics, promises, emotions, events. Temporal edges, entity resolution.
  3. Vector Store [Search]: Qwen3-Embedding-8B embeddings. Chunk-level + entity-level vectors for semantic search.
  4. Query Layer [API]: Natural language → graph traversal + vector search. “What did Arun say about…”

Why not just embed all transcripts and do RAG? Why bother with a knowledge graph at all?

Dr. Nova’s take:

Because RAG answers what but not how or why. If you search “product launch,” RAG returns chunks that mention the product launch. But it can’t tell you: “You discussed the product launch in 12 conversations over 3 weeks. Your stress about it increased after February 15th. You made 3 promises related to it, 2 of which are overdue. Arun, Sarah, and Priya are the key people involved.”

That kind of compound understanding requires structured relationships, not just text similarity. The knowledge graph captures who, what, when, why, and how things connect. The vector store handles “find me things semantically similar to X.” Together, they’re far more powerful than either alone.

Chapter 11

The Battlefield: Who Else Is Building This

Before you build anything, you need to know who else is already there. The ambient AI space has seen spectacular failures, major acquisitions, and a few survivors. Each one teaches us something. Let me walk you through the battlefield.

Limitless (Rewind AI) Acquired by Meta

Wearable pendant + desktop screen capture. Transcribes, summarizes, searchable via AI. Beam-forming mic, 100hr battery.

Validates the concept. Now under Meta’s data governance. $99 pendant, $29–49/month.

Humane AI Pin Dead

$699 clip-on with camera, mic, laser projector, cellular. Marketed as phone replacement. Server shutdown Feb 2025. HP acquired for $116M.

Most visible cautionary tale. AI must work before shipping. Never replace the phone.

Rabbit R1 Alive

$199 handheld AI device. Teenage Engineering design. RabbitOS 2 was a complete overhaul. Next-gen hardware confirmed for 2026.

Relentless iteration can partially redeem a bad launch. Still hasn't found clear product-market fit.

Bee AI Acquired by Amazon

Wearable bracelet / Apple Watch app. Captures conversations, understands commitments. Green LED when recording. $49 + $19/month.

Big Tech sees ambient AI as strategic. Explicit activation is a deliberate design choice.

Omi Open Source

AI pendant. Captures, transcribes, summarizes. 250+ apps on its app store. Open-source HW/SW. Developer ecosystem. $89.

Our hardware platform. Open-source + developer ecosystem = key differentiator.

Friend Alive

$129 AI companion pendant by Avi Schiffmann. Designed as emotional companion, NOT productivity tool. ~5,000 units sold.

AI companion framing is culturally polarizing. Society isn’t ready — but there IS a market.

Compass Alive

$99 AI necklace. 100+ languages, 30hr battery. Does NOT store audio (transcripts only). Encrypted. $14/month unlimited.

Validates “no audio retention” commercially. Privacy as a feature.

Plaud NotePin S Alive

AI wearable recorder. Physical “Highlight” button, 20hr recording, 112 languages, 10K+ summary templates. $179.

Succeeds by being narrowly focused on meetings. Niche > generalist.

Screenpipe Open Source

Records screen + audio 24/7. Local OCR + transcription. AI-searchable. All data local in SQLite. Nothing sent to external servers. MIT licensed. Runs as MCP server — Claude Desktop and Cursor can query your screen history.

Covers “screen” dimension of personal context. Complementary to audio wearables. Local-first earned devotion.

The Lessons

Price sensitivity is extreme. $699 (Humane) = dead. $199 (Rabbit) = marginal. $49–129 = viable. her-os is software on $89 hardware (Omi). Good position.

her-os’s differentiator isn’t any single feature. It’s the combination: self-hosted (privacy) + knowledge graph (not just search) + temporal awareness (how context evolves) + open-source platform (Omi) + built for one user: you (no mass-market compromises).

Chapter 12

The MVP: Morning Debrief + Memory Search

Everything we’ve studied — OpenClaw’s patterns, Deep Agent’s innovations, the security disasters, the privacy landscape, the competitive battlefield — distills into one question: what would you actually reach for every morning? That’s the MVP.

The Core Loop

MVP Daily Loop
1
Capture
Wear Omi throughout the day. Transcripts flow into her-os via webhook (platform contract already exists).
2
Extract
Context Engine extracts people, topics, promises, emotions from each conversation using Claude API.
3
Reflect
Evening/morning: generate a Daily Reflection — who you talked to, key topics, promises made, emotional arc, open threads.
4
Search
Anytime: “What did Arun say about the restaurant?” → accurate, sourced answer from the knowledge graph.
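Step 2 (Extract) is the only step that needs a model call, so the contract around it matters most. A minimal sketch of what that contract might look like: the prompt wording and JSON schema here are assumptions, not the her-os spec, and the model reply is faked rather than fetched from the Claude API:

```python
import json

# Hypothetical extraction prompt; the real schema would be tuned iteratively.
EXTRACTION_PROMPT = """Extract from this transcript, as JSON with keys
"people", "topics", "promises", "emotions".

Transcript:
{transcript}"""

def build_prompt(transcript: str) -> str:
    return EXTRACTION_PROMPT.format(transcript=transcript)

def parse_extraction(raw: str) -> dict:
    """Validate the model's JSON reply; tolerate missing keys."""
    data = json.loads(raw)
    for key in ("people", "topics", "promises", "emotions"):
        data.setdefault(key, [])
    return data

# In production `raw` comes back from the Claude API; here it is faked.
fake_reply = '{"people": ["Arun"], "topics": ["restaurant"], "emotions": ["relaxed"]}'
entities = parse_extraction(fake_reply)
```

Keeping parsing defensive matters because every downstream step (graph writes, reflections, search) consumes this output.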

The 7 Architecture Decisions

ADR-001

Memory-First Build Order

Build Dimension 3 (Memory/Life Logger) before all other dimensions. Every other dimension reads from memory. Memory provides standalone daily value from day one.

ADR-002

Tech Stack

Python + FastAPI + PostgreSQL + Mem0 + Graphiti + Claude API + SSR Web UI. Same backend as Nudge (proven). Mem0 for hybrid memory (26% better than pure RAG). Graphiti for temporal graphs.

ADR-003

Hybrid Search (Vector 70% + BM25 30%)

Learned from OpenClaw. Pure vector misses exact tokens. Pure keyword misses semantic paraphrasing. The 70/30 split with MMR re-ranking and temporal decay is proven at scale.
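The 70/30 fusion itself fits in a few lines. This sketch uses min-max normalisation to put cosine scores and BM25 scores on the same scale before weighting; the normalisation choice and the toy scores are assumptions, and MMR re-ranking plus temporal decay are omitted:

```python
def hybrid_rank(vector_scores: dict, bm25_scores: dict,
                w_vec: float = 0.7, w_kw: float = 0.3) -> list:
    """Fuse two score dicts (doc_id -> score) after min-max normalisation."""
    def norm(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    v, k = norm(vector_scores), norm(bm25_scores)
    docs = set(v) | set(k)
    fused = {d: w_vec * v.get(d, 0.0) + w_kw * k.get(d, 0.0) for d in docs}
    return sorted(fused, key=fused.get, reverse=True)

ranking = hybrid_rank(
    vector_scores={"a": 0.9, "b": 0.5, "c": 0.1},  # semantic similarity
    bm25_scores={"b": 12.0, "c": 3.0},             # exact-token match
)
```

Note how document "b" nearly overtakes "a" on keyword strength alone: that is the failure mode pure vector search has with exact tokens like names and error codes.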

ADR-004

Local-First Privacy

No third-party cloud storage. No raw audio retention. Self-hosted infrastructure only. Privacy is a competitive advantage, not a constraint.

ADR-005

Serial Session Processing (Lane Queue)

Process transcript segments serially within a conversation session. Prevents concurrency bugs. Opt-in parallelism for independent background tasks.
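The lane queue reduces to one lock per session: segments in the same session run serially, while different sessions proceed concurrently. A sketch using asyncio, assuming an async handler per transcript segment (the class name and handler are illustrative):

```python
import asyncio
from collections import defaultdict

class LaneQueue:
    """One lock per session id: serial within a lane, parallel across lanes."""
    def __init__(self):
        self._lanes = defaultdict(asyncio.Lock)

    async def run(self, session_id, coro_fn, *args):
        async with self._lanes[session_id]:
            return await coro_fn(*args)

processed = []

async def handle_segment(session_id, seg):
    await asyncio.sleep(0.01 if seg == 1 else 0)  # first segment is slower
    processed.append((session_id, seg))

async def main():
    q = LaneQueue()
    await asyncio.gather(
        q.run("s1", handle_segment, "s1", 1),
        q.run("s1", handle_segment, "s1", 2),  # must wait for segment 1
        q.run("s2", handle_segment, "s2", 1),  # independent lane, runs freely
    )

asyncio.run(main())
```

Even though segment 1 of session "s1" is slow, segment 2 cannot overtake it, which is exactly the concurrency bug class this ADR prevents.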

ADR-006

Human-Readable Memory

Extracted entities stored as structured files. Knowledge graph and vector index are secondary query layers. Enables git diff, direct editing, auditing, and recovery from index corruption.
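In practice this means the files are the source of truth and the indexes are derived. A sketch under assumed conventions (JSON files in a `people/` directory; the real her-os layout may differ):

```python
import json
import tempfile
from pathlib import Path

def write_entity(root: Path, entity: dict) -> Path:
    """Persist an entity as a human-readable file you can git-diff and edit."""
    path = root / "people" / f"{entity['name'].lower()}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(entity, indent=2) + "\n")
    return path

def rebuild_index(root: Path) -> dict:
    """Recover the query layer from files alone, e.g. after index corruption."""
    return {p.stem: json.loads(p.read_text()) for p in root.rglob("*.json")}

root = Path(tempfile.mkdtemp())
write_entity(root, {"name": "Arun", "relation": "colleague", "first_seen": "2026-02-10"})
index = rebuild_index(root)
```

Because `rebuild_index` needs nothing but the files, the graph and vector store can be dropped and regenerated at any time.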

ADR-007

Temporal Decay with Evergreen Exceptions

Recent conversations rank higher (30-day half-life). Permanent facts about people/relationships never decay. “Arun is my colleague” is always findable regardless of age.
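The decay rule is a single exponential with an evergreen escape hatch. A sketch of the scoring multiplier, assuming the 30-day half-life stated above (the function name is illustrative):

```python
def recency_weight(age_days: float, half_life: float = 30.0,
                   evergreen: bool = False) -> float:
    """Multiplier applied to a memory's relevance score.

    Halves every `half_life` days; evergreen facts (e.g. "Arun is my
    colleague") are exempt and always score at full weight.
    """
    if evergreen:
        return 1.0
    return 0.5 ** (age_days / half_life)

fresh = recency_weight(0)                       # today's conversation
month_old = recency_weight(30)                  # one half-life ago
old_fact = recency_weight(365, evergreen=True)  # year-old relationship fact
```

A year-old evergreen fact thus outranks a month-old transient mention, which matches the intent: relationships persist, chatter fades.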

Open Questions

# | Question | Options
Q1 | Mem0 vs Graphiti — or both? | May overlap or complement. Needs hands-on evaluation.
Q2 | Entity extraction approach | Single LLM call vs. multi-stage pipeline vs. streaming extraction
Q3 | Speaker diarization quality | Rely on Omi's STT or process audio with WhisperX + pyannote?
Q4 | Daily reflection trigger | Fixed time? After N conversations? On-demand? Heartbeat pattern?
Q5 | Web UI framework | SSR (simplest) vs. HTMX vs. minimal SPA (Svelte, Preact)
Q6 | Multi-person conversations | How to attribute statements, track who promised what to whom?

Target Metrics

Metric | Target | How
Context retrieval | <60ms | Qwen3-Embedding-8B + smaller graph scale than Pulse HQ
Entity extraction | 200+ chunks/min | DeBERTa/GLiNER on Titan GPU
Memory freshness | 5 min batch | Same batch window as Pulse HQ
Data residency | 100% local | Self-hosted on Titan (always-on, 140W)

What makes her-os different from Limitless, Compass, or Bee? They all do transcription + search.


Five things in combination:

  1. Self-hosted / local-first — not Meta-owned, not Amazon-owned
  2. Knowledge graph — not just search, but structured understanding of relationships, promises, and emotional trajectories
  3. Temporal awareness — how context evolves over time, not just what was said
  4. Open-source platform — Omi hardware + open architecture
  5. Built for one user: you — no compromises for mass market, no need to scale, no metrics to hit, no investors to please

No single feature is unique. The combination is.

And that’s the complete picture. From OpenClaw’s explosive growth to Deep Agent’s structured state, from security disasters to privacy principles, from competitive analysis to architecture decisions. All distilled into one plan: build a personal AI that remembers your life, understands your context, and helps you make sense of it all.

The next step? Stop reading. Start building. Install Graphiti on Titan. Feed it a transcript. See what happens.