LENS 26 — QUESTION HORIZON: CROSS-LENS CONNECTIONS
===================================================

PRIMARY CONNECTIONS
-------------------

LENS 01 (Constraint Hierarchy — Temporal Surplus as Free Signal):

Lens 01 identified that Annie's 58 Hz surplus creates "temporal free signal" — far more frames per second than basic navigation decisions require. It catalogued constraints at multiple levels: physics → convention → dissolved. Lens 26 provides the next-order question that Lens 01 could not ask: now that the surplus exists, what is the optimal allocation? Branch 1 of Lens 26 asks whether alternating-query dispatch at 29 Hz nav + 10 Hz scene + 10 Hz obstacle is the best distribution, or whether there is a discovery-based optimal split that changes with room type (a cluttered living room and an empty hallway require different allocation ratios).

Lens 01's "dissolved constraint" category should now include: "temporal interleaving enables multi-task perception without frame-level parallelism." But Lens 26 adds a caveat: the interleaving may introduce task-lag artifacts (frame 3's obstacle report describes a moment captured between frames 2 and 4's nav queries). That artifact was not visible as a question until the multi-query pipeline was proposed.

Cross-citation: Lens 01's 86 ms EMA window is the same 86 ms that Lens 26 Branch 2 identifies as the almost-answered "EMA vs sensor fusion" question. The temporal surplus and the EMA window are two descriptions of the same design variable.

LENS 05 (Value Mapping — Privacy and Behavioral Signals):

Lens 26 Branch 3 asks whether Annie's semantic map transfers between homes. This question has an immediate privacy consequence that Lens 05 should pick up. A semantic map is not just spatial geometry — it encodes behavioral patterns. "Kitchen frequently visited between 7:00 and 9:00 AM" is a health signal. "Bedroom entered at irregular hours" is a behavioral signal. "Bathroom pattern changed in week 3" is a medical signal.
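As an aside on the Lens 01 cross-citation above: the shared 86 ms EMA window can be made concrete as a frame-rate-aware exponential moving average. This is a minimal sketch, not the team's implementation — the 58 Hz rate and 86 ms window come from the source; the smoothing derivation and the sample signal are illustrative assumptions.

```python
import math

FRAME_HZ = 58      # Annie's frame rate (from the source)
WINDOW_S = 0.086   # 86 ms EMA window (from the source)

# Derive a per-frame smoothing factor so the EMA's effective time
# constant matches the 86 ms window at 58 Hz (one common convention;
# the actual stack may define the window differently).
ALPHA = 1.0 - math.exp(-1.0 / (FRAME_HZ * WINDOW_S))

def ema_update(prev: float, sample: float, alpha: float = ALPHA) -> float:
    """One EMA step: blend the new per-frame sample into the running estimate."""
    return prev + alpha * (sample - prev)

# Usage: smooth a noisy per-frame signal (e.g. an obstacle-distance reading).
est = 1.0
for reading in [1.0, 1.2, 0.9, 1.1]:
    est = ema_update(est, reading)
```

The point of the sketch is that the window is a single design variable: changing WINDOW_S trades temporal surplus (responsiveness) against smoothing, which is exactly the "EMA vs sensor fusion" question Branch 2 leaves open.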
If that map is uploaded as a product SKU and transferred to new users, it carries those behavioral embeddings in latent form. The fraction of the map that is "universally transferable" may be higher than expected precisely because universal semantic anchors are correlated with universal behavioral patterns.

Cross-citation: Lens 05's privacy model needs to explicitly address the "map as product" scenario that Lens 26 Branch 3 introduces. The question "what is transferable in the semantic map?" is also the question "what behavioral signals escape the home?"

LENS 08 (Analogy Bridge — Neuroscience Mechanisms):

Lens 26 Branch 5 (outsider question: "why does the robot need language?") and Lens 08 converge on the same architectural observation from different directions. Lens 08 notes that rat hippocampal place cells encode spatial identity directly as activation patterns, not as verbal descriptions. Lens 26 Branch 4 notes that text2nav achieves 74% navigation success using frozen SigLIP embeddings alone. Both observations point at the same thing: the vision encoder's activation space is a sufficient representation for spatial navigation; the text-decoding step is an unnecessary intermediate. Lens 08's hippocampal replay mechanism (recommended implementation: store vision embeddings keyed by SLAM pose during overnight consolidation) is the architectural implementation of Lens 26's "bypass text" convergence finding.

Cross-citation: Lens 08 explicitly recommended that replay-time processing store vision encoder embeddings (not text descriptions) keyed by SLAM pose. Lens 26 confirms this from three independent question branches. The two lenses are co-specifying the same architecture from different starting points.

LENS 14 (Historical Pattern — Research Describes Waymo, Does the Opposite):

Lens 14 found that the research describes the Waymo pattern (lidar-primary, camera supplementary) and then implements the opposite (VLM-primary, lidar supplementary).
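Lens 08's recommendation above — store vision encoder embeddings keyed by SLAM pose during overnight consolidation — could be sketched as a small replay store. Everything here is hypothetical: the grid cell size, the class names, and the averaging-based consolidation are illustrative assumptions (the embedding width matches SigLIP ViT-SO400M's 1152, but verify against the deployed model).

```python
from collections import defaultdict

EMBED_DIM = 1152  # SigLIP ViT-SO400M embedding width (assumed)

def pose_key(x: float, y: float, theta: float, cell: float = 0.25) -> tuple:
    """Quantize a SLAM pose to a grid cell so nearby frames share one key."""
    return (round(x / cell), round(y / cell), round(theta, 1))

class ReplayStore:
    """Vision embeddings keyed by quantized SLAM pose, consolidated offline."""

    def __init__(self):
        self._store = defaultdict(list)

    def record(self, pose, embedding):
        """Runtime path: log the frame's embedding under its pose cell."""
        self._store[pose_key(*pose)].append(embedding)

    def consolidate(self):
        """Overnight pass: average embeddings per cell into one prototype,
        the replay-time step Lens 08 assigns to Titan."""
        return {
            key: [sum(vals) / len(vals) for vals in zip(*embs)]
            for key, embs in self._store.items()
        }

store = ReplayStore()
store.record((1.02, 3.98, 0.0), [0.1] * EMBED_DIM)
store.record((1.05, 4.01, 0.0), [0.3] * EMBED_DIM)  # same 25 cm cell
prototypes = store.consolidate()  # one averaged prototype per visited cell
```

Note that no text label ever enters the store — the key is geometry and the value is the raw activation pattern, which is the place-cell analogy in code form.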
Lens 26 finds an analogous inversion: the research builds toward language-grounded semantic maps (the VLMaps pattern: CLIP embeddings, a language-queryable map) and simultaneously identifies reasons to remove language from the Tier 2 perception loop. Both cannot be maximally true. The VLMaps architecture requires language at the map query interface ("where is the kitchen?") but not at the frame-processing interface ("what is in this frame?"). Lens 26 makes the distinction explicit: language at Tier 1 (strategic goal interpretation) is load-bearing; language at Tier 2 (tactical frame processing) is a relay station that adds latency and hallucination risk without contributing to navigation accuracy. The architectural resolution: keep text at Tier 1, bypass text at Tier 2. This resolution was not articulable from inside the research because the inversion (Lens 14's finding) was not visible.

Cross-citation: Lens 14's "research describes X then does not-X" pattern is a general signal that the team's explicit commitments (use language-grounded maps) and its implicit architecture (VLM-primary, embedding-fast-path) are in tension. Lens 26 Branch 5 turns that tension into a question: which commitment should win?

SECONDARY CONNECTIONS
---------------------

LENS 02 (Abstraction Leak — Pico RP2040 REPL Crash as Invisible Failure):

Lens 02 identified that the most dangerous failures in Annie's stack are invisible abstraction leaks — the IMU crash that silently degraded navigation without any error message. Lens 26 Branch 5 introduces a new potential invisible failure: if text decoding is removed from Tier 2 and replaced with embedding-to-command linear probes, the failure mode changes from "VLM hallucinated LEFT when obstacle was on the right" (human-readable, debuggable from logs) to "embedding distance 0.73 fell below threshold for sector 2" (numeric, requiring visualization tooling to debug). The abstraction leak risk increases when text is bypassed.
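One way to blunt that leak is to make every embedding-mediated decision emit a human-readable trace alongside the numerics. A minimal sketch of the idea — the sector names, threshold, and log format are all hypothetical, not the project's actual observability plan:

```python
SECTORS = ["left", "center", "right"]  # hypothetical steering sectors
CLEAR_THRESHOLD = 0.75                 # hypothetical cosine-similarity cutoff

def explain_decision(similarities: list[float]) -> str:
    """Translate raw per-sector similarity scores into a log line a human
    can debug by inspection, standing in for the lost 'VLM said LEFT
    MEDIUM' entry."""
    best = max(range(len(similarities)), key=lambda i: similarities[i])
    verdict = "CLEAR" if similarities[best] >= CLEAR_THRESHOLD else "BLOCKED"
    detail = ", ".join(
        f"{name}={score:.2f}" for name, score in zip(SECTORS, similarities)
    )
    return f"nav: {SECTORS[best].upper()} {verdict} ({detail})"

line = explain_decision([0.81, 0.42, 0.73])
# e.g. "nav: LEFT CLEAR (left=0.81, center=0.42, right=0.73)"
```

The trace costs one string format per decision but restores log-level debuggability, which is the specific property the text bypass would otherwise remove.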
Lens 26's outsider question about explainability cost directly instantiates Lens 02's concern about invisible failures. Any implementation of "bypass text layer" must include an explicit observability plan: what replaces the "VLM said LEFT MEDIUM" log entry?

LENS 03 (Dependency Audit — llama-server Embedding Blocker):

Lens 03 identified the llama-server embedding extraction blocker as the highest-leverage addressable dependency in Annie's stack. The research itself notes that llama-server does not cleanly expose intermediate embeddings for multimodal inputs, and recommends deploying a separate SigLIP 2 ViT-SO400M (~800 MB VRAM) as a dedicated embedding extractor. Lens 26 Branch 1 (task-parallelism questions) and Branch 4 (cross-field, text-free architecture) both depend on resolving this blocker first. The SigLIP 2 deployment is the prerequisite for every branch of the "bypass text" convergence finding.

Cross-citation: Lens 03's blocker analysis gives the sequencing for Lens 26's convergence implementation: (1) deploy SigLIP 2 on Panda; (2) profile text-decode vs embed-only latency; (3) train a linear probe; (4) A/B test. None of steps 2-4 is possible until step 1 unblocks the embedding extraction path.

LENS 04 (Network Topology — WiFi Cliff Edge at 100 ms):

Lens 26 Branch 3 (semantic map transfer) has an important WiFi dependency that Lens 04's findings constrain. Semantic map transfer (uploading/downloading concept graphs between homes) requires network connectivity for the initial transfer, but runtime navigation must remain WiFi-independent. The "map as product" scenario Lens 26 introduces must be designed so that the transferred concept graph is fully cached locally on Annie before deployment, not streamed at runtime. Lens 04's cliff-edge finding (navigation degrades sharply above 100 ms RTT) means any architecture where the transferred map requires Titan or cloud lookups during active navigation is fragile.
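The "fully cached before deployment, never streamed" constraint above could be enforced with a preflight check run at setup or charging time. This is a sketch under assumed conventions: the cache directory layout, manifest format, and `.emb` file naming are all hypothetical.

```python
import json
from pathlib import Path

def preflight_map_cache(cache_dir: str) -> bool:
    """Refuse to start navigation unless every concept embedding named in
    the transferred map's manifest is already present in the local on-Pi
    cache, so runtime never depends on Titan/cloud lookups over WiFi."""
    root = Path(cache_dir)
    manifest = root / "manifest.json"   # hypothetical manifest location
    if not manifest.exists():
        return False
    concepts = json.loads(manifest.read_text())["concepts"]
    return all((root / f"{c}.emb").exists() for c in concepts)

# Usage: gate the nav loop on the preflight result during setup/charging.
# if not preflight_map_cache("/var/annie/map_cache"):   # hypothetical path
#     raise RuntimeError("concept graph not fully cached; refusing to navigate")
```

Failing closed here converts Lens 04's runtime cliff edge (latency spike mid-navigation) into a setup-time error, which is the cheap place to catch it.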
Cross-citation: the map transfer architecture must specify: (a) transfer happens during setup/charging, not during runtime; (b) the on-Pi cache of concept embeddings is the runtime source; (c) Titan enrichment of the map happens offline (Lens 08's hippocampal replay pattern).

LENS 10 (Post-Mortem — "Built the Fast Path, Forgot the Slow Path"):

Lens 10's post-mortem finding ("we built the fast path, forgot the slow path") has a direct parallel in Lens 26's convergence finding. The "fast path" (text decoding at 58 Hz, human-readable nav commands) was built because it was the natural output format for a language model. The "slow path" (embedding extraction, cosine similarity, linear probe training on labeled frames) was not built because it required a separate infrastructure step (SigLIP 2 deployment, a data pipeline for frame logging, a training loop for linear probes). The text layer is the fast path. The embedding layer is the slow path. The research proves the fast path works. The slow path — which the convergence finding suggests may work better — was not forgotten through negligence but for the same structural reason Lens 10 identified: the slow path requires planning and infrastructure that the fast path does not.

Cross-citation: Lens 10's retrospective framing ("what went wrong and when?") explains why the text layer was chosen — it was the fast path to a working system. Lens 26 provides the prospective framing: now that the fast path works, the slow path is the next investment.

NOVEL PREDICTIONS FROM QUESTION HORIZON (not in any other lens)
---------------------------------------------------------------

1. THE SEMANTIC MAP BUSINESS MODEL: No other lens addresses the commercial consequence of semantic map transfer.
If Annie's concept embeddings (not coordinates) are the map, and if 60-70% of those embeddings are universal (home-layout patterns that repeat across all homes), then the transferred concept graph is a form of pre-trained spatial knowledge. A new user's Annie would not start from a blank SLAM occupancy grid — she would start from a graph where "kitchen-ness," "bathroom-ness," and "hallway-ness" are already recognized. The exploration-to-functional ratio (how long before Annie is useful in a new home) drops from weeks to hours. This is not a navigation research finding. It is a product design finding that the navigation research makes askable for the first time.

2. THE EXPLAINABILITY TRADE: No other lens explicitly frames the text-bypass as a trade between navigation accuracy and debugging transparency. Every other lens that mentions bypassing text (Lens 08, Lens 14) treats it as a pure performance improvement. Lens 26 Branch 5 adds the countervailing concern: text-mediated nav is debuggable by inspection. Embedding-mediated nav requires visualization tooling, dimensionality reduction, and human-in-the-loop evaluation of cosine similarity thresholds. The transparency trade is a product design and safety concern that must be decided explicitly, not defaulted. The question "is the text layer retained for debugging convenience or for navigation performance?" is a governance question that no other lens raised.

3. THE TASK-MINIMUM FREQUENCY QUESTION: Branch 1 asks: what is the minimum nav frequency before task performance degrades? No empirical answer exists. The research proposes 29 Hz nav (frames 0, 2, 4 in a 6-frame cycle). But if 15 Hz nav works, the remaining 43 Hz can be allocated to embedding extraction and place recognition, enabling real-time topological map building without a separate SigLIP 2 model.
This would collapse Phase 2d (embedding extraction) into Phase 2a (multi-query pipeline) — a significant simplification that requires only one measurement: what is Annie's minimum viable nav frequency? No other lens identified this as the highest-leverage measurement in Phase 2a.

4. THE LANGUAGE-GEOMETRY MISMATCH: The outsider question from Branch 5 — "why does the robot need to understand language?" — identifies a category error in the architecture's design rationale that no other lens surfaced. Navigation is a geometric problem. Language is a communication protocol. Using a language model for navigation is appropriate if the mapping is (human language) → (robot action). It is a detour if the mapping is (visual embedding) → (text string) → (robot action), where the text string adds no information that the visual embedding did not already contain. Lens 26 is the only lens that makes this category distinction explicit. It suggests that the language model in Annie's stack should be evaluated against two different standards: (a) does it help the robot navigate? (b) does it help the human communicate with the robot? The answer to (a) may be "not as much as we thought" while the answer to (b) remains "yes, significantly." The architecture should reflect this distinction by separating the two functions, not conflating them in a single VLM inference call.

SESSION 119 ADDITIONS — DUAL-PROCESS HORIZON CROSS-LENS
--------------------------------------------------------

LENS 03 (Dependency Audit — llama-server Embedding Blocker) [SESSION 119 UPDATE]:

Lens 03's blocker analysis gains a new entry from session 119: the Hailo-8 AI HAT+ on Pi 5 is a 26 TOPS NPU that sits idle during navigation. It is not a blocker in the "missing capability" sense; it is the opposite — an unclaimed capability.
Lens 03 should register it as a negative-space dependency: every frame that Hailo does not process is a frame the VLM must process, and every VLM frame costs 25-40 ms of WiFi plus inference latency that Hailo would deliver in under 10 ms locally. The dependency audit is now: (1) llama-server embedding path (pre-existing, for Tier 2 text bypass); (2) HailoRT/TAPPAS toolchain on Pi 5 (new, for System 1 activation); (3) open-vocabulary detector compilation to Hailo format (new, unresolved — NanoOWL compatibility unverified).

Cross-citation: Lens 26 Branch 6 asks whether Hailo can run NanoOWL-lite. The answer determines whether Lens 03's dependency list has two or three items.

LENS 08 (Analogy Bridge — Neuroscience Mechanisms) [SESSION 119 UPDATE]:

Lens 08's dual-process observation (fast instinctual detection + slow deliberative reasoning, mapped to rat hippocampus + prefrontal cortex) is directly validated by the IROS arXiv 2601.21506 result that session 119 surfaced: System 1 (30+ Hz SegFormer/YOLO) + System 2 (1-5 Hz VLM) yields a 66% latency reduction and 67.5% success vs 5.83% for VLM-only on indoor robot navigation. The Lens 08 neuroscience analogy is no longer speculative — it is a peer-reviewed architectural pattern with measured performance advantages.

Cross-citation: Lens 08 should upgrade its "fast-slow split" recommendation from "inspired by hippocampus" to "validated by IROS 2601.21506" and add the Hailo-8 as the concrete implementation substrate for System 1.

LENS 24 (Resource Allocation Under Uncertainty) [SESSION 119 CREATES]:

Lens 24 now has an explicit calibration question from session 119: the tuning question ("at what VLM query rate does System 2 gating outperform always-on VLM?") and the layer-ratio question ("what are the optimal relative Hz for L1/L2/L3/L4?") are both resource-allocation questions under uncertainty. IROS gives one answer for their setup; Annie's specific allocation is unmeasured.
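The System 1 / System 2 split described above can be sketched as a gated loop: the fast detector runs on every frame, while the VLM is queried at a configurable rate — the knob a Lens 24 sweep would tune. The 30 Hz / 5 Hz rates echo the source's ranges; the callbacks, the stride-based gating, and the STOP-override policy are illustrative assumptions.

```python
def dual_process_loop(frames, fast_detect, slow_vlm, frame_hz=30, vlm_hz=5):
    """System 1 (fast_detect) runs on every frame; System 2 (slow_vlm) is
    gated to roughly vlm_hz by querying only every (frame_hz // vlm_hz)-th
    frame. Returns per-frame decisions plus how many VLM calls were spent."""
    stride = max(1, frame_hz // vlm_hz)
    decisions, vlm_calls = [], 0
    plan = None
    for i, frame in enumerate(frames):
        if i % stride == 0:          # System 2: slow deliberative update
            plan = slow_vlm(frame)
            vlm_calls += 1
        hazard = fast_detect(frame)  # System 1: every-frame reflex
        decisions.append("STOP" if hazard else plan)
    return decisions, vlm_calls

# Usage with stub callbacks: one second of frames at 30 Hz, VLM gated to 5 Hz.
decisions, calls = dual_process_loop(
    frames=range(30),
    fast_detect=lambda f: f == 7,    # pretend frame 7 shows an obstacle
    slow_vlm=lambda f: "FORWARD",
    frame_hz=30,
    vlm_hz=5,
)
```

Sweeping `vlm_hz` over identical routes and plotting success rate against VLM calls spent is exactly the crossover measurement the Lens 24 calibration question asks for.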
Lens 24 should frame the measurement strategy: sweep L2 rates from 1 Hz to 27 Hz on identical routes while L1 runs at 30+ Hz, measure success rate and p95 decision latency, and fit the crossover point. This is the same kind of sweep Lens 24 uses for GPU memory allocation, applied here to inference-rate allocation across a dual-process stack.

Cross-citation: Lens 26 Branch 6's tuning question is a Lens 24 experimental design problem.

LENS 25 (Meta-Questions / Process vs Design) [SESSION 119 STRENGTHENS]:

Lens 25's "process success vs design success" distinction is exactly the lens session 119 applied. The Hailo-8 activation was a process success (an audit surfaced a pre-existing resource), not a design success (nobody designed Annie around Hailo). Lens 25 should now catalogue process-success patterns and their triggers. The trigger in session 119 was a targeted hardware-audit pass run alongside a literature sweep on dual-process navigation. The meta-question Lens 26 Branch 6 introduces — "what other idle compute is in the household?" — is a Lens 25 process-instrument. It is not a design question; it is a question about the investigation regime.

Cross-citation: Lens 25 and Lens 26 Branch 6 share the same output artifact: an explicit inventory of Panda (active), Titan (active), Beast (idle), Orin NX 16 GB (idle), plus unaudited tiers (phones, laptops, TV SoCs, router NPUs). Lens 25 should maintain this inventory as a durable household-compute registry appendix.

CONVERGENCE SUMMARY
-------------------

The three-branch convergence on "bypass text layer" is the most important finding in Lens 26. It is worth restating precisely:

Branch 1 (task-parallelism): "What if the VLM outputs embeddings instead of text?"
  → Vision encoder alone runs at 71 Hz (14 ms, no 4 ms decode).
  → Enables true task-parallel allocation without interleaving artifacts.

Branch 3 (map transfer): "What if SLAM cells stored embeddings instead of text labels?"
  → Transferable semantic maps (embeddings vs coordinates).
  → Enables the "map as product" business model.

Branch 4 (cross-field): "What if place recognition used raw ViT features?"
  → Text2nav: 74% success with frozen SigLIP embeddings alone (RSS 2025).
  → Connects Annie's architecture to the animal navigation and embodied AI literature.

All three branches make the same architectural recommendation:

- Keep text at Tier 1 (strategic goal interpretation — language IS the interface).
- Bypass text at Tier 2 (tactical frame processing — language is a relay station).

The implementation path (from Lens 03's dependency analysis):

1. Deploy SigLIP 2 ViT-SO400M on Panda (~800 MB VRAM).
2. Profile text-decode latency vs embed-only latency separately in llama-server.
3. Train a 3-layer linear probe on Annie's 6-month labeled frame log.
4. A/B test: embedding path vs text path on identical navigation routes.
5. Decide explicitly on the explainability trade before committing to a text-free Tier 2.

The convergence is not coincidence. It reflects the fact that the text layer was inherited from the model class (Vision-Language Model) rather than designed for the task (geometric navigation). The research created the conditions to ask whether that inheritance is load-bearing or incidental. Lens 26 confirms it is the right question to ask next.

Session 119 widens the convergence. Before committing to a text-free Tier 2, two new prerequisite questions must be answered:

- At what VLM query rate does System 2 gating outperform always-on VLM? (Tuning)
- Can Hailo-8 run open-vocabulary detectors? (Capability)

If the tuning crossover is below 15 Hz, Annie's 54 Hz VLM is over-budget and the dual-process split is the first-order architectural move — ahead of text bypass. If Hailo supports open-vocabulary detection, L1 absorbs part of the goal-tracking load that currently sits in Tier 2, changing what Tier 2 needs to be and therefore what its right representation is.

The durable output of session 119 is the meta-instrument: "what else is idle?"
Apply it on Beast, Orin NX 16 GB, phones, laptops, TV SoCs, and router NPUs. The next invisible resource is waiting for the next targeted audit.

The question horizon is not just about new questions the primary research made askable. It is also about new questions that targeted hardware-inventory passes make askable — questions whose answers depend on resources that were invisible before the audit forced them into view.
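As a closing sketch for step 2 of the implementation path (profiling text-decode vs embed-only latency), the comparison needs only a generic timing harness. The harness is a minimal sketch; the two stand-in workloads are placeholders, not measurements of either inference path.

```python
import time
import statistics

def profile_path(run_inference, n_trials: int = 50) -> dict:
    """Time one inference path over n_trials runs and report median and
    p95 latency in milliseconds — the two numbers the embed-vs-text
    A/B comparison needs."""
    samples = []
    for _ in range(n_trials):
        t0 = time.perf_counter()
        run_inference()
        samples.append((time.perf_counter() - t0) * 1000.0)
    samples.sort()
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[int(0.95 * (len(samples) - 1))],
    }

# Usage: swap the placeholder lambdas for real calls to the embed-only
# extractor and the full text-decode path, run on identical frames.
embed_only = profile_path(lambda: sum(range(1000)))   # placeholder workload
text_decode = profile_path(lambda: sum(range(5000)))  # placeholder workload
```

Reporting p95 rather than only the mean matters here: the WiFi cliff edge (Lens 04) and decode-time variance both live in the tail, which is where navigation latency budgets actually break.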