LENS 04

Sensitivity Surface

"Which knob matters most?"

PARAMETER SENSITIVITY — EFFECT ON NAVIGATION RELIABILITY
WiFi latency (semantic path)         95%  ⚠ CLIFF EDGE
Motor speed (turns)                  90%  — catastrophic
Sonar ESTOP threshold                85%  — binary gate
EMA alpha (smoothing)                70%  — noisy or laggy
llama-server prompt format           60%  — output parsability
SLAM map resolution                  30%  — forgiving
Multi-query cycle count              25%  — wide optimum
VLM rate (above 15 Hz)               20%  — surprisingly flat
WiFi latency (safety, post-Hailo)    15%  ✓ MITIGATED (L1 local)

⚠ = discontinuous cliff edge  |  coral = catastrophic  |  amber = significant  |  green = forgiving

WiFi latency WAS the one knob that could silently kill the system — and it had a cliff edge. Below 30ms the nav loop runs cleanly: VLM inference takes 18ms, command round-trip adds another 15ms, and total loop time stays under 50ms. Between 30ms and 80ms there is meaningful but recoverable degradation — the EMA filter absorbs the jitter, the robot slows slightly, and collisions remain rare. Then at approximately 100ms the system crosses a discontinuity. At 1 m/s, 100ms of WiFi adds 10cm of positional uncertainty per command — roughly half a robot body width. More importantly, three or four stacked latency spikes push the nav loop's total delay past 150ms, which is long enough for a chair leg to appear in the robot's path between when the VLM saw clear space and when the motor command actually fires. Lens 01 identified temporal surplus as this system's primary free resource. WiFi above 100ms does not erode that surplus — it annihilates it. Lens 10's failure pre-mortem named WiFi as the "boring" production failure mode precisely because it looks fine in testing on a clear channel and then causes mysterious incidents when a microwave or neighboring network is active.
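The budget arithmetic above can be sketched in a few lines. The speeds and thresholds come straight from the numbers in this section; the function names are illustrative, not Annie's actual nav code:

```python
ROBOT_SPEED_M_S = 1.0      # cruise speed cited above

def positional_uncertainty_m(wifi_latency_ms: float,
                             speed_m_s: float = ROBOT_SPEED_M_S) -> float:
    """Distance the robot covers while a command is in flight."""
    return speed_m_s * (wifi_latency_ms / 1000.0)

def latency_regime(wifi_latency_ms: float) -> str:
    """Classify WiFi latency against the cliff described in the text."""
    if wifi_latency_ms >= 100:
        return "cliff"      # an obstacle can appear between perception and actuation
    if wifi_latency_ms > 30:
        return "degraded"   # EMA absorbs jitter, robot slows, collisions stay rare
    return "clean"          # total loop time stays under 50ms

# 100ms at 1 m/s: 10cm of uncertainty, roughly half a robot body width
```

The point of the sketch is the shape, not the exact thresholds: the regime function is a step, not a slope, which is what makes this parameter a cliff rather than a gradient.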

The cliff edge has now been split in two by a discovery from Lens 25 (idle hardware). Annie's Pi 5 carries a Hailo-8 AI HAT+ — a 26 TOPS neural accelerator that has been sitting unused for navigation. Activating it gives the safety layer a WiFi-independent path: YOLOv8n runs locally at 430 FPS with <10ms latency, producing pixel-precise obstacle bounding boxes without a single packet traversing the network. The IROS paper at arXiv 2601.21506 validates this split experimentally for indoor robot nav — a fast local System 1 paired with a slow remote System 2 cuts end-to-end latency by 66% and lifts task success from 5.83% (VLM-only) to 67.5% (dual-process). With Hailo-8 active, obstacle avoidance no longer depends on WiFi at all, so the bar for the safety path drops from 95% cliff-edge coral to 15% green — a forgiving parameter instead of a catastrophic one. The cliff edge still exists, but only for the semantic path: "where is the kitchen?", "what room is this?", "is the path blocked by a glass door?" — queries that require open-vocabulary VLM reasoning on Panda. Those will always traverse WiFi, but they are never the thing that lets a chair leg hit the chassis. The knob that could kill the robot has been converted into a knob that can merely slow its higher cognition. This is a qualitative change in the failure surface.
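A minimal sketch of that dual-process split, assuming a local detector callable (standing in for the Hailo-8 YOLOv8n path) and a remote VLM callable (standing in for Panda). Both callables and the box schema are placeholders, not real APIs:

```python
from typing import Callable, Optional

def dual_process_step(frame,
                      local_detector: Callable,   # System 1: on-Pi, WiFi-independent
                      remote_vlm: Callable,       # System 2: on Panda, WiFi-bound
                      semantic_query: Optional[str] = None) -> dict:
    """Safety always resolves locally; semantics ride on top only when requested."""
    boxes = local_detector(frame)                 # <10ms local inference, no packets
    if any(box.get("danger") for box in boxes):
        return {"action": "ESTOP", "answer": None}   # a WiFi glitch cannot block this
    answer = remote_vlm(frame, semantic_query) if semantic_query else None
    return {"action": "CONTINUE", "answer": answer}
```

The design choice worth noticing: the remote call sits strictly after the local safety gate, so a dead network can delay an answer to "what room is this?" but can never delay an ESTOP.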

Motor speed for turns is the second catastrophic parameter. The system already has a concrete data point: at motor speed 30, a 5° turn request produces 37° of actual rotation — a 640% overshoot driven by momentum that the IMU reads only after the motion has completed. This is not a smooth gradient. Below a certain threshold of angular momentum the robot stops where commanded; above it, the momentum carries the chassis far past the target before the motor loop can intervene. The transition between these regimes is sharp enough that even a 5% increase in motor speed can flip a precise trim maneuver into a full spin. Homing and approach sequences that rely on small corrective turns are particularly vulnerable because they begin with a large accumulated error and then apply a correction that itself overshoots — producing oscillation. The fix is mechanical (coast prediction or pre-brake) but until it lands, motor speed for turn commands must be treated as a first-class production hazard on par with WiFi latency.
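A toy coast-prediction model built from the single data point in the text (speed 30, 5° requested, 37° actual) makes the hazard concrete. This is an illustration of the failure geometry under a crude constant-coast assumption, not the planned fix:

```python
# Single measured data point from the text: speed 30, 5° requested, 37° actual.
REQUESTED_DEG = 5.0
ACTUAL_DEG = 37.0
COAST_DEG = ACTUAL_DEG - REQUESTED_DEG       # 32° carried by momentum alone

def precompensated_turn_deg(target_deg: float) -> float:
    """Command less than the target so momentum coasts the chassis onto it."""
    return max(target_deg - COAST_DEG, 0.0)  # cannot command a negative turn

# At speed 30 no positive command can produce a 5° trim: the coast alone
# exceeds the target, which is why small corrections oscillate instead of converging.
```

The zero result for a 5° trim is the whole point: at this speed the controllable regime simply does not contain small turns, so the fix has to change the speed or the braking, not the command.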

EMA alpha and prompt format sit in the medium band — important but non-catastrophic. The smoothing constant alpha=0.3 was chosen because it filters single-frame VLM hallucinations (which happen roughly once every 20–30 frames on cluttered scenes) without introducing more than ~100ms of effective lag. Tuning alpha upward toward 0.7 eliminates hallucinations but makes the robot slow to respond to a genuine doorway appearing in frame — a 300ms effective lag at 58Hz. Tuning it downward toward 0.1 lets every flicker through. This is a U-shaped optimum with a clear best region rather than a cliff edge: it degrades gradually in both directions. Prompt format for llama-server is similarly forgiving in that small phrasing changes leave output parsability intact, but wholesale changes to the token structure (e.g., asking for a JSON object instead of two bare tokens) reliably break the 3-strategy parser and must be tested end-to-end before deployment.
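The tradeoff shows up directly in a minimal EMA implementation. Note the convention used in this section: alpha is the weight on the previous estimate, so higher alpha means smoother but laggier:

```python
class EmaFilter:
    """Exponential moving average over per-frame VLM scores."""

    def __init__(self, alpha: float = 0.3):
        self.alpha = alpha        # weight on the PREVIOUS estimate, per the text
        self.value = None

    def update(self, sample: float) -> float:
        if self.value is None:
            self.value = sample   # seed with the first observation
        else:
            self.value = self.alpha * self.value + (1 - self.alpha) * sample
        return self.value
```

A single-frame spike of 1.0 after a run of 0.0s comes through as 0.9 at alpha=0.1 but only 0.3 at alpha=0.7: the noisy-or-laggy tradeoff in one line.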

The most surprising finding is how insensitive VLM frame rate is above 15 Hz. At 1 m/s, two consecutive frames captured 1/15th of a second apart differ by only 6.7cm of robot travel. The VLM's single-token output — LEFT, CENTER, or RIGHT — is essentially identical between those frames unless the robot is in the act of passing a doorway or rounding a tight corner, events that last 300–500ms even at full speed. This means the multi-query pipeline's value is not speed: it is diversity. Spending alternate frames on scene classification, obstacle description, and path assessment at 15Hz each costs nothing in nav responsiveness (goal-tracking still gets 29Hz) while tripling the semantic richness of each nav cycle. The cycle count between query types (currently a modulus-6 rotation) has a similarly wide optimum — shifting it to modulus-4 or modulus-8 produces no measurable change in output quality. Once above the 15Hz floor per task, the system is rate-insensitive. Below it, temporal consistency breaks down and the EMA filter introduces lag that exceeds one turn's worth of motor momentum.
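One way the modulus-6 rotation could look, with half of all frames reserved for goal-tracking and the other half cycled through the three semantic queries. The query names and slot layout are assumptions, not the actual pipeline:

```python
SEMANTIC_QUERIES = ["scene_classification", "obstacle_description", "path_assessment"]

def query_for_frame(frame_idx: int, modulus: int = 6) -> str:
    """Even slots run goal-tracking; odd slots rotate the semantic queries."""
    slot = frame_idx % modulus
    if slot % 2 == 0:
        return "goal_tracking"    # half of all frames: nav stays responsive
    return SEMANTIC_QUERIES[(slot // 2) % len(SEMANTIC_QUERIES)]
```

Over any six consecutive frames this yields three goal-tracking slots and one slot for each semantic query, which is the diversity-over-speed allocation described above.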

WiFi has two sensitivities now, not one. The cliff edge is gone from the safety path — activating the idle Hailo-8 (26 TOPS, YOLOv8n @ 430 FPS, <10ms local) gives obstacle detection a WiFi-independent route. Coral bar becomes green. The cliff survives only on the semantic path, where VLM queries on Panda still depend on the network.

The dual-process split is research-validated. IROS arXiv 2601.21506: fast local System 1 + slow remote System 2 = 66% latency reduction and 67.5% success vs 5.83% for VLM-only. Annie's Pi + Panda topology maps onto this pattern without hardware changes.

VLM frame rate above 15Hz is surprisingly insensitive. At 1m/s, frames 1/15s apart differ by 6.7cm — the robot is rarely in a different decision state. The multi-query pipeline extracts value through diversity of questions, not raw speed.

Motor speed for small turns is the second cliff edge. Speed 30 turns a 5° request into a 37° actuation. The transition from controllable to oscillating is sharp, not gradual.

Now that the WiFi cliff has been split into a safety path (mitigable via Hailo-8) and a semantic path (still WiFi-bound), which one would you harden first — and what does that choice reveal about what kind of robot you are actually building?


Activate Hailo-8 first. It removes the only failure mode where a WiFi glitch can cause a physical collision, and it costs nothing in new hardware — the 26 TOPS chip is already on the Pi, waiting. After that, the remaining WiFi sensitivity (semantic queries) stops being a safety issue and becomes a latency/UX issue: Annie might pause before answering "what room is this?", but she will not hit the chair leg. The choice reveals the real architecture: Annie is a dual-process robot, not a monolithic one. System 1 (reflexes) belongs on the Pi, local and deterministic. System 2 (reasoning) belongs on Panda, remote and semantic. Fixing the WiFi channel itself (dedicated 5GHz or wired Ethernet) is still worth doing, but it becomes an optimization — not a safety prerequisite.