# Research: Per-Device Bandwidth Monitoring in Home Networks

**Created:** 2026-04-01 (Session 383)
**Status:** COMPLETE — actionable recommendations for our exact setup
**Context:** ASUS RT-AX86U Pro -> TP-Link Deco X10 -> Titan/Beast/Panda Linux servers

---

## Executive Summary

**The core problem:** Our three Linux servers (Titan, Beast, Panda) sit behind a TP-Link Deco mesh operating as a second router (double-NAT), so the ASUS router sees them as a single device ("decoMeshX10"). The ASUS router's per-device bandwidth reporting is therefore blind to the individual servers.

**The solution is a 3-layer approach:**

| Layer | What it monitors | Tool | Effort |
|-------|-----------------|------|--------|
| 1. Server-side | Each server's own traffic | **vnstat** daemon on each server | 10 min setup |
| 2. ASUS router | WAN total + Deco aggregate | **asusrouter** library (already validated) | Already planned |
| 3. Deco API | Per-device behind Deco (optional) | **tplinkrouterc6u** or Deco AP mode | Medium effort |

**Recommended path:** Layer 1 (vnstat) + Layer 2 (asusrouter) cover 95% of needs. Layer 3 is optional and has two sub-options: (a) switch Deco to AP mode so all devices are on the ASUS subnet, or (b) query the Deco's own API via `tplinkrouterc6u`.

---

## 1. Home Network Monitoring Tools — What Self-Hosters Actually Use

### 1.1 ntopng (The Gold Standard for Deep Visibility)

**What it is:** Web-based network traffic monitoring, GPLv3. The successor to ntop, actively maintained.

**How it works:** Captures traffic via libpcap or PF_RING on a network interface. Needs to see traffic pass through the interface it monitors — either via port mirroring (SPAN), a network TAP, or running on the gateway itself.

**Key capabilities:**
- Per-device IP, MAC, traffic breakdown, throughput, historical data
- Protocol-level DPI (Deep Packet Inspection) via nDPI
- Active Hosts widget showing all devices with traffic volume
- Docker deployment: `docker run --net=host ntop/ntopng:latest -i eth0`
- Community Edition is free, sufficient for home use

**Practical reality for our setup:** ntopng running on Titan would only see Titan's own traffic (it's not the gateway). To see all network traffic, you need one of:
- Port mirroring from a managed switch (we don't have one between ASUS and Deco)
- Running ntopng on the gateway router itself (requires Merlin + Entware, rejected)
- NetFlow export from the router (ASUS stock firmware doesn't support this)

**Verdict for us:** OVERKILL. ntopng is excellent for networks where you control the gateway or have managed switches. Our setup doesn't have the right visibility point for ntopng to see all traffic. Use vnstat instead.

**Sources:**
- [ntopng self-hosting guide (XDA)](https://www.xda-developers.com/ntopng-guide/)
- [ntopng homelab setup (VirtualizationHowto)](https://www.virtualizationhowto.com/2025/11/see-everything-on-your-home-lab-network-with-ntopng/)
- [ntopng GitHub](https://github.com/ntop/ntopng)
- [Mirror/SPAN/TAP monitoring with ntopng](https://www.ntop.org/guides/ntopng/use_cases/mirror_tap_monitoring.html)

### 1.2 vnstat (Lightweight, Perfect for Server-Side)

**What it is:** Console-based network traffic monitor for Linux/BSD. Reads counters from `/proc/net/dev` (kernel-level), so zero packet sniffing overhead.

**How it works:**
- `vnstatd` daemon runs in the background, polling interface counters (every 30 seconds by default) and writing 5-minute resolution data to disk
- Stores in SQLite database (~100-200 KB per interface)
- Provides 5-minute, hourly, daily, monthly, and yearly summaries
- JSON output via `vnstat --json` for programmatic access
- Runs without root — just needs read access to `/proc/net/dev`

**Key strengths:**
- Negligible CPU/memory: reads kernel counters, doesn't sniff packets
- Per-interface tracking (eth0, wlan0, docker0, etc.)
- Built-in data retention (all configurable): 48 h at 5-min granularity, 4 days hourly, 2 months daily, 25 months monthly, yearly kept forever
- JSON output makes it trivially scriptable

**Installation:**
```bash
# Ubuntu/Debian
sudo apt install vnstat

# Start daemon
sudo systemctl enable vnstat
sudo systemctl start vnstat

# Add an interface manually if not auto-detected
sudo vnstat --add -i eth0

# Query
vnstat                    # Summary
vnstat -h                 # Hourly
vnstat -d                 # Daily
vnstat --json             # Full JSON dump
vnstat --json d           # Daily in JSON
```

**Verdict for us:** ADOPT immediately. Install on Titan, Beast, and Panda. Each server tracks its own traffic with zero overhead. Annie can query `vnstat --json` via subprocess to report per-server bandwidth.

**Sources:**
- [vnstat official site](https://humdi.net/vnstat/)
- [vnstat GitHub](https://github.com/vergoh/vnstat)
- [vnstat on Ubuntu (OneUpTime)](https://oneuptime.com/blog/post/2026-03-02-use-vnstat-network-traffic-accounting-ubuntu/view)

### 1.3 Prometheus + node_exporter (For Full Observability Stack)

**What it is:** Prometheus scrapes metrics from node_exporter agents running on each server. Grafana visualizes.

**Network metrics from node_exporter:**
- `node_network_receive_bytes_total` / `node_network_transmit_bytes_total` — cumulative byte counters per interface
- `node_network_receive_packets_total` / `node_network_transmit_packets_total`
- `node_network_receive_errs_total` / `node_network_transmit_drop_total`
- Rate calculation: `rate(node_network_transmit_bytes_total[5m]) * 8` = bits/sec

**Jeff Geerling's ASUS RT-AX86U setup:** He monitors his RT-AX86U with node_exporter running on the router (via Merlin + Entware) and Prometheus scraping it. Uses internet-pi playbook for speed tests. Full Grafana dashboards.

**Verdict for us:** TOO HEAVY for just bandwidth monitoring. Prometheus + Grafana + node_exporter is a full observability stack. vnstat gives us the same bandwidth data (from the same `/proc/net/dev` source) with 1/100th the complexity. Consider only if we later want CPU/RAM/disk monitoring across all servers — but that's a separate project.

**Sources:**
- [Jeff Geerling's ASUS RT-AX86U monitoring](https://www.jeffgeerling.com/blog/2022/monitoring-my-asus-rt-ax86u-router-prometheus-and-grafana/)
- [node_exporter network metrics](https://www.robustperception.io/network-interface-metrics-from-the-node-exporter/)
- [Homelab monitoring with Prometheus](https://excalibursheath.com/guide/2025/10/12/visibility-homelab-monitoring-logging.html)
- [internet-pi (GitHub)](https://github.com/geerlingguy/internet-pi)

### 1.4 Pi-hole + DNS Monitoring

**What it is:** DNS sinkhole that blocks ads. Side benefit: logs every DNS query per device.

**Prometheus integration:** `pihole-exporter` (GitHub: eko/pihole-exporter) exports metrics to Prometheus — total queries, blocked queries, query types, per-client stats.

**What it DOESN'T do:** DNS queries don't tell you bandwidth. A device making 1 DNS query could be streaming 10 GB or loading a 1 KB page. Pi-hole tells you *which hosts* your devices are talking to, not *how much data* they're moving.

**Verdict for us:** IRRELEVANT to bandwidth monitoring. Useful for ad blocking and DNS visibility, but doesn't solve the per-device bandwidth problem.

**Sources:**
- [Pi-hole monitoring with Prometheus (Sysdig)](https://www.sysdig.com/blog/monitoring-pihole-prometheus)
- [pihole-exporter GitHub](https://github.com/eko/pihole-exporter)

### 1.5 darkstat / bandwidthd / nload

**darkstat:** Lightweight web-based traffic analyzer. Captures packets on an interface, shows per-host traffic. Last meaningful update: 2014. Still works but no active development.

**bandwidthd:** Creates HTML graphs of per-IP bandwidth usage by sniffing traffic. Last update: 2013. Abandoned.

**nload:** Real-time console bandwidth display for a single interface. Shows incoming/outgoing in a nice ncurses graph. No per-device breakdown, no historical data.

**Verdict for us:** ALL SKIP. Abandoned projects or wrong tool for the job. vnstat is strictly better for our use case.

**Sources:**
- [darkstat alternatives (AlternativeTo)](https://alternativeto.net/software/darkstat/)
- [bandwidthd alternatives (SaaSHub)](https://www.saashub.com/bandwidthd-alternatives)

### 1.6 nethogs (Per-Process Bandwidth)

**What it is:** Shows which Linux process is using how much bandwidth in real-time. Groups traffic by PID, not by IP/device.

**How it works:** Uses raw socket capture + `/proc` to map connections to processes. Requires root or the CAP_NET_ADMIN and CAP_NET_RAW capabilities.

**Example output:**
```
PID   USER     PROGRAM                     DEV        SENT      RECEIVED
1234  rajesh   python3 (annie-voice)       eth0       2.3 KB/s  45.2 KB/s
5678  root     ollama                      eth0      12.1 KB/s   8.4 KB/s
```

**Verdict for us:** USEFUL for debugging but not for monitoring. When Annie reports "Titan used 50 GB today," nethogs can tell you "30 GB was Ollama model downloads." Install on demand, don't run as a daemon.

**Sources:**
- [nethogs guide (TecMint)](https://www.tecmint.com/nethogs-monitor-per-process-network-bandwidth-usage-in-real-time/)
- [nethogs per-process usage (OneUpTime)](https://oneuptime.com/blog/post/2026-03-02-how-to-use-nethogs-for-per-process-network-usage-on-ubuntu/view)

---

## 2. ASUS Router Specific — Community Approaches

### 2.1 Home Assistant AsusRouter Integration (ha-asusrouter)

**What it exposes:**
- Connected devices (MAC, IP, name, connection type, RSSI, online status)
- CPU usage (per-core)
- RAM usage
- Temperature
- WAN status
- Network throughput (aggregate, not per-device)
- LED/Aura control

**Per-device bandwidth:** NO. The integration uses the same `asusrouter` library we're using, which provides `AsusData.NETWORK` (aggregate throughput) and `AsusData.CLIENTS` (device list without bandwidth). The HA integration does NOT add per-device bandwidth on top.

**Source:** [ha-asusrouter GitHub](https://github.com/Vaskivskyi/ha-asusrouter)

### 2.2 Traffic Analyzer Internal Storage (bwdpi)

**Where data lives on the router:**
- `/jffs/.sys/TrafficAnalyzer/TrafficAnalyzer.db` — SQLite database with per-device, per-app traffic data
- `/tmp/bwdpi/` — runtime files: `bwdpi.bndwth.db`, `bwdpi.app.db`, `bwdpi.cat.db`, `bwdpi.devdb.db`
- Database size limit: 14 MB (JFFS partition constraint)
- Data granularity: hourly buckets
- Retention: limited by JFFS space, typically a few weeks to a month

**Data format:** The Traffic Analyzer stores per-MAC bandwidth with app-level classification (streaming, web, gaming, etc.) using the Trend Micro DPI engine. The database schema includes `traffic` table with MAC address, service name, and byte counts.

**Access methods:**
1. **Web GUI scraping** (what we're doing): GET `/TrafficAnalyzer_Statistic.asp` and parse JavaScript variables (`top5_client_array`, `total_clients_array`, etc.)
2. **SSH + sqlite3** (requires Merlin): `sqlite3 /jffs/.sys/TrafficAnalyzer/TrafficAnalyzer.db "SELECT * FROM traffic"`
3. **bwdpi files** (requires SSH): Read `/tmp/bwdpi/bwdpi.bndwth.db` directly
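Access method 1 reduces to extracting JavaScript array literals from the page source. A minimal parsing sketch follows; the variable name `top5_client_array` comes from the list above, but the exact quoting and layout are firmware-dependent, so treat the regex and the single-quote normalization as assumptions to verify against the actual page:

```python
import json
import re

def parse_asp_array(html: str, var_name: str):
    """Extract a JavaScript array literal assigned to `var_name`
    (e.g. top5_client_array) from the Traffic Analyzer page source.
    Assumes single-quoted JS string literals with no embedded quotes."""
    m = re.search(rf"{var_name}\s*=\s*(\[.*?\])\s*;", html, re.DOTALL)
    if m is None:
        return None
    # Normalize the JS literal to JSON before parsing.
    return json.loads(m.group(1).replace("'", '"'))

# Fabricated page fragment for illustration only:
fragment = "var top5_client_array = [['AA:BB:CC:DD:EE:FF', 'Titan', 123456]];"
top5 = parse_asp_array(fragment, "top5_client_array")
```

Fetching the page itself still requires an authenticated session against the router's web GUI, which is a separate (and equally firmware-dependent) step.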

**Community projects for extracting this data:**

| Project | GitHub | Method | Notes |
|---------|--------|--------|-------|
| AsusTrafficData | [henley-regatta/AsusTrafficData](https://github.com/henley-regatta/AsusTrafficData) | Reads JFFS files, exports to InfluxDB | Requires SSH/Merlin. Cron every hour. Grafana visualization. |
| extstats | [corgan2222/extstats](https://github.com/corgan2222/extstats) | Shell scripts on router, export to InfluxDB | Merlin + Entware. Per-client WiFi info, traffic analysis, app classification. |
| AsusRouterMonitor | [lmeulen/AsusRouterMonitor](https://github.com/lmeulen/AsusRouterMonitor) | Python, HTTP API calls | Blog post on ITNEXT. Uses CGI endpoints remotely. |
| asus-traffic-monitor | [SIZMW/asus-traffic-monitor-data-retriever](https://github.com/SIZMW/asus-traffic-monitor-data-retriever) | Retrieves daily traffic data | Exports to Excel. |
| bwmon | [VREMSoftwareDevelopment/bwmon](https://github.com/VREMSoftwareDevelopment/bwmon) | Bandwidth Monitor for AsusWRT-Merlin | Merlin-only. |
| grafana-router-asus | [pheetr/grafana-router-asus](https://github.com/pheetr/grafana-router-asus) | ASUS metrics in Grafana | Dashboard templates. |

**Key insight from the guided-naafi.org blog:** The author schedules data extraction via cron to run once an hour because "TrafficAnalyzer only updates the on-disk database at this interval." This confirms the hourly granularity we discovered in Session 383.

**Sources:**
- [AsusTrafficData blog post](https://www.guided-naafi.org/systemsmanagement/2021/06/14/WhyUseAsusWhenYouCanWriteYourOwn.html)
- [SNBForums: Enhanced traffic capture](https://www.snbforums.com/threads/enhanced-tools-to-capture-per-device-asus-traffic-analyzer-statistics-longer-term.91003/)
- [SNBForums: Traffic Analyzer data storage](https://www.snbforums.com/threads/how-do-you-save-traffic-analyzer-data-on-asus-rt-ac68u.46399/)
- [Merlin bwdpi_sqlite.h source](https://github.com/RMerl/asuswrt-merlin.ng/blob/master/release/src/router/bwdpi_source/asus_include/bwdpi_sqlite.h)

### 2.3 Merlin Firmware Users — What They Use

**Common Merlin monitoring stack:**
1. **Entware** (package manager on USB stick) provides apt-like package installation
2. **vnstat-on-Merlin** ([de-vnull/vnstat-on-merlin](https://github.com/de-vnull/vnstat-on-merlin)): Tracks per-interface traffic with UI, CLI, and email reports. Requires Diversion + Entware.
3. **SNMP** on the router: Provides some OIDs (load, CPU time) but NOT per-device bandwidth. SNMP shows total traffic per ethernet adapter — individual devices behind that adapter are invisible.
4. **YAMon** ([usage-monitoring.com](https://usage-monitoring.com/)): Per-user, per-device monitoring using iptables byte counters. Works on DD-WRT, OpenWRT, and AsusWRT. Creates iptables rules per IP address to track bandwidth.
5. **node_exporter** on the router (Jeff Geerling style): Exposes system metrics to Prometheus.

**YAMon is the most relevant for per-device:** It creates iptables rules for each IP on the network and reads the byte counters. This is the classic "router as accounting point" approach. But it requires running software ON the router (Merlin + Entware), which Rajesh rejected.

**SNMP limitation confirmed:** Even with SNMP enabled on the router, it shows total network traffic per ethernet adapter. Individual devices connected to that adapter cannot be seen through SNMP. Per-device requires either DPI (Traffic Analyzer) or iptables rules (YAMon).

**Sources:**
- [vnstat-on-Merlin (SNBForums)](https://www.snbforums.com/threads/release-vnstat-on-merlin-ui-cli-and-email-data-use-and-data-limit-monitoring-r1-and-r2.71523/)
- [SNMP monitoring on Merlin (Home Assistant Community)](https://community.home-assistant.io/t/advanced-snmp-monitoring-part-one-asuswrt-routers-merlin-build/68984)
- [YAMon (usage-monitoring.com)](https://usage-monitoring.com/)
- [YAMon on OpenWrt Forum](https://forum.openwrt.org/t/yamon-per-user-per-device-usage-monitoring/7866)

### 2.4 Stock Firmware Users — What's Actually Available

**Without Merlin, stock firmware users have these options:**

1. **Traffic Analyzer GUI** — built-in, but limited to top 5 devices, short retention, no export
2. **Web scraping** `/TrafficAnalyzer_Statistic.asp` — what we're doing. Parse JavaScript variables. Works remotely without SSH.
3. **`asusrouter` library** — for system metrics (CPU, RAM, temp, WAN, clients). Does NOT provide per-device bandwidth.
4. **`appGet.cgi` hooks** — `get_clientlist()` gives device list with WiFi link speed (curTx/curRx, NOT actual throughput). `netdev()` gives aggregate byte counters (BRIDGE, INTERNET, WIRED, WIRELESS).

**There is no clean JSON API for per-device bandwidth on stock ASUS firmware.** The only source is the Traffic Analyzer ASP page (HTML scraping) or the bwdpi database (requires SSH).
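As an illustration of handling the `netdev()` hook's output: community reverse engineering commonly reports it as a JSON object of hex-string counters keyed by names like `INTERNET_rx`. This parser is a sketch under that assumption (the key names and hex format should be verified against the actual firmware response):

```python
import json

def parse_netdev(payload: str) -> dict:
    """Convert an appGet.cgi netdev() response into integer byte counters.
    The hex-string value format and key names here are assumptions based
    on community reverse engineering; verify against your firmware."""
    raw = json.loads(payload)["netdev"]
    return {key: int(value, 16) for key, value in raw.items()}

# Fabricated response for illustration only:
sample = '{"netdev": {"INTERNET_rx": "0x1A2B3C", "INTERNET_tx": "0xFF00"}}'
counters = parse_netdev(sample)
```

As with the Traffic Analyzer page, calling `appGet.cgi` requires an authenticated session first.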

---

## 3. Double-NAT / Mesh Visibility Problem

### 3.1 The Core Problem

```
ASUS RT-AX86U Pro (192.168.50.x)
  └── TP-Link Deco X10 (192.168.50.160) ← ASUS sees this as ONE device
        ├── Titan  (192.168.68.52)  ← invisible to ASUS
        ├── Beast  (192.168.68.58)  ← invisible to ASUS
        ├── Panda  (192.168.68.57)  ← invisible to ASUS
        └── Laptop (192.168.68.56)  ← invisible to ASUS
```

The ASUS router's Traffic Analyzer shows bandwidth for "decoMeshX10" as a single aggregate. It cannot distinguish Titan from Beast from Panda.

### 3.2 Solution Options

#### Option A: Switch Deco to Access Point Mode (BEST)

**How it works:** In AP mode, the Deco disables its DHCP server, NAT, and routing. It becomes a pure WiFi access point / wired switch. All devices get IP addresses from the ASUS router's DHCP server on the 192.168.50.x subnet.

**Result:**
```
ASUS RT-AX86U Pro (192.168.50.x, DHCP server)
  └── TP-Link Deco X10 (AP mode, bridge)
        ├── Titan  (192.168.50.52)  ← NOW visible to ASUS
        ├── Beast  (192.168.50.58)  ← NOW visible to ASUS
        ├── Panda  (192.168.50.57)  ← NOW visible to ASUS
        └── Laptop (192.168.50.56)  ← NOW visible to ASUS
```

**Advantages:**
- All devices on one subnet — ASUS Traffic Analyzer sees everything individually
- No double-NAT — simpler networking
- Deco still provides WiFi mesh and wired switching
- `asusrouter` library sees all clients in `get_clientlist()`

**Disadvantages:**
- Deco loses some "gateway" features (HomeShield, Monthly Report in Deco app)
- Deco app has reduced functionality
- All IP addresses change (need to update .env files, SSH configs, etc.)
- Need to reconfigure static IPs or DHCP reservations on ASUS

**How to switch:** Deco app -> Settings -> Advanced -> Working Mode -> Access Point Mode. Select "Dynamic IP" for connection type.

**Verdict:** THIS IS THE CLEANEST SOLUTION. Eliminates double-NAT and gives ASUS full per-device visibility. The IP address change is a one-time migration cost.

**Sources:**
- [TP-Link: Set up Deco in AP mode](https://www.tp-link.com/us/support/faq/1842/)
- [TP-Link: Avoid Double NAT](https://www.tp-link.com/us/support/faq/3113/)
- [AP mode vs Router mode](https://www.tp-link.com/us/support/faq/2399/)

#### Option B: Query the Deco's Own API

**Python libraries available:**

1. **tplinkrouterc6u** ([PyPI](https://pypi.org/project/tplinkrouterc6u/)):
   - Supports `TPLinkDecoClient` for Deco devices
   - Can retrieve client list with connection info
   - Per-device bandwidth fields: unclear from documentation, needs testing

2. **ha-tplink-deco** ([GitHub](https://github.com/amosyuen/ha-tplink-deco)):
   - Home Assistant custom component
   - Tracks `down_kilobytes_per_s` and `up_kilobytes_per_s` per connected client
   - Polls Deco admin web UI locally
   - Device trackers for each Deco unit and each connected client
   - These are instantaneous rates, not cumulative — you'd need to sample and integrate

3. **deco (Go)** ([GitHub](https://github.com/MrMarble/deco)):
   - Go wrapper for TP-Link M4 API
   - Provides `ClientList()` with `DownSpeed`, `UpSpeed` fields
   - Performance metrics (CPU, memory)
   - Not Python, but demonstrates the API structure

4. **Deco API Gist** ([GitHub Gist](https://gist.github.com/rosmo/29200c1aedb991ce55942c4ae8b54edd)):
   - TP-Link X90 Deco API example
   - Shows JSON-RPC protocol used by Deco devices
   - Autokey XOR "encryption" on the API protocol

**TP-Link official position:** "There is no public API available for Deco." Everything above is reverse-engineered.

**Key concern:** The `down_kilobytes_per_s` / `up_kilobytes_per_s` from Deco are INSTANTANEOUS rates, not cumulative counters. To get total bandwidth, you must sample frequently (e.g., every 30s) and multiply rate * interval. This introduces significant measurement error for bursty traffic.
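The rate-times-interval integration described above can be sketched with the trapezoidal rule; the `(timestamp, KB/s)` sample shape is our assumption about how a poller would store the Deco's instantaneous readings:

```python
def integrate_rate_samples(samples) -> float:
    """Approximate total bytes from (timestamp_sec, kilobytes_per_sec)
    samples using the trapezoidal rule. Bursts that start and finish
    between two samples are invisible, which is exactly the measurement
    error discussed above."""
    total_kb = 0.0
    for (t0, r0), (t1, r1) in zip(samples, samples[1:]):
        total_kb += (r0 + r1) / 2 * (t1 - t0)
    return total_kb * 1024  # bytes

# 100 KB/s held for 30 s, then ramping down to 0 over the next 30 s:
samples = [(0, 100.0), (30, 100.0), (60, 0.0)]
total_bytes = integrate_rate_samples(samples)
```

Shorter sampling intervals shrink the error but raise the polling load on the Deco, which is the trade-off that makes cumulative counters preferable.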

**Verdict:** POSSIBLE but fragile. Reverse-engineered APIs break across firmware updates. Instantaneous rate sampling is less accurate than cumulative counters. Prefer Option A if feasible.

**Sources:**
- [tplinkrouterc6u (PyPI)](https://pypi.org/project/tplinkrouterc6u/)
- [ha-tplink-deco (GitHub)](https://github.com/amosyuen/ha-tplink-deco)
- [MrMarble/deco Go wrapper](https://github.com/MrMarble/deco)
- [TP-Link X90 API Gist](https://gist.github.com/rosmo/29200c1aedb991ce55942c4ae8b54edd)

#### Option C: Port Mirroring / Network TAP

**Port mirroring (SPAN):** Configure a managed switch to copy all traffic from one port to another where a monitoring server is connected. The ASUS RT-AX86U Pro does NOT support port mirroring on its LAN ports (consumer router, not a managed switch).

**Network TAP:** Physical device that sits inline on an Ethernet cable and passively copies traffic to a monitoring port. Costs $50-200 for a Gigabit TAP.

**Practical for us?** NO. We'd need to place a TAP between the ASUS LAN port and the Deco WAN port, then run a cable from the TAP to a server running ntopng. This is enterprise-grade overkill for a home setup with 3 known Linux servers.

**Sources:**
- [TAP vs SPAN (Gigamon)](https://www.gigamon.com/resources/resource-library/white-paper/to-tap-or-to-span.html)
- [Port mirror vs TAP (ntop)](https://www.ntop.org/pf_ring/port-mirror-vs-network-tap/)

#### Option D: Wire Servers Directly to ASUS LAN Ports

**How it works:** Run Ethernet cables from Titan, Beast, and Panda directly to ASUS LAN ports instead of through the Deco. Keep Deco for WiFi devices only.

**Result:**
- ASUS sees servers as individual wired clients
- Traffic Analyzer shows per-server bandwidth
- Servers on 192.168.50.x subnet, WiFi devices on 192.168.68.x (if Deco stays in router mode) or 192.168.50.x (if Deco in AP mode)

**Practical for us?** DEPENDS ON PHYSICAL LAYOUT. The ASUS router has 4 Gigabit LAN ports. If all 3 servers can be wired to it (long Ethernet runs may be needed), this works. But it defeats the purpose of having a Deco in the first place (centralized switch near the servers).

### 3.3 MAC Randomization

Modern devices (iOS 14+, Android 10+) randomize their WiFi MAC addresses by default. Each network gets a different MAC. This means:
- A device may appear as a "new unknown device" if it changes MAC
- Historical tracking breaks if MAC changes
- Mitigation: Disable MAC randomization on known personal devices, or track by IP + hostname

**For our servers:** Not an issue. Titan/Beast/Panda have fixed Ethernet MACs. Only mobile devices (phones, laptops) are affected.

---

## 4. Linux Server-Side Monitoring (Layer 1)

### 4.1 `/proc/net/dev` — The Foundation

Counter-based Linux monitoring tools (vnstat, node_exporter, and similar) ultimately read from `/proc/net/dev`, which provides cumulative byte and packet counters per network interface:

```
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0: 12345678   98765    0    0    0     0          0        0  87654321   76543    0    0    0     0       0          0
```

**Key properties:**
- Updated by the kernel in real-time
- Cumulative counters (wrap at 2^64 on 64-bit systems)
- Per-interface only, not per-IP or per-process
- Zero overhead to read

### 4.2 vnstat — Recommended Approach

**Why vnstat is the best fit:**
- Reads `/proc/net/dev` counters, no packet sniffing
- Daemon runs continuously, handles counter wraps and reboots
- 5-minute, hourly, daily, monthly, yearly granularity
- JSON output for programmatic access
- Tiny footprint (~1 MB binary, ~200 KB database)

**Deployment plan for our servers:**

```bash
# On each server (Titan, Beast, Panda):
sudo apt install vnstat
sudo systemctl enable vnstat
sudo systemctl start vnstat

# Verify interface is being tracked:
vnstat --iflist
# Should show eth0 (or enp1s0 or similar)

# After 5 minutes, verify data collection:
vnstat -i eth0
```

**Annie integration:**
```python
import subprocess, json

def get_server_bandwidth(server_name: str) -> dict:
    """Get bandwidth stats from a server via SSH + vnstat."""
    result = subprocess.run(
        ["ssh", server_name, "vnstat", "--json", "d"],
        capture_output=True, text=True, timeout=10
    )
    return json.loads(result.stdout)
```

**For Titan (where Annie runs), no SSH needed:**
```python
result = subprocess.run(["vnstat", "--json", "d"], capture_output=True, text=True)
data = json.loads(result.stdout)
# data["interfaces"][0]["traffic"]["day"] contains daily rx/tx bytes
```

### 4.3 nftables/iptables Byte Counters

**How it works:** Create firewall rules that match traffic and count bytes. Each rule has a packet counter and byte counter.

```bash
# nftables example: count traffic per destination IP
nft add table inet accounting
nft add chain inet accounting forward { type filter hook forward priority 0 \; }
nft add rule inet accounting forward ip daddr 192.168.68.52 counter  # Titan
nft add rule inet accounting forward ip daddr 192.168.68.58 counter  # Beast
nft add rule inet accounting forward ip daddr 192.168.68.57 counter  # Panda

# Read counters:
nft list chain inet accounting forward
```

**Advantage over vnstat:** Can track per-IP (not just per-interface). Useful if multiple services share an interface.

**Disadvantage:** Must manage rules manually. Counters reset on reboot unless saved. vnstat handles all this automatically.

**Verdict for us:** UNNECESSARY. Since each server has its own Ethernet interface, vnstat per-interface = per-server. No need for per-IP accounting.

**Sources:**
- [nftables counters wiki](https://wiki.nftables.org/wiki-nftables/index.php/Counters)
- [iptables bandwidth monitoring](https://www.cyberciti.biz/tips/monitor-bandwidth-with-iptables.html)
- [Traffic accounting with iptables](https://catonmat.net/traffic-accounting-with-iptables)

### 4.4 nethogs — Per-Process (On Demand)

**Use case:** When Annie reports "Titan used 80 GB today" and you want to know WHY.

```bash
sudo nethogs eth0
# Shows: PID, USER, PROGRAM, DEV, SENT, RECEIVED in real-time
```

**Not a daemon — run on demand only.** If you want persistent per-process tracking, use `picosnitch` or `ss` + periodic sampling.

### 4.5 Prometheus + node_exporter

**If we ever deploy Prometheus** (currently not planned), the relevant metrics are:
- `node_network_receive_bytes_total{device="eth0"}`
- `node_network_transmit_bytes_total{device="eth0"}`

These come from the same `/proc/net/dev` source as vnstat. The only advantage is integration with Grafana dashboards and alerting. For our Annie-centric architecture, vnstat JSON output is simpler and sufficient.

---

## 5. Best Practices for Our Exact Setup

### 5.1 What People with Similar Setups Actually Do

**Jeff Geerling (homelab influencer):** ASUS RT-AX86U + node_exporter on Merlin + Prometheus + Grafana + internet-pi for speed tests. Full observability stack. Uses managed switches with SNMP for per-device visibility.

**SNBForums community (ASUS power users):** Merlin firmware + YAMon for per-device iptables accounting, or extstats for InfluxDB export. Some use vnstat-on-Merlin for interface-level tracking.

**Home Assistant community:** `ha-asusrouter` for device tracking + system metrics. `ha-tplink-deco` for Deco device tracking with instantaneous bandwidth rates.

**Reddit r/homelab consensus (2024-2025):**
- For per-device: "Put your monitoring at the gateway" or "use a managed switch with port mirroring"
- For per-server: "vnstat on each box, it's dead simple"
- For whole-network: "ntopng in Docker on a box that can see the traffic"
- For budget: "Don't overthink it. vnstat + a cron script is 90% of what you need."

### 5.2 Recommended Architecture for Our Setup

```
Layer 1: Server-Side (vnstat)
├── Titan:  vnstat daemon → /var/lib/vnstat/eth0 → vnstat --json
├── Beast:  vnstat daemon → /var/lib/vnstat/eth0 → vnstat --json
└── Panda:  vnstat daemon → /var/lib/vnstat/eth0 → vnstat --json

Layer 2: ASUS Router (asusrouter library, already planned)
├── WAN total bandwidth (netdev counters, sampled every 5 min)
├── Client list (5 devices on ASUS subnet)
├── System metrics (CPU, RAM, temp)
├── Traffic Analyzer scraping (per-device on ASUS subnet, hourly)
└── Speed tests (Ookla or speedtest-cli, every 2 hours)

Layer 3: Deco Visibility (choose one)
├── Option A: Switch Deco to AP mode (RECOMMENDED)
│   └── All devices on 192.168.50.x → ASUS sees everything
└── Option B: Query Deco API (ALTERNATIVE)
    └── tplinkrouterc6u or ha-tplink-deco → per-device instantaneous rates
```

### 5.3 Implementation Priority

| Priority | Action | Effort | Impact |
|----------|--------|--------|--------|
| P0 | Install vnstat on Titan, Beast, Panda | 10 min | Per-server daily/monthly bandwidth |
| P1 | Build router_collector.py (already planned, Session 382) | 2-3 hours | WAN monitoring, speed tests, alerts |
| P2 | Add `vnstat --json` queries to Annie's `network_status` tool | 30 min | "How much bandwidth did Titan use today?" |
| P3 | Switch Deco to AP mode OR test Deco API | 1-2 hours | Per-device visibility for servers |
| P4 | Add Traffic Analyzer scraping to collector | 1-2 hours | Per-device on ASUS subnet (hourly) |

---

## 6. TP-Link Deco API — Deep Dive

### 6.1 Available Python Libraries

| Library | Target | Per-Device BW | Auth | Notes |
|---------|--------|---------------|------|-------|
| `tplinkrouterc6u` | Multiple TP-Link routers + Deco | Unclear (needs testing) | Web admin login | Active development, HA integration |
| `ha-tplink-deco` | Deco only (HA component) | Yes (`down_kilobytes_per_s`, `up_kilobytes_per_s`) | Web admin login | Polls local web UI |
| `deco` (Go) | Deco M4 | Yes (`DownSpeed`, `UpSpeed`) | Web admin login | Go only, not Python |
| `tmpcli` | TP-Link Tether Protocol | Unknown | Tether Protocol | Reverse-engineered, experimental |

### 6.2 Deco API Protocol (Reverse Engineered)

Based on the Go wrapper and the X90 API gist, Deco devices use:
- **Transport:** HTTPS to the Deco's local IP (e.g., 192.168.68.1)
- **Protocol:** JSON-RPC style requests
- **Encryption:** Autokey XOR cipher on the payload (easily reversible)
- **Authentication:** Admin password, session-based

**Available data from Deco API:**
- Client list with MAC, IP, name, connection type
- Per-client download/upload speed (instantaneous KB/s)
- Deco device list (mesh nodes)
- CPU and memory usage of Deco units
- WAN information

### 6.3 AP Mode Impact on Deco API

**When Deco is in AP mode:**
- Gateway features disappear from the Deco app
- HomeShield and Monthly Report features are disabled
- Client tracking may still work (Deco still knows which devices connect to it)
- Per-device bandwidth tracking in the Deco API: UNCLEAR — needs testing

**Important:** If we switch Deco to AP mode, we may LOSE the Deco API's per-device bandwidth data. But we GAIN per-device visibility from the ASUS router instead. This is a net win because the ASUS data (Traffic Analyzer, hourly totals) is more reliable than Deco's instantaneous rate sampling.

### 6.4 Verdict on Deco API

**If Deco stays in Router mode:** Use `tplinkrouterc6u` to query per-device bandwidth from the Deco. Sample every 30-60 seconds, integrate rates over time. Accept ~10-20% measurement error from sampling.

**If Deco switches to AP mode:** Don't bother with Deco API. ASUS sees everything. Simpler, more accurate.

---

## 7. Consolidated Recommendations

### For immediate implementation (this week):

1. **Install vnstat on all 3 servers** — 10 minutes total. Gives per-server daily/monthly bandwidth with zero ongoing effort.

2. **Continue with router_collector.py as planned** — ASUS monitoring via `asusrouter` library for system metrics and aggregate bandwidth.

3. **Add vnstat JSON queries to Annie's network_status tool** — "How much data did Titan use this month?" becomes trivially answerable.

### For medium-term (when convenient):

4. **Switch Deco to AP mode** — Eliminates double-NAT, gives ASUS full per-device visibility. One-time migration: update all static IPs from 192.168.68.x to 192.168.50.x, update .env files and SSH configs.

5. **Add Traffic Analyzer scraping** — Parse `/TrafficAnalyzer_Statistic.asp` for per-device bandwidth data (hourly granularity).

### What NOT to do:

- Do NOT install Prometheus + Grafana + node_exporter just for bandwidth monitoring. vnstat gives the same data at 1/100th the complexity.
- Do NOT install ntopng. It can only see traffic on the interface it monitors — without port mirroring or being the gateway, it adds no value.
- Do NOT flash Merlin firmware. Stock firmware + external monitoring achieves the same goal.
- Do NOT rely on Deco's reverse-engineered API for production monitoring. It's fragile and provides only instantaneous rates.

---

## 8. Open Questions

1. **Deco AP mode and server connectivity:** Will switching Deco to AP mode cause any disruption to AiMesh (R15 node on ASUS)? Need to test.
2. **ASUS DHCP range:** Does the ASUS have enough DHCP range for all devices currently on 192.168.68.x? Need to check DHCP pool settings.
3. **Deco AP mode and its API:** Does `tplinkrouterc6u` still work when Deco is in AP mode? Needs testing.
4. **vnstat on DGX Spark:** Are there any issues with vnstat on the NVIDIA DGX Spark (aarch64)? Likely fine since it just reads `/proc/net/dev`.
