# Research — Adding a Second RTX GPU to Panda

**Date:** 2026-04-14 (session 101)
**Status:** Research complete — blocked on PSU wattage (requires physical check)
**Driver:** User wants to know the highest-end RTX card that can be added to Panda alongside the existing RTX 5070 Ti.

---

## TL;DR

**Uncomfortable answer: the current motherboard (MSI MAG X870 GAMING PLUS WIFI / MS-7E47) is a single-GPU board.** Per SMBIOS slot inventory, there's exactly one x16-physical PCIe slot, already occupied. The only x4-sized chipset slots physically cannot fit a full-length GPU.

**Three paths forward:**
1. **Motherboard swap** → workstation-class board with multiple x16 slots (the clean path)
2. **PCIe riser + open-frame chassis** → keep current board, externalize second GPU (hacky)
3. **M.2-to-PCIe adapter** → use spare M.2 slot, limited to PCIe 5.0 x4 (bandwidth-compromised)

**Before any path**, physically inspect the PSU sticker to record wattage — that's the binding constraint on card selection.

---

## Verified hardware facts (2026-04-14)

From `/home/rajesh/hardware-inventory/report-20260414_135759.txt` (generated via `~/hardware-inventory/collect.sh`):

- **Motherboard:** MSI MAG X870 GAMING PLUS WIFI (MS-7E47) rev 2.0
- **CPU:** Ryzen 9 9900X3D (TDP 120 W; package power up to ~160 W all-core with PBO)
- **Current GPU:** MSI RTX 5070 Ti (300 W cap — locked, cannot exceed)
- **RAM:** 64 GB DDR5 4800 MT/s (2 of 4 slots)
- **NVMe:** WD_BLACK SN8100 2TB (Gen 5 x4)
- **PSU wattage:** **UNKNOWN — dmidecode type 39 empty, sysfs no entries. Requires physical inspection.**
- **Case:** MSI Desktop (exact model not in SMBIOS)

### PCIe slot reality (the wall)

| Slot | Size | Electrical | Status |
|------|------|-----------|--------|
| PCIE1 | x16 | PCIe 5.0 x16 (CPU) | **OCCUPIED** — RTX 5070 Ti |
| J3502 | M.2 | PCIe 5.0 x4 (CPU) | NVMe |
| PCIE3 | **x4** | PCIe x4 | "In use" per SMBIOS — too short for GPU |
| PCIE4 | **x4** | PCIe x4 | "In use" per SMBIOS — too short for GPU |

There is **no second full-size slot**. This is the binding physical constraint.
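The slot inventory can be re-verified anytime from the SMBIOS type 9 records. A minimal sketch (the here-string mimics what `sudo dmidecode -t 9` reports on this board, so the logic is testable offline; the live check is just the grep):

```shell
# Count x16-physical slots in the SMBIOS slot inventory.
# Live check on Panda: sudo dmidecode -t 9 | grep -c 'x16'
# The here-string below mimics the MS-7E47's output format (assumed sample).
smbios='Designation: PCIE1
Type: x16 PCI Express 5
Designation: PCIE3
Type: x4 PCI Express
Designation: PCIE4
Type: x4 PCI Express'
x16_count=$(printf '%s\n' "$smbios" | grep -c 'x16')
echo "full-size x16 slots: $x16_count"
```

One match means one full-size slot, and it's the one the 5070 Ti already occupies.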

---

## Three paths, with honest tradeoffs

### Path 1: Motherboard swap (the clean path)

Replace MS-7E47 with a board that has at least two x16-sized slots. Good candidates on AM5 / Zen 5:

| Board | PCIe layout | Price range | Notes |
|-------|-------------|-------------|-------|
| ASUS Pro WS X870E-ACE | 3× x16 (PCIe 5.0 x8 + 4.0 x8 + 4.0 x4) | $500-600 | Workstation-grade, 10 GbE |
| ASRock X870E Taichi | 2× x16-size (5.0 x16 + 4.0 x4) | $450-550 | Consumer flagship |
| Gigabyte X870E Aorus Master | 2× x16-size | $450-550 | Consumer flagship |
| MSI MEG X870E Godlike | 2× x16-size | $800+ | Overkill for this use |

**Cost:** $450-800 board + labor to rebuild. RAM, CPU, NVMe, PSU, case all reusable. GPU1 moves across unchanged.

**Constraint:** Even on a dual-x16 board, the second slot typically runs at PCIe 5.0 x8 (CPU bifurcation) OR PCIe 4.0 x4 (chipset). Both are fine for inference — negligible perf loss for LLM serving.

### Path 2: PCIe riser + open-frame chassis

Keep the current board. Run a **PCIe 5.0 x16 riser cable** (e.g., LinkUp Gen 5) from PCIE1 to an external GPU mount. Relocate the RTX 5070 Ti to the external mount, install the new (larger) GPU in PCIE1 directly.

**Reality check:** PCIe 5.0 riser cables are a real pain. Signal integrity at 32 GT/s over a flex cable is marginal; most mining-style risers are Gen 4 or Gen 3 at best. Run Gen 5 signalling through an inadequate riser and the link auto-downgrades to Gen 3 (or worse, trains and retrains unstably).
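One way to tell whether a riser is holding the rated link: compare negotiated link status against capability in `lspci -vv`, which annotates a below-capability link with "(downgraded)". A sketch (the `LnkSta` sample line and the bus address `01:00.0` are illustrative assumptions):

```shell
# Flag a downgraded PCIe link. On real hardware:
#   sudo lspci -vv -s 01:00.0 | grep -E 'LnkCap|LnkSta'
lnksta='LnkSta: Speed 8GT/s (downgraded), Width x16 (ok)'   # assumed sample line
case "$lnksta" in
  *downgraded*) echo "riser is NOT holding rated speed" ;;
  *)            echo "link at full rated speed" ;;
esac
```

Run this after every reseat; a marginal riser often trains clean once and degrades after warm-up.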

**Cost:** $80-200 riser cable (Gen 5 rated) + ~$100 open-frame chassis + hours of fiddly cable management.

**Not recommended** unless you want to live with the machine looking like a crypto-miner rig.

### Path 3: M.2-to-PCIe adapter

Use one of the spare M.2 slots (the board has 4 M.2 slots total; you're using 1 for NVMe). An M.2 Key-M socket gives you 4 PCIe lanes. An M.2-to-PCIe x16 adapter (ADT-LINK, PCE3-4x, etc.) physically converts the M.2 socket into a PCIe x16-sized slot with x4 electrical bandwidth.

| Aspect | Value |
|--------|-------|
| Electrical bandwidth | PCIe 5.0 x4 (~16 GB/s per direction) — equivalent to PCIe 4.0 x8 or PCIe 3.0 x16 |
| Physical slot generated | x16-size (fits any GPU mechanically) |
| Perf impact for LLM inference | **Negligible** — LLM inference is memory-bandwidth-bound within the GPU, not PCIe-bandwidth-bound |
| Perf impact for training | Real — gradient all-reduce hammered by PCIe x4 |
| GPU power | Still needs PSU 8-pin/16-pin cable direct (adapter doesn't provide power) |

**For her-os nav VLM workloads**, PCIe x4 is fine. The VLM inference cycle is:
1. Upload image base64 (~80 KB via HTTP, ~1 ms over network)
2. GPU compute (~18 ms)
3. Return single action word (~bytes, negligible)

PCIe bandwidth never becomes the bottleneck.
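The arithmetic behind that claim, as a sanity check (assumes ~16 GB/s usable per direction on a Gen 5 x4 link and ignores protocol overhead, which is small at this payload size):

```shell
# Time to push one 80 KB frame across PCIe 5.0 x4 vs the 18 ms GPU compute step
awk 'BEGIN {
  payload = 80 * 1024          # bytes per image
  bw      = 16e9               # ~16 GB/s per direction on Gen 5 x4
  printf "transfer %.1f us vs compute 18000 us\n", payload / bw * 1e6
}'
```

Even if the adapter link fell back to Gen 3 x4 (~4 GB/s), the hop is ~20 µs, still three orders of magnitude under the compute step.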

**Cost:** $30-80 for adapter + cable. **No case mod required** if the GPU fits lying sideways in the case or spilling out the back.

---

## PSU math (the second binding constraint)

Current system draw (worst-case steady-state):

| Component | Typical | Peak |
|-----------|--------:|-----:|
| Ryzen 9 9900X3D | 120 W | 160 W (all-core + PBO) |
| RTX 5070 Ti | 50 W idle | **300 W** (locked cap) |
| 2× DDR5 DIMMs | 10 W | 15 W |
| NVMe SN8100 | 8 W | 11 W |
| Motherboard + chipset + Wi-Fi + fans | 30 W | 50 W |
| **Current total** | ~220 W | **~540 W** |

So a 650 W PSU would technically cover the current system (~83% load at peak), 750 W runs it comfortably, and 850 W+ leaves real headroom.
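The totals can be rechecked mechanically (peak numbers from the table above; the ×1.4 floor is the common 50-70% load guideline, a rule of thumb rather than a hard limit):

```shell
# Sum worst-case component draw and derive a conservative PSU floor
cpu=160 gpu=300 ram=15 nvme=11 board=50
peak=$((cpu + gpu + ram + nvme + board))
min_psu=$((peak * 14 / 10))          # peak x 1.4, integer math
echo "peak=${peak}W conservative_min_psu=${min_psu}W"
```

That lands at 536 W peak and a ~750 W conservative floor, which is why 750 W reads as "comfortable" and 650 W as merely adequate.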

### Adding a second GPU — PSU budget table

| Second GPU | GPU TDP | New peak | Minimum PSU | Recommended PSU |
|------------|--------:|---------:|------------:|----------------:|
| RTX 5060 (8 GB) | 150 W | 690 W | 750 W | 850 W |
| RTX 5060 Ti (16 GB) | 180 W | 720 W | 850 W | 1000 W |
| RTX 5070 (12 GB) | 220 W | 760 W | 850 W | 1000 W |
| **RTX 5070 Ti (16 GB) — matched pair** | 300 W | 840 W | 1000 W | 1200 W |
| RTX 5080 (16 GB) | 360 W | 900 W | 1000 W | 1200 W |
| RTX 5090 (32 GB) | **575 W** | 1115 W | **1300 W** | 1500 W |

**The 5090's 575 W TDP is the problem: added to the ~540 W baseline, system peak hits 1115 W, far beyond what a typical 850 W consumer PSU can deliver.** Even a 1000 W PSU can't do it. Only ATX 3.1 PSUs rated 1300 W+ with the 12V-2x6 connector can safely run a 5090.

**Before recommending a card, you MUST physically check the PSU sticker on Panda** — look at the side panel / floor of the case. The wattage (e.g., "750 W", "850 W", "1000 W") will be printed in large type. Also check for the ATX 3.1 / 12VHPWR / 12V-2x6 label — Gen 5 GPUs need the updated connector.

---

## RTX 50-series by VRAM — April 2026 market state

**Critical context:** the RTX 5080 SUPER / 5070 Ti SUPER / 5070 SUPER refreshes are **cancelled or indefinitely delayed** as of April 2026. The 3 GB GDDR7 modules those cards need are being consumed entirely by datacenter/AI customers. **No 24 GB consumer card is coming in 2026 or 2027.** Next consumer refresh: expected RTX 60-series, 2028.

**Street prices are dramatically above MSRP** due to AI-driven GPU shortage.

### Consumer GeForce RTX 50 (available now)

| Card | VRAM | Mem BW | TDP | MSRP | Street Apr 2026 |
|------|-----:|-------:|----:|-----:|-----------------:|
| RTX 5050 | 8 GB GDDR6 | 320 GB/s | 130 W | $249 | ~$280-320 |
| RTX 5060 | 8 GB GDDR7 | 448 GB/s | 145 W | $299 | ~$320-370 |
| RTX 5060 Ti 8 GB | 8 GB GDDR7 | 448 GB/s | 180 W | $379 | ~$400-450 |
| **RTX 5060 Ti 16 GB** | **16 GB GDDR7** | 448 GB/s | 180 W | $429 | **~$450-500** |
| RTX 5070 | 12 GB GDDR7 | 672 GB/s | 250 W | $549 | ~$600-680 |
| **RTX 5070 Ti** (current) | **16 GB GDDR7** | 896 GB/s | 300 W | $749 | **~$830-890** |
| **RTX 5080** | **16 GB GDDR7** | 960 GB/s | 360 W | $999 | **~$1,200-1,400** |
| **RTX 5090** | **32 GB GDDR7** | 1,792 GB/s | 575 W | $1,999 | **~$3,500-5,000** ⚠️ |

### Workstation RTX Pro Blackwell (different tier — ECC VRAM, blower coolers)

| Card | VRAM | TDP | Street price |
|------|-----:|----:|-------------:|
| RTX Pro 4500 Blackwell | 24 GB | 200 W | ~$1,800 |
| RTX Pro 5000 Blackwell | 48 GB | 300 W | ~$4,500 |
| RTX Pro 6000 Blackwell | **96 GB** | 600 W | ~$10,000 |

### Previous-generation workstation cards (still highly relevant — used market)

| Card | VRAM | Mem BW | TDP | Arch | NVFP4 | FP8 | Used price (Apr 2026) |
|------|-----:|-------:|----:|------|:-----:|:---:|----------------------:|
| **RTX A6000** (Ampere, 2020) | **48 GB GDDR6 ECC** | 768 GB/s | 300 W | Ampere SM 8.6 | ❌ | ❌ | **~$2,500-3,500** |
| RTX 6000 Ada (Ada, 2022) | 48 GB GDDR6 ECC | 960 GB/s | 300 W | Ada SM 8.9 | ❌ | ✅ | ~$4,000-4,700 |
| RTX 4090 (Ada, 2022) | 24 GB GDDR6X | 1008 GB/s | 450 W | Ada SM 8.9 | ❌ | ✅ | ~$2,200-2,800 |
| RTX 3090 Ti (Ampere, 2022) | 24 GB GDDR6X | 1008 GB/s | 450 W | Ampere SM 8.6 | ❌ | ❌ | ~$900-1,200 |

**Why the A6000 is a genuine option for her-os:**
- **48 GB VRAM = 3× the 5070 Ti, 1.5× the 5090** at about half the 5090's price
- **300 W TDP, 2-slot blower form factor** = no PSU upgrade, fits any case, no case mod
- Uses 8-pin EPS power (not 12V-2x6 — safer cable, no adapter needed)
- NVLink support (can pair two A6000s for 96 GB if you ever want)
- **Fits Gemma 4 26B in Q4_K_M GGUF with room for KV** — Panda becomes Titan-peer
- Hosts 70B-class models in 4-bit

**Why it's NOT a slam-dunk:**
- **No NVFP4 support** (Blackwell-only feature) — the whole "2× FP8 speedup on Blackwell" thesis evaporates
- **No native FP8 tensor cores** (added in Ada/Hopper) — int8 works, FP8 doesn't
- Ampere architecture from 2020 — driver support through ~2028-2029 then maintenance mode
- Memory bandwidth 768 GB/s is ~43% of RTX 5090's 1.79 TB/s → roughly 2.3× slower token generation for memory-bound decode
- Used market = no NVIDIA warranty (PC Server & Parts offers 30-day return on refurb units)
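The 2.3× decode gap falls straight out of a memory-bandwidth roofline: each generated token reads every weight once, so tokens/s is bounded by mem_bw ÷ model_bytes. A sketch, assuming a ~14.6 GB 4-bit 26B-class model (the model size is an estimate, not a benchmark):

```shell
# Bandwidth-bound upper limit on decode throughput (assumed 14.6 GB model)
awk 'BEGIN {
  model_gb = 14.6                   # ~26B params at ~4.5 bits/param
  a6000    = 768  / model_gb        # GB/s divided by GB read per token
  rtx5090  = 1792 / model_gb
  printf "A6000 <=%.0f tok/s, 5090 <=%.0f tok/s, ratio %.1fx\n",
         a6000, rtx5090, rtx5090 / a6000
}'
```

Real throughput lands below both ceilings, but the ratio tracks the bandwidth ratio for memory-bound decode, hence "roughly 2.3× slower".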

### A6000 vs the "max config" 5090 path — the real comparison

| Criterion | RTX A6000 on current mobo+PSU | RTX 5090 + new mobo + new PSU |
|-----------|-------------------------------:|------------------------------:|
| VRAM | **48 GB** | 32 GB |
| Card cost | ~$3,000 used | ~$3,500-5,000 |
| PSU upgrade | Probably not needed | ~$400 (1500W ATX 3.1) |
| Motherboard swap | Optional (M.2 adapter path) | Required (X870E bifurcating board ~$500) |
| Setup time | 1 hour (install card + driver) | 6-8 hours (rebuild system) |
| **Total cost** | **~$3,250** | **~$4,400** |
| Inference speed (Gemma 4 26B Q4) | baseline | ~2× faster (1.79 TB/s mem BW + NVFP4) |
| Future-proofing | Ampere going into maintenance | Blackwell is current gen, 5-year active support |
| Max LLM size | 70B 4-bit fits | 32B 4-bit fits (snug) |

**For her-os specifically:** A6000 is the cheapest path to "Panda hosts 26B locally like Titan does" with minimal system disruption. 5090 is the path to "Panda is faster at inference than Titan" but costs more and requires full rebuild.

### Cancelled / won't ship
- ~~RTX 5070 SUPER 18 GB~~
- ~~RTX 5070 Ti SUPER 24 GB~~
- ~~RTX 5080 SUPER 24 GB~~

### India pricing (April 2026, INR) — primary reference for her-os

User is in India. India pricing typically runs 30-40% higher than the US due to 18% GST + 15-20% customs duty + distribution margin. The used workstation market is thin (Nehru Place / SP Road gray market is risky).

| Card | India MSRP / street | ~USD equiv | Available channels |
|------|--------------------:|-----------:|--------------------|
| RTX 5060 Ti 16 GB | ~₹50,000-60,000 | ~$600-720 | Amazon.in, Flipkart, PrimeABGB, MD Computers |
| RTX 5070 | ₹59,000-75,000 | ~$700-900 | Same consumer channels |
| **RTX 5070 Ti** (current) | ₹80,000-1,10,000 | ~$960-1,320 | Same |
| RTX 5080 | ₹1,07,000-1,40,000 | ~$1,280-1,680 | Same |
| **RTX 5090** | **₹4,03,000-6,99,000** (base to AIB) | ~$4,840-8,400 | EliteHubs, Amazon.in |

**RTX A6000 48GB in India (workstation channel):**

| Source | Price (INR) | ~USD |
|--------|------------:|-----:|
| Computech Store (computechstore.in) | **₹4,58,999** | ~$5,500 |
| Lowest listed | ₹3,63,539 | ~$4,370 |
| MD Computers (mdcomputers.in) | call/email | — |
| Vedant Computers (vedantcomputers.com) | call for price | — |
| GCC Gamers (india.gccgamers.com) | listed | — |
| Server Basket | listed | — |
| Rashi Peripherals (B2B distributor) | -10-15% with GSTIN | — |
| Amazon.in (PNY) | ₹9,45,075 | ~$11,370 (inflated — ignore) |
| Nehru Place / SP Road gray market | ~₹2.5-3 lakh (used, ex-corporate/mining) | ~$3,000-3,600 (no warranty, risky) |

**CRITICAL India-specific insight:** the A6000-vs-5090 math is different than the US.
- **US:** A6000 $3,000 used vs 5090 $3,500-5,000 → A6000 wins on VRAM/$
- **India:** A6000 ₹3.6-4.6 lakh new vs 5090 ₹4-7 lakh → roughly equal, 5090 is often better value at the base tier

A6000 in India only wins if you source from the gray market (Nehru Place ₹2.5-3 lakh) with all the associated risks. At retail/B2B prices ₹4 lakh+, the RTX 5090 at ₹4.03 lakh base is architecturally newer and faster for the same money.

**Do NOT recommend importing from US eBay** — 18% GST + 15-20% customs + $150 shipping nearly doubles the cost, no local warranty, 4-6 week lead time. Always source within India.

### her-os VRAM requirements table

| Workload | VRAM needed | Cheapest card |
|----------|------------:|--------------|
| E2B Q4_K_M (current nav) | 3 GB | any |
| E4B Q4_K_M + mmproj | 6 GB | RTX 5050 |
| **E4B NVFP4** | **10 GB** | **RTX 5060 Ti 16 GB** |
| Chatterbox TTS (original, current) | 4 GB | any ≥8 GB |
| phone_call.py bundle | 5 GB | any ≥8 GB |
| Full voice pipeline | 9 GB | RTX 5060 Ti 16 GB |
| **Gemma 4 26B NVFP4 (Titan-peer)** | **16-20 GB** | **RTX 5090 32 GB only** |
| Gemma 4 31B Dense | 24-32 GB | RTX 5090 or Pro 5000 |
| 70B quantized | 48+ GB | Pro 5000 48 GB or dual 5090 |

**For her-os workloads, the right question is "what model sizes do I want to fit?":**

- Need 2× E4B NVFP4 for redundancy / failover? → a second 16 GB card is sufficient (5060 Ti, 5070 Ti, 5080)
- Want to run a 26B model locally on Panda (matching Titan)? → need **32 GB → RTX 5090 only**
- Want to run 70B+ quantized? → even 5090 32 GB isn't enough — would need 2× 5090 or enterprise cards
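A rough estimator behind those thresholds (rule of thumb: weight bytes ≈ params × bits ÷ 8, plus 10-30% on top for KV cache and runtime overhead; the ~4.5 bits/param figure is an assumption for Q4_K_M-class quants, not a measurement):

```shell
# Estimate 4-bit weight footprint; KV cache and activations come on top
est() { awk -v name="$1" -v n="$2" -v bits=4.5 \
  'BEGIN { printf "%s: ~%.1f GB weights\n", name, n * bits / 8 / 1e9 }'; }
est "26B" 26e9    # fits a 32 GB card with KV headroom
est "31B" 31e9
est "70B" 70e9    # needs a 48 GB card even before KV
```

26B comes out near 14.6 GB and 70B near 39.4 GB, consistent with the "32 GB for 26B, 48 GB for 70B" lines in the table.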

---

## Recommendation ladder (based on what's physically achievable)

### If you're willing to replace the motherboard

**→ Buy RTX 5090 (32 GB, 575 W).** With a workstation board (ASUS Pro WS X870E-ACE ~$550) + a 1500 W ATX 3.1 PSU (~$300), you get a dual-GPU system with 5070 Ti (16 GB) + 5090 (32 GB) = 48 GB total distributed VRAM. That's enough to host Gemma 4 26B NVFP4 locally on Panda, making it a Titan peer. **Total upgrade cost: ~$4,400-5,900 at April 2026 street prices.** This is the path if you want Panda to become a first-class LLM host.

### If you want to keep the current motherboard (M.2 adapter path)

**→ Buy RTX 5080 (16 GB, 360 W).** Good balance: significantly faster than the 5070 Ti, same VRAM (doesn't bloat your budget for memory you won't use), and runs on a 1000 W+ ATX 3.1 PSU. The adapter's x4 limitation is negligible for inference. **Cost: ~$1,200-1,400 card + $50 adapter.**

### If budget is tight and you want more VRAM only

**→ Buy RTX 5060 Ti 16 GB (180W).** Cheapest path to doubling your usable VRAM (16 + 16 = 32 GB total distributed). Fits in any 750 W+ PSU. M.2 adapter works fine. **Cost: ~$450 card + $50 adapter.** But: slower than your current 5070 Ti. This makes sense if you want to add a "second-class" GPU specifically for the voice pipeline while keeping nav on the faster 5070 Ti.

### If you just want matched-pair scaling

**→ Buy another RTX 5070 Ti (16 GB, 300 W).** Identical to the current card: same drivers, same VRAM, same latency characteristics. Clean symmetric workload distribution (nav on GPU0, voice on GPU1, or half-and-half). Needs a 1000 W+ PSU. **Cost: ~$850 card + $50 adapter.**

---

## Hardest upper bound

**The absolute highest card that could physically fit in Panda is RTX 5090 (32 GB), but only if:**
1. Motherboard is swapped for dual-x16 workstation board, AND
2. PSU is at minimum 1300 W ATX 3.1 with native 12V-2x6, AND
3. Case has ≥360 mm GPU clearance (RTX 5090 FE is 304 mm, AIB variants up to 357 mm), AND
4. Case has 3.5-slot clearance for cooler thickness

Without those, the hard bound is: **RTX 5080 via M.2 adapter on the existing board.**

---

## Action items to finalize

1. **User physically inspects PSU sticker on Panda** — look on the PSU's side, floor of case, or unscrew the PSU and check back sticker. Record: wattage, ATX version, connector set (does it have a 12V-2x6 cable for GPU?).
2. **User physically measures internal case clearance** — longest GPU that can physically fit from PCIE1 to front of case (measure in mm). Also count slot width available.
3. **User decides on strategic direction** — is the goal matching Titan (→ 5090 + mobo swap), or incremental VRAM (→ 5080 or 5070 Ti + M.2 adapter), or voice-pipeline offload (→ 5060 Ti)?

Once those three are known, card selection is mechanical.

---

## Motherboard replacement options (if going Path 1)

**Binding requirements:**
- AM5 socket (keep Ryzen 9 9900X3D)
- X870E chipset (X870 only has single CPU-chipset uplink — won't support dual GPU cleanly)
- **x8/x8 CPU bifurcation of primary PCIe 5.0 x16 slot** (required for two CPU-direct GPU slots)
- DDR5 QVL compatibility with existing Corsair `CMK32GX5M1B5200C40` 5200 CL40 (trivially yes for all X870E boards with EXPO enabled)
- ≥3 M.2 slots + support for existing SN8100 in CPU-direct slot

**X870E boards CONFIRMED to support x8/x8 CPU bifurcation:**
1. ASUS ProArt X870E-Creator WiFi — ATX, ~$480-515
2. ASUS ROG Crosshair X870E Hero / Dark Hero / Extreme / Apex
3. ASUS ROG Strix X870E-E Gaming WiFi
4. ASRock X870E Taichi / Taichi Lite / Taichi OCF — E-ATX, ~$330-400
5. Gigabyte X870E Aorus Xtreme AI Top / Xtreme X3D AI Top — E-ATX, ~$700-800 (NOT Master/Pro/Elite — those don't bifurcate)
6. MSI MEG X870E Ace Max / Godlike / Godlike X Edition — ~$649-1300

**Disqualified (no dual-GPU support):**
- MSI MAG X870 GAMING PLUS WIFI (current board) — X870 not X870E, single x16 slot
- ASRock X870E Nova WiFi / Steel Legend WiFi / Riptide — no x8/x8 bifurcation
- Gigabyte X870E Aorus Master / Pro / Elite — second slot is chipset PCIe 4.0 x4

### Top three recommendations

| Rank | Board | Form | Networking | Price | Why |
|------|-------|------|-----------|------:|-----|
| **Best overall** | ASUS ProArt X870E-Creator WiFi | ATX | **10 GbE + 2.5 GbE** + dual USB4 | ~$500 | Only ATX (fits strict ATX cases) with 10 GbE + x8/x8 + dual USB4. 16+2+2 110A VRM. M.2 layout requires planning to preserve x8/x8. |
| **Best value** | ASRock X870E Taichi | E-ATX | 5 GbE + Wi-Fi 7 | ~$350 | Strongest VRM in class (24+2+1 110A). $150 cheaper than ProArt. 5 GbE same as current, so no networking upgrade. E-ATX — verify case fit. |
| **Workstation-tier** | Gigabyte X870E Aorus Xtreme AI Top | E-ATX | **Dual 10 GbE** | ~$750 | 4-slot PCIe spacing (vs 3-slot on others) — critical for dual-triple-slot GPUs thermal headroom. Dual 10 GbE. AI Top explicitly designed for dual-GPU AI workloads. |

### Universal caveat — M.2 layout matters

On EVERY X870E board, populating the second CPU-direct Gen5 M.2 slot steals lanes from the second GPU slot. To preserve x8/x8:
1. Put WD_BLACK SN8100 in M.2_1 (CPU Gen5 slot)
2. Use only chipset-Gen4 M.2 slots (M.2_3, M.2_4) for additional drives
3. **Do NOT populate M.2_2 (second CPU-Gen5 slot)** — it steals from GPU2

This is a CPU lane-count limitation, not a board quirk. Unavoidable on any consumer AM5 platform.

### Known ASUS X870E bug

ASUS X870E boards have an RTX 5090 + VFIO passthrough bug: PCIe permanently bifurcates to x8 until cold boot. Affects VMs with GPU passthrough, NOT bare-metal PyTorch/vLLM/llama.cpp usage. Gigabyte and MSI boards are not affected.

## Multi-GPU usage patterns for her-os

### Pattern (a): Independent workloads on independent GPUs — RECOMMENDED

Each GPU runs its own model(s) via `CUDA_VISIBLE_DEVICES=N`. Fault-isolated, simple, no framework changes.

```bash
# GPU 0 — voice pipeline
CUDA_VISIBLE_DEVICES=0 python scripts/phone_call.py auto
CUDA_VISIBLE_DEVICES=0 uvicorn chatterbox_server:app --port 8772

# GPU 1 — nav VLM (upgraded to NVFP4 E4B)
CUDA_VISIBLE_DEVICES=1 docker run ... llama.cpp:server-cuda \
  --model gemma-4-E4B-it-Q4_K_M.gguf --port 11435
```

No NVLink required, no distributed framework, no code changes. This is the right pattern for her-os.
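If the per-GPU pinning should survive reboots, the same `CUDA_VISIBLE_DEVICES` split can live in systemd units. A sketch (the unit name, user, and working directory are hypothetical; the port matches the example above):

```ini
# /etc/systemd/system/chatterbox-tts.service  (hypothetical unit name/paths)
[Unit]
Description=Chatterbox TTS server, pinned to GPU 0
After=network.target

[Service]
User=rajesh
WorkingDirectory=/home/rajesh/her-os
Environment=CUDA_VISIBLE_DEVICES=0
ExecStart=/usr/bin/uvicorn chatterbox_server:app --port 8772
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

`systemctl enable --now chatterbox-tts` brings it up; a second unit with `Environment=CUDA_VISIBLE_DEVICES=1` covers the nav VLM side.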

### Pattern (b): Tensor parallelism across GPUs — NOT RECOMMENDED

vLLM / DeepSpeed can split one model across 2 GPUs via PCIe. Problems:
- **RTX 50-series consumer GPUs have NO NVLink** — TP runs over PCIe, slower
- Requires GPU2 on fast PCIe (x8 CPU-direct), not chipset x4 or M.2 adapter
- For MoE models (Gemma 4 MoE), expert-parallelism is better than tensor-parallelism

Skip this unless you want to run 70B+ dense models (which you don't on 2× 16 GB).

### Pattern (c): Pipeline / RAG coordination — STANDARD

GPU0 runs reranker/embedder, GPU1 runs LLM. Inter-GPU hop via HTTP/gRPC — negligible overhead. Same as how Panda already hands off to Titan.

## Power supply analysis

**PSU is the binding constraint on the upgrade path. Wattage not exposed to software — requires physical sticker check.**

### Current estimated peak draw (single GPU)
- Ryzen 9 9900X3D: 160 W
- RTX 5070 Ti: 300 W (locked cap)
- 2× DDR5 DIMMs: 15 W
- NVMe + MB + chipset + fans + Wi-Fi: 70 W
- **Total peak: ~545 W**

### Adding a second GPU — PSU ceiling table

| Second GPU | TDP | System peak | Min PSU | Recommended PSU (safe 70% load) |
|------------|----:|-----------:|--------:|-----------:|
| RTX 5060 Ti 16 GB | 180 W | 725 W | 850 W | 1000 W |
| Second RTX 5070 Ti | 300 W | 845 W | 1000 W | 1200 W |
| RTX 5080 16 GB | 360 W | 905 W | 1000 W | 1200 W |
| **RTX 5090 32 GB** | **575 W** | **1120 W** | **1300 W ATX 3.1** | **1500 W ATX 3.1** |
| 2× RTX 5090 (max) | 1150 W | 1695 W | 1600 W (tight) | **2000 W ATX 3.1** (rare) |

### What to physically check on the PSU sticker

1. **Total wattage** (e.g., "750W" / "850W" / "1000W")
2. **ATX version** — "ATX 3.0" or "ATX 3.1" label. If pre-3.0 → no native 12V-2x6 connector for Gen 5 GPU
3. **12V-2x6 / 12VHPWR cable included** — new GPU connector, different shape from 8-pin PCIe
4. **Brand/model number** — to look up 80 Plus rating and warranty

### Rough PSU recommendations if replacement needed

| Target GPU2 | PSU to buy | Price |
|-------------|-----------|------:|
| RTX 5060 Ti | Existing 850 W likely fine | $0 |
| RTX 5080 | Corsair RM1000x ATX 3.1 | ~$200 |
| RTX 5090 | Corsair HX1500i ATX 3.1 or Seasonic Prime PX-1600 | $350-450 |
| 2× RTX 5090 | Corsair AX1600i | ~$500 |

**Headroom rule:** consumer PSUs run best at 50-70% load. Peak system draw × 1.4 = minimum safe PSU. Don't run at 90%+ load for long-term reliability.
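Applying that rule to each candidate (baseline ~545 W from the table above; note the ×1.4 figure lands near the "recommended" column rather than the minimum):

```shell
baseline=545   # current system peak with one GPU
for entry in "5060Ti:180" "5070Ti:300" "5080:360" "5090:575"; do
  gpu=${entry%%:*} tdp=${entry##*:}
  peak=$((baseline + tdp))
  echo "$gpu: peak=${peak}W headroom_psu=$((peak * 14 / 10))W"
done
```

That yields 1015 / 1183 / 1267 / 1568 W respectively, which is why the table recommends 1000 / 1200 / 1200 / 1500 W units.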

### Known PSU hazards

- **Adapter cables for 12V-2x6** — cause of RTX 4090 melting-connector failures. Don't use adapters for 300W+ GPUs. ATX 3.1 native cables only.
- **Daisy-chained PCIe 8-pin cables** — at high current, the middle connector can sag and overheat. Use separate cables per 8-pin input.

## Cross-references

- `docs/RESOURCE-REGISTRY.md` — full Panda hardware inventory (motherboard, CPU, RAM slots, PCIe topology, thermals)
- `panda:~/hardware-inventory/report-20260414_135759.txt` — raw inventory report (rajesh-owned)
- `panda:~/hardware-inventory/collect.sh` — reusable inventory collection script
- `docs/RESEARCH-GEMMA4-E4B-QUANTIZATIONS.md` — VRAM needs for nav VLM (E4B Q4_K_M 6 GB, NVFP4 10 GB)
- `docs/RESEARCH-CHATTERBOX-CPU-BENCHMARK.md` — alternative path (free VRAM via CPU TTS instead of adding GPU)
