---
pageType: source
id: source.psb-thinking-2026-04-13
title: psb-thinking-2026-04-13
sourceType: local-file
sourcePath: /home/topher/.openclaw/workspace-psb-thinking/memory/2026-04-13.md
ingestedAt: 2026-05-03T01:55:56.660Z
updatedAt: 2026-05-03T01:55:56.660Z
status: active
growth: seed
---

# psb-thinking-2026-04-13

## Source
- Type: `local-file`
- Path: `/home/topher/.openclaw/workspace-psb-thinking/memory/2026-04-13.md`
- Bytes: 2828
- Updated: 2026-05-03T01:55:56.660Z

## Content
```text
# 2026-04-13

## GPU Install Attempt: CasaOS Goes Down

### What happened
- Topher attempted to install the NVIDIA P102-100 10GB GPU (passive mining card, GTX 1080 Ti chip) into the CasaOS server (media, 100.91.1.57)
- With GPU installed: system boots all the way to the terminal login prompt (`media login:`), so POST works
- BUT: the CasaOS dashboard does not load in the browser
- Without GPU: system presumably works normally (card was pulled for the reboot)

### Driver state on media (pre-GPU install, checked via exec)
- Driver 560.35.03 installed (nvidia-driver-560, linux-modules-nvidia-560-6.11.0-24-generic)
- /dev/nvidiactl exists: kernel driver loaded
- Ollama container: big-bear-ollama-cpu (CPU-only, no GPU access)
- CasaOS: running, HTTP 200 on curl to localhost
- Port 80: listening

### Theories
1. GPU changes PCI bus order or the eth0 interface name → network binding issue
2. nouveau driver grabs the card → system stall or display conflict
3. CasaOS binds to the wrong interface after GPU install (looks for eth0, finds something else)
4. Driver not fully compatible with the P102-100 (compute card, not a standard GPU)

### PSU - ROOT CAUSE
- Current PSU: cheap 430W (INSUFFICIENT)
- Ordered: quality 600W PSU (arriving soon)
- The cheap 430W unit likely caused the Postgres protection fault and the instability
- P102-100 needs 150W, system pulls ~390W: no headroom on the cheap 430W unit

### Next steps (Topher to run with GPU installed)
1. `ping -c 3 media`: does the hostname resolve?
2. `ss -tlnp | grep :80`: is port 80 listening?
3. `curl -I http://localhost/`: does CasaOS respond locally?
4. `dmesg | grep -i nvidia | tail -20`: GPU kernel messages
5. `dmesg | grep -i error | tail -10`: any errors
6. Check if eth0/swac0 changed names: `ip a`

### After new PSU arrives
1. Remove GPU before PSU swap (safety)
2. Swap PSU, verify boots clean WITHOUT GPU
3. Reinstall GPU, boot, test CasaOS
4. If Postgres crashes again: `sudo systemctl restart postgresql`
5. Then restart Ollama container with GPU support once stable

### Memory search
- Currently DISABLED: Ollama CPU embeddings too slow (21s/chunk), waiting for GPU
- Qdrant: running on media at 100.91.1.57:6333
- Ollama: CPU container, needs GPU restart after card is confirmed working

### Agents
- psb-thinking (me): technical research, planning, system admin
- psb-gemma: brewery operations, day-to-day
- psb-business: business/reports/Toast POS

## 23:26 UTC - Post-reboot check (GPU install)
- Media rebooted ~22:56 UTC (~1h15m after warning at 21:42)
- System up 30 min, load 0.03: stable
- **P102-100 NOT detected**: only Quadro K600 at 01:00.0
- nvidia-smi: "couldn't communicate with NVIDIA driver" (Quadro K600 has no driver)
- No /dev/nvidia* devices found
- Possible causes: P102-100 not physically installed, POST failure, or PCIe lane issue

```
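The PSU entry in the log above gives three figures (a 430W unit, a 150W card, roughly 390W of system draw) but does not say whether the ~390W already includes the card. The sketch below works through the headroom arithmetic under both readings, for the current 430W unit and the 600W unit on order. The function names and scenario labels are illustrative assumptions, not something taken from the log.

```python
# Figures taken from the log; everything else here is illustrative.
PSU_CURRENT_W = 430   # cheap unit currently installed
PSU_ORDERED_W = 600   # replacement on order
GPU_W = 150           # P102-100 draw as stated in the log
SYSTEM_W = 390        # "system pulls ~390W" (ambiguous: with or without GPU?)


def headroom_watts(psu_watts: float, load_watts: float) -> float:
    """Watts left over on the PSU; negative means the load exceeds the rating."""
    return psu_watts - load_watts


def load_percent(psu_watts: float, load_watts: float) -> float:
    """Load as a percentage of the PSU's rated output."""
    return 100.0 * load_watts / psu_watts


# Assumption: two possible readings of the ~390W figure.
SCENARIOS = {
    "390W already includes the GPU": SYSTEM_W,
    "390W excludes the GPU": SYSTEM_W + GPU_W,
}

for label, load in SCENARIOS.items():
    for psu in (PSU_CURRENT_W, PSU_ORDERED_W):
        print(
            f"{label}: {load}W on {psu}W PSU -> "
            f"headroom {headroom_watts(psu, load):+.0f}W, "
            f"load {load_percent(psu, load):.0f}%"
        )
```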
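The "Next steps" checklist in the log could be run in one pass so the output can be pasted back in a single block. This is a minimal sketch, assuming it is run on media itself with Python 3 available. The commands are copied from the checklist (with `-c 3` added to `ping` so it terminates); the `CHECKS` table, `shell_runner`, `run_checks`, and `format_report` names are hypothetical helpers. Note that for piped commands the exit status is that of the last command in the pipe, so the `dmesg ... | tail` checks report `ok` even when nothing matched.

```python
import subprocess
from typing import Callable, List, NamedTuple, Tuple

# (label, command) pairs mirroring the checklist in the log.
CHECKS: List[Tuple[str, str]] = [
    ("hostname resolves", "ping -c 3 media"),
    ("port 80 listening", "ss -tlnp | grep :80"),
    ("CasaOS responds locally", "curl -I http://localhost/"),
    ("GPU kernel messages", "dmesg | grep -i nvidia | tail -20"),
    ("recent kernel errors", "dmesg | grep -i error | tail -10"),
    ("interface names", "ip a"),
]


class Result(NamedTuple):
    label: str
    command: str
    returncode: int
    output: str


def shell_runner(command: str, timeout: float = 30.0) -> Tuple[int, str]:
    """Run one shell command and return (exit status, combined stdout+stderr)."""
    try:
        proc = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return 124, f"timed out after {timeout}s"
    return proc.returncode, (proc.stdout + proc.stderr).strip()


def run_checks(runner: Callable[[str], Tuple[int, str]] = shell_runner) -> List[Result]:
    """Run every check in order; a failing check does not stop the rest."""
    results = []
    for label, command in CHECKS:
        code, output = runner(command)
        results.append(Result(label, command, code, output))
    return results


def format_report(results: List[Result]) -> str:
    """Render results as plain text, one header line per check plus its output."""
    lines = []
    for r in results:
        status = "ok" if r.returncode == 0 else f"exit {r.returncode}"
        lines.append(f"[{status}] {r.label}: {r.command}")
        lines.extend("    " + line for line in r.output.splitlines())
    return "\n".join(lines)


# To run on media: print(format_report(run_checks()))
```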
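The 23:26 UTC check in the log found only the Quadro K600 on the PCI bus. One way to repeat that check after the PSU swap is to scan `lspci` output for the card's name. The sketch below assumes the card shows up with the string `P102-100` in its description, which depends on the local PCI ID database. The `SAMPLE` text is a made-up illustration of the format, not output captured from media, and the helper names are hypothetical.

```python
import subprocess
from typing import List

# Illustrative lspci-style text (NOT captured from media).
SAMPLE = """\
00:1f.3 Audio device: Intel Corporation Example Audio Controller
01:00.0 VGA compatible controller: NVIDIA Corporation GK107GL [Quadro K600] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GK107 HDMI Audio Controller (rev a1)
"""


def nvidia_devices(lspci_output: str) -> List[str]:
    """Return the lspci lines that mention NVIDIA, case-insensitively."""
    return [
        line.strip()
        for line in lspci_output.splitlines()
        if "nvidia" in line.lower()
    ]


def card_detected(lspci_output: str, needle: str = "P102-100") -> bool:
    """True if any NVIDIA line contains the given model string."""
    return any(needle.lower() in line.lower() for line in nvidia_devices(lspci_output))


def read_lspci() -> str:
    """Run lspci on the local machine and return its stdout (requires pciutils)."""
    return subprocess.run(["lspci"], capture_output=True, text=True, check=True).stdout


for device in nvidia_devices(SAMPLE):
    print(device)
print("P102-100 detected:", card_detected(SAMPLE))
```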
## Notes
<!-- openclaw:human:start -->
<!-- openclaw:human:end -->

## Related
<!-- openclaw:wiki:related:start -->
- No related pages yet.
<!-- openclaw:wiki:related:end -->