Wiki System - Architecture
What Was Established
The wiki system is designed around the LLM wiki pattern (Karpathy): raw sources (chat transcripts, notes, docs) are crystallized into structured markdown pages, embedded into pgvector, and retrieved semantically by agents in future sessions. A dedicated LXC (nk-wiki) will host the wiki VM, separating wiki infrastructure from other services.
Multi-Wiki Namespace Design
Three wikis are planned, each with its own namespace in pgvector:
| Wiki | Namespace | Audience | Isolation |
|---|---|---|---|
| homelab | general | Claude Code + Gemma | Cross-pollinates with projects |
| projects | projects | Claude Code + Gemma | Cross-pollinates with homelab |
| personal | personal | Gemma only | Fully isolated; Claude Code never reads/writes |
Cross-pollination: homelab and projects wikis share a pgvector query space. A query about a homelab project can surface relevant homelab infrastructure pages and vice versa. Personal wiki is isolated — never queried alongside the others.
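The access rules above can be sketched as a small lookup. This is only an illustration of the design, not code from the wiki: the agent keys (`claude-code`, `gemma`) and the function name `query_scope` are hypothetical labels; the wiki-to-namespace mapping comes from the table.

```python
# Wiki -> namespace mapping, taken from the table above.
WIKI_NAMESPACES = {"homelab": "general", "projects": "projects", "personal": "personal"}

# Wikis that share a query space (cross-pollinate).
SHARED_GROUP = frozenset({"homelab", "projects"})

# Which wikis each agent may touch. The agent keys are illustrative labels.
AGENT_ACCESS = {
    "claude-code": frozenset({"homelab", "projects"}),
    "gemma": frozenset({"homelab", "projects", "personal"}),
}


def query_scope(agent: str, wiki: str) -> list[str]:
    """Return the namespaces a query against `wiki` may search for `agent`."""
    allowed = AGENT_ACCESS[agent]
    if wiki not in allowed:
        raise PermissionError(f"{agent} may not query the {wiki} wiki")
    # Shared wikis are searched together; the personal wiki is searched alone.
    wikis = SHARED_GROUP if wiki in SHARED_GROUP else {wiki}
    return sorted(WIKI_NAMESPACES[w] for w in wikis & allowed)
```

Under these assumptions, a homelab query from Claude Code searches both `general` and `projects`, while any Claude Code query against the personal wiki is rejected before it reaches pgvector.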
Planned: nk-wiki LXC
A dedicated LXC for the wiki VM is planned but not yet created.
| Detail | Planned Value |
|---|---|
| LXC name | nk-wiki |
| OS | Ubuntu 24.04 |
| CPU | 2 cores |
| RAM | 4 GB |
| Disk | 64 GB |
| VLAN | Gandalf (192.168.1.x) |
| IP | TBD |
| Purpose | Host wiki VM, run Claude Code sessions, store /opt/wiki/ |
Proxmox has ~31.5 GB free RAM headroom as of 2026-04-19, so 4 GB allocation is comfortable.
Current Deployment
The wiki runs on wiki-llm (192.168.1.206) — a dedicated Ubuntu 24.04 VM on Proxmox (Gandalf VLAN, 2 cores, 4 GB RAM). This was purpose-built as a VM (not LXC) for stronger isolation.
Filesystem layout:
    /opt/wiki/
      homelab/           ← this repo (git)
      work/              ← work/princelobel wiki + pipeline scripts (git)
      projects/          ← projects wiki (git, planned)
      personal/          ← personal wiki (git, Gemma-only)
      raw-sources/       ← symlink → /mnt/wiki-nas/LLMWiki
      skills-reference/  ← vanillaflava/llm-wiki-claude-skills reference clone

Skills location: /opt/wiki/homelab/skills/ (all five skills rewritten for this homelab).
Pipeline scripts: /opt/wiki/homelab/scripts/ — five Python scripts for file conversion, document ingestion, and conversation crystallization. See Wiki Pipeline Scripts for full documentation.
Claude Code access: VS Code Remote SSH to wiki-llm. Claude Code gets native filesystem access to /opt/wiki/homelab/ without any additional setup.
Raw source ingestion: Synology NAS (LonelyMountain, 192.168.1.137) shares Documents via SMB. Mounted at /mnt/wiki-nas; symlinked to /opt/wiki/raw-sources. See wiki-llm for full mount configuration.
pgvector Integration
- Embedding model: `nomic-embed-text` via Ollama at `http://192.168.2.192:11434/api/embeddings`
- Database: `homelab` on PostgreSQL LXC (192.168.1.57:5432)
- Table: `wiki_embeddings`
- Namespace filtering: `WHERE namespace = 'general' AND wiki = 'homelab'`
Pages are embedded after every create/update. The embedding call is made against the Pavilion Ollama instance.
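A minimal sketch of the embedding call and a namespace-filtered similarity query follows. It is not the pipeline's actual code. The endpoint, model, table name, and `WHERE` clause come from this page; the column names `page_path` and `embedding`, the helper names, and the psycopg-style named placeholders are assumptions made for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://192.168.2.192:11434/api/embeddings"
EMBED_MODEL = "nomic-embed-text"


def build_embed_request(text: str) -> urllib.request.Request:
    """Build the POST request for Ollama's embeddings endpoint."""
    payload = json.dumps({"model": EMBED_MODEL, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text: str, timeout: float = 30.0) -> list[float]:
    """Call Ollama and return the embedding vector (needs network access)."""
    with urllib.request.urlopen(build_embed_request(text), timeout=timeout) as resp:
        return json.loads(resp.read())["embedding"]


def to_pgvector_literal(vector: list[float]) -> str:
    """Format a vector as a pgvector text literal such as '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in vector) + "]"


# ASSUMPTION: the columns page_path and embedding are hypothetical.
# `<=>` is pgvector's cosine-distance operator, so similarity = 1 - distance.
SEARCH_SQL = """
SELECT page_path, 1 - (embedding <=> %(query)s::vector) AS similarity
FROM wiki_embeddings
WHERE namespace = %(namespace)s AND wiki = %(wiki)s
ORDER BY embedding <=> %(query)s::vector
LIMIT %(limit)s
"""
```

With a PostgreSQL driver that accepts named parameters, the query would be run with `query` set to `to_pgvector_literal(embed(text))`, and rows below the 0.5 similarity threshold discarded by the caller.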
Critical finding (2026-05-01): Frontmatter must be stripped before embedding. The ~200 chars of structurally identical YAML frontmatter on every page produced 0.88 cosine similarity between any two pages’ frontmatter blocks alone. Full-page embeddings with frontmatter included produced an average pairwise similarity of 0.725 (minimum 0.467), meaning all 179 pages clustered in a tight ball and the 0.5 similarity threshold barely filtered anything.
After stripping frontmatter and re-embedding 74 pages, average similarity dropped to 0.684 (minimum 0.358). Unrelated pages (e.g. n8n vs OPNsense) dropped from 0.82 → 0.60, while genuinely related pages (n8n vs PostgreSQL) held at 0.76. The 0.5 similarity threshold now meaningfully discriminates.
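The average and minimum pairwise figures quoted above are the kind of statistic the following sketch computes. It is a plain-Python illustration of the measurement, not the script that produced those numbers; the function names are hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pairwise_stats(vectors):
    """Return (mean, minimum) cosine similarity over all unordered pairs."""
    sims = [cosine_similarity(a, b) for a, b in combinations(vectors, 2)]
    if not sims:
        raise ValueError("need at least two vectors")
    return sum(sims) / len(sims), min(sims)
```

A mean close to the retrieval threshold with a high minimum signals that the embeddings share a common component, which is what the identical frontmatter was contributing.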
perform_full_integrate.py’s parse_page() strips frontmatter from content before embedding. embed_and_store() and search_similar() both work with body-only text. All pages were re-embedded on 2026-05-01.
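For illustration, frontmatter stripping of the kind described here can be done with a single anchored regular expression. This is a sketch under the assumption that frontmatter is a leading block delimited by `---` lines; it is not the actual body of parse_page(), and `strip_frontmatter` is a hypothetical name.

```python
import re

# A leading '---' line, optional content, and a closing '---' line.
# \A anchors the match to the start of the page, so a '---' rule later in
# the body is never mistaken for frontmatter.
_FRONTMATTER = re.compile(
    r"\A---[ \t]*\r?\n(?:.*?\r?\n)?---[ \t]*(?:\r?\n|\Z)", re.DOTALL
)


def strip_frontmatter(page_text: str) -> str:
    """Return the page body with a leading YAML frontmatter block removed."""
    return _FRONTMATTER.sub("", page_text, count=1).lstrip("\n")
```

Only the returned body would then be passed to the embedding call, so the structurally identical YAML no longer dominates the vector.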
Shared mechanical layer pattern: perform_full_integrate.py provides importable primitives (parse_page, search_similar, embed_and_store, add_wikilink, delete_embedding, append_log, upsert_index_entry) that wiki-integrate, wiki-restructure, and wiki-lint skills all import rather than duplicating. LLM agents (Claude Code) handle judgment; the script handles mechanics.
Git Repository Structure
Each wiki is a separate git repository. This provides:
- Clean version history per wiki
- Independent rollback capability
- No cross-contamination of personal and homelab commit history
Current repo: /opt/wiki/homelab/ is a git repo (remote TBD).
Agents
| Agent | Host | Role |
|---|---|---|
| Claude Code | wiki-llm (192.168.1.206) | Active session crystallization, complex queries, reasoning across pages |
| Gemma (gemma4:e2b on Celebrimbor) | Celebrimbor (192.168.2.192) | Scheduled transcript ingestion, bulk ingest, routine maintenance |
Gemma runs wiki-crystallize Mode 2 on a schedule to process new transcripts from /mnt/transcripts/. After processing, transcripts move to ingested/chats/.
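The move-after-processing step might look like the sketch below. It assumes transcripts are `.json` files (as the entries under Sources suggest) and that the caller supplies the parent directory of `ingested/`, which this page does not specify; both function names are hypothetical.

```python
import shutil
from pathlib import Path


def pending_transcripts(inbox: Path) -> list[Path]:
    """List unprocessed .json transcripts in the inbox, sorted by name."""
    return sorted(p for p in inbox.glob("*.json") if p.is_file())


def archive_transcript(transcript: Path, ingested_root: Path) -> Path:
    """Move a processed transcript into <ingested_root>/chats/ and return its new path."""
    dest_dir = ingested_root / "chats"
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / transcript.name
    if dest.exists():
        # Refuse to overwrite so a re-run never clobbers an archived transcript.
        raise FileExistsError(f"{dest} already exists; refusing to overwrite")
    return Path(shutil.move(str(transcript), str(dest)))
```

A scheduled run would iterate over `pending_transcripts(Path("/mnt/transcripts"))`, crystallize each file, and call `archive_transcript` only after the page update succeeds, so a failed run leaves the transcript in place for the next attempt.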
Related Pages
AI Infrastructure Overview, PostgreSQL, Pavilion (AI PC) Configuration, Wiki Pipeline Scripts, wiki-llm
Sources
Homelab AI - 2026-04-19 · ingested/chats/2026-04-19-31-Homelab AI.json
Homelab AI - 2026-04-20 · ingested/chats/2026-04-20-31-Homelab AI.json