
Wiki System - Architecture

What Was Established

The wiki system is designed around the LLM wiki pattern (Karpathy): raw sources (chat transcripts, notes, docs) are crystallized into structured markdown pages, embedded into pgvector, and retrieved semantically by agents in future sessions. A dedicated LXC (nk-wiki) will host the wiki VM, separating wiki infrastructure from other services.

Multi-Wiki Namespace Design

Three wikis are planned, each with its own namespace in pgvector:

| Wiki | Namespace | Audience | Isolation |
|---|---|---|---|
| homelab | general | Claude Code + Gemma | Cross-pollinates with projects |
| projects | projects | Claude Code + Gemma | Cross-pollinates with homelab |
| personal | personal | Gemma only | Fully isolated; Claude Code never reads or writes |

Cross-pollination: the homelab and projects wikis share a pgvector query space. A query about a project can surface relevant homelab infrastructure pages, and vice versa. The personal wiki is isolated and is never queried alongside the others.
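The isolation rules above can be sketched as a small lookup. This is only an illustration under stated assumptions, not code from the wiki: the `WIKIS` mapping, the agent labels, the `shared_pool` flag, and the `queryable_wikis` helper are hypothetical names. Only the wiki names, namespaces, and audiences come from this page.

```python
# Hypothetical sketch of the isolation rules; all identifiers are illustrative.
WIKIS = {
    "homelab": {"namespace": "general", "agents": {"claude-code", "gemma"}, "shared_pool": True},
    "projects": {"namespace": "projects", "agents": {"claude-code", "gemma"}, "shared_pool": True},
    "personal": {"namespace": "personal", "agents": {"gemma"}, "shared_pool": False},
}


def queryable_wikis(agent: str, target: str) -> list[str]:
    """Return the wikis that one query against `target` may search for `agent`."""
    if agent not in WIKIS[target]["agents"]:
        return []  # e.g. Claude Code asking for the personal wiki
    if not WIKIS[target]["shared_pool"]:
        return [target]  # isolated wiki: never queried alongside the others
    # Cross-pollinating wikis are searched together.
    return sorted(
        name
        for name, cfg in WIKIS.items()
        if cfg["shared_pool"] and agent in cfg["agents"]
    )
```

Under these assumptions, `queryable_wikis("claude-code", "homelab")` would search both homelab and projects, while any request by Claude Code for the personal wiki would resolve to an empty list.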

Planned: nk-wiki LXC

A dedicated LXC for the wiki VM is planned but not yet created.

| Detail | Planned Value |
|---|---|
| LXC name | nk-wiki |
| OS | Ubuntu 24.04 |
| CPU | 2 cores |
| RAM | 4 GB |
| Disk | 64 GB |
| VLAN | Gandalf (192.168.1.x) |
| IP | TBD |
| Purpose | Host wiki VM, run Claude Code sessions, store /opt/wiki/ |

The Proxmox host had ~31.5 GB of free RAM as of 2026-04-19, so a 4 GB allocation fits comfortably.

Current Deployment

The wiki runs on wiki-llm (192.168.1.206), a dedicated Ubuntu 24.04 VM on Proxmox (Gandalf VLAN, 2 cores, 4 GB RAM). It was purpose-built as a VM rather than an LXC for stronger isolation.

Filesystem layout:

/opt/wiki/
  homelab/        ← this repo (git)
  work/           ← work/princelobel wiki + pipeline scripts (git)
  projects/       ← projects wiki (git, planned)
  personal/       ← personal wiki (git, Gemma-only)
  raw-sources/    ← symlink → /mnt/wiki-nas/LLMWiki
  skills-reference/  ← vanillaflava/llm-wiki-claude-skills reference clone

Skills location: /opt/wiki/homelab/skills/ — all five skills rewritten for this homelab.

Pipeline scripts: /opt/wiki/homelab/scripts/ — five Python scripts for file conversion, document ingestion, and conversation crystallization. See Wiki Pipeline Scripts for full documentation.

Claude Code access: VS Code Remote SSH to wiki-llm. Claude Code gets native filesystem access to /opt/wiki/homelab/ without any additional setup.

Raw source ingestion: Synology NAS (LonelyMountain, 192.168.1.137) shares Documents via SMB. Mounted at /mnt/wiki-nas; symlinked to /opt/wiki/raw-sources. See wiki-llm for full mount configuration.

pgvector Integration

  • Embedding model: nomic-embed-text via Ollama at http://192.168.2.192:11434/api/embeddings
  • Database: homelab on PostgreSQL LXC (192.168.1.57:5432)
  • Table: wiki_embeddings
  • Namespace filtering: WHERE namespace = 'general' AND wiki = 'homelab'

Pages are embedded after every create/update. The embedding call is made against the Pavilion Ollama instance.
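The embed-then-filter flow described above might look roughly like the following. This is a sketch under assumptions, not the wiki's actual code: the endpoint, model, table name, and namespace filter come from this page, while the column names (`page_path`, `embedding`), the psycopg-style named parameters, and every function name are hypothetical. The request body uses the `model` and `prompt` fields that Ollama's `/api/embeddings` endpoint accepts.

```python
import json
import urllib.request

OLLAMA_URL = "http://192.168.2.192:11434/api/embeddings"  # from this page
EMBED_MODEL = "nomic-embed-text"


def build_embedding_request(text: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for one embedding."""
    payload = json.dumps({"model": EMBED_MODEL, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embedding(text: str, timeout: float = 30.0) -> list[float]:
    """Send the request and return the vector from the "embedding" field."""
    with urllib.request.urlopen(build_embedding_request(text), timeout=timeout) as resp:
        return json.load(resp)["embedding"]


# Assumed columns: page_path, embedding (vector), namespace, wiki.
# `<=>` is pgvector's cosine-distance operator, so similarity = 1 - distance.
SEARCH_SQL = """
SELECT page_path, 1 - (embedding <=> %(query)s::vector) AS similarity
FROM wiki_embeddings
WHERE namespace = %(namespace)s AND wiki = %(wiki)s
ORDER BY embedding <=> %(query)s::vector
LIMIT %(limit)s
"""


def to_vector_literal(vec: list[float]) -> str:
    """Format a Python list as a pgvector text literal, e.g. "[0.1,0.2]"."""
    return "[" + ",".join(repr(float(x)) for x in vec) + "]"
```

A caller would pass `to_vector_literal(fetch_embedding(body))` as the `query` parameter together with `namespace="general"` and `wiki="homelab"`. Splitting request construction from the network call keeps the payload testable without a running Ollama instance.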

Critical finding (2026-05-01): Frontmatter must be stripped before embedding. The ~200 chars of structurally identical YAML frontmatter on every page produced 0.88 cosine similarity between any two pages’ frontmatter blocks alone. Full-page embeddings with frontmatter included produced an average pairwise similarity of 0.725 (minimum 0.467), meaning all 179 pages clustered in a tight ball and the 0.5 similarity threshold barely filtered anything.

After stripping frontmatter and re-embedding 74 pages, average similarity dropped to 0.684 (minimum 0.358). Unrelated pages (e.g. n8n vs OPNsense) dropped from 0.82 → 0.60, while genuinely related pages (n8n vs PostgreSQL) held at 0.76. The 0.5 similarity threshold now meaningfully discriminates.
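The page does not say how the average and minimum similarities were measured. One plausible way, shown here purely as an assumption, is to compute cosine similarity over every unordered pair of page vectors. The helper names `cosine_similarity` and `pairwise_stats` are hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pairwise_stats(vectors):
    """Return (average, minimum) cosine similarity over all unordered pairs."""
    sims = [cosine_similarity(a, b) for a, b in combinations(vectors, 2)]
    return sum(sims) / len(sims), min(sims)
```

With a retrieval threshold of 0.5, a corpus whose minimum pairwise similarity sits near the threshold filters almost nothing, which is why a lower minimum after stripping frontmatter matters.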

perform_full_integrate.py’s parse_page() strips frontmatter from content before embedding. embed_and_store() and search_similar() both work with body-only text. All pages were re-embedded on 2026-05-01.
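For illustration, frontmatter stripping of this kind can be sketched as below. This is not the actual `parse_page()` implementation, which is not shown on this page. It assumes frontmatter is a leading block delimited by `---` lines, and `strip_frontmatter` is a hypothetical name.

```python
def strip_frontmatter(text: str) -> str:
    """Return the page body with a leading YAML frontmatter block removed."""
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return text  # page has no frontmatter
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Drop everything up to the closing delimiter, plus blank lines after it.
            return "".join(lines[i + 1:]).lstrip("\n")
    return text  # unterminated block: leave the page untouched


page = "---\ntitle: n8n\ntags: [automation]\n---\n\n# n8n\nBody text.\n"
body = strip_frontmatter(page)  # only this body would be sent for embedding
```

Leaving unterminated blocks untouched is a conservative choice: a malformed page is embedded as-is rather than silently losing content.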

Shared mechanical layer pattern: perform_full_integrate.py provides importable primitives (parse_page, search_similar, embed_and_store, add_wikilink, delete_embedding, append_log, upsert_index_entry) that wiki-integrate, wiki-restructure, and wiki-lint skills all import rather than duplicating. LLM agents (Claude Code) handle judgment; the script handles mechanics.

Git Repository Structure

Each wiki is a separate git repository. This provides:

  • Clean version history per wiki
  • Independent rollback capability
  • No cross-contamination of personal and homelab commit history

Current repo: /opt/wiki/homelab/ is a git repo (remote TBD).

Agents

| Agent | Host | Role |
|---|---|---|
| Claude Code | wiki-llm (192.168.1.206) | Active session crystallization, complex queries, reasoning across pages |
| Gemma (gemma4:e2b on Celebrimbor) | Celebrimbor (192.168.2.192) | Scheduled transcript ingestion, bulk ingest, routine maintenance |

Gemma runs wiki-crystallize Mode 2 on a schedule to process new transcripts from /mnt/transcripts/. After processing, transcripts move to ingested/chats/.
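The move-after-processing step could be sketched as follows. This is an assumption about the mechanics, not the scheduled job itself: `archive_transcript` is a hypothetical helper, and the caller supplies the real source and destination directories.

```python
import shutil
from pathlib import Path


def archive_transcript(transcript: Path, ingested_dir: Path) -> Path:
    """Move one processed transcript into the ingested directory."""
    ingested_dir.mkdir(parents=True, exist_ok=True)
    destination = ingested_dir / transcript.name
    if destination.exists():
        # Refuse to overwrite a transcript that was already ingested.
        raise FileExistsError(f"already ingested: {destination}")
    shutil.move(str(transcript), str(destination))
    return destination
```

Moving the file only after crystallization succeeds means a failed run leaves the transcript in place for the next scheduled pass.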

Related pages: AI Infrastructure Overview, PostgreSQL, Pavilion (AI PC) Configuration, Wiki Pipeline Scripts, wiki-llm

Sources

Homelab AI - 2026-04-19 · ingested/chats/2026-04-19-31-Homelab AI.json
Homelab AI - 2026-04-20 · ingested/chats/2026-04-20-31-Homelab AI.json