Wiki Pipeline Scripts#

What Was Established#

Eight Python scripts in /opt/wiki/homelab/scripts/ implement the full wiki pipeline: file conversion, document ingestion, conversation crystallization (standard, DeepSeek, and Claude formats), shared LLM infrastructure, wiki health-checking, and knowledge-graph integration. All scripts were ported from the work wiki pipeline (itself developed 2026-04-21 → 2026-04-26) with homelab-specific infrastructure baked in.

crystallize.py (Claude format) uses a two-step LLM approach: gemma4:e2b cleans, qwen3.6:35b crystallizes. crystallize_deepseek.py skips gemma — JSON parsing is handled deterministically in Python (load_conversation + _clean_text), so only qwen is needed.
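The deterministic step is the point of the DeepSeek variant: no LLM pass is needed to normalize the export before crystallization. A minimal sketch of that idea, assuming a {"messages": [...]} export schema — the real load_conversation and _clean_text in crystallize_deepseek.py may differ:

```python
import json
import re

def load_conversation(path: str) -> list[dict]:
    """Load an exported conversation JSON and normalize each message.

    Sketch only: the {"messages": [{"role": ..., "content": ...}]}
    schema is an assumption, not the script's confirmed format.
    """
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return [
        {"role": m["role"], "content": _clean_text(m["content"])}
        for m in data.get("messages", [])
    ]

def _clean_text(text: str) -> str:
    """Deterministic cleanup: strip control characters, collapse whitespace."""
    text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", text)
    return re.sub(r"[ \t]+", " ", text).strip()
```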

wiki-llm#

What Was Established#

Dedicated Ubuntu 24.04 VM for hosting the homelab wiki system and Claude Code sessions. Chosen as a VM rather than an LXC for stronger isolation (wiki infrastructure handles credentials and embeddings).

Deployment#

Detail     Value
Hostname   wiki-llm
IP         192.168.1.206
VLAN       Gandalf (192.168.1.x)
OS         Ubuntu 24.04
CPU        2 cores
RAM        4 GB
Type       VM (Proxmox)
SSH user   iluvatar (sudo, PermitRootLogin no)

Access#

  • VS Code Remote SSH: Primary method for Claude Code sessions — VS Code connects to wiki-llm via remote SSH, giving Claude Code native filesystem access to /opt/wiki/homelab/
  • Direct SSH: ssh iluvatar@192.168.1.206

Wiki File Structure#

/opt/wiki/
  homelab/        ← git repo, all homelab wiki pages and skills
  work/           ← git repo, work/princelobel wiki + pipeline scripts
  projects/       ← git repo, projects wiki (planned)
  personal/       ← git repo, personal wiki (Gemma-only)
  raw-sources/    ← symlink to /mnt/wiki-nas/LLMWiki
  skills-reference/  ← clone of vanillaflava/llm-wiki-claude-skills (reference only)

Each wiki directory is an independent git repository (git init’d) for clean version history per namespace.
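A minimal sketch of that one-time setup, assuming the namespace list above (the actual directories may well have been initialized by hand):

```python
import subprocess
from pathlib import Path

# Namespaces mirror the tree above; raw-sources and skills-reference are
# excluded because they are a symlink and a reference clone, not wikis.
WIKI_ROOT = Path("/opt/wiki")
NAMESPACES = ["homelab", "work", "projects", "personal"]

for ns in NAMESPACES:
    repo = WIKI_ROOT / ns
    repo.mkdir(parents=True, exist_ok=True)
    if not (repo / ".git").exists():
        subprocess.run(["git", "init"], cwd=repo, check=True)
```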

Troubleshooting DeepSeek Language Switching#

What Was Established#

Local DeepSeek models may intermittently switch from English to Chinese mid-response. This is typically caused by training bias (heavy Chinese dataset influence), loss of context during long conversations, or mixed-language input prompts.

Key Decisions#

To maintain English-only responses, apply the following parameters and prompting strategies (a combined sketch follows the list):

  • Explicit Instruction: Always include a system-level or initial prompt instruction to respond exclusively in English.
  • Temperature Control: Use lower temperature settings (e.g., 0.3) to make the model more deterministic and less likely to drift.
  • Repetition Penalty: Implement a repetition_penalty (e.g., 1.2) to discourage the model from falling into repetitive patterns that might trigger language switching.
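A sketch applying all three mitigations at once, assuming a local Ollama server on its default port; the model tag and prompts are placeholders, and note that Ollama spells the parameter repeat_penalty:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/chat"

payload = {
    "model": "deepseek-r1:14b",  # placeholder tag, not a confirmed deployment
    "messages": [
        {"role": "system", "content": "Respond exclusively in English."},
        {"role": "user", "content": "Summarize today's maintenance notes."},
    ],
    "stream": False,
    "options": {
        "temperature": 0.3,     # more deterministic, less likely to drift
        "repeat_penalty": 1.2,  # Ollama's name for repetition_penalty
    },
}

resp = requests.post(OLLAMA_URL, json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```

OpenAI-compatible servers take the same idea, with temperature plus a frequency or presence penalty in the request body.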

Current Configuration#

System Message Pattern#

When using APIs or local inference engines that support system roles:
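For example (illustrative wording, not the verbatim configuration):

```python
# Illustrative system-role message pinning the model to English.
messages = [
    {
        "role": "system",
        "content": (
            "You must respond exclusively in English. Never switch "
            "languages mid-response, even if the input contains "
            "Chinese or other non-English text."
        ),
    },
    {"role": "user", "content": "..."},
]
```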