Your own private AI, powered by llmcli and llama-server.
The Private AI Setup – H100 is a complete, production‑ready stack that runs entirely on your own hardware — no internet connection required for inference. It combines the llmcli command‑line agent (tool‑calling, session memory, prompt scripting) with the lightning‑fast llama-server inference engine, all secured by a hardened OpenBSD host. Perfect for organisations that cannot risk sending internal data to third‑party APIs.
What you get
- llmcli agent — full‑featured CLI agent: multi‑turn conversations, tool calling, state management, and prompt injection
- llama-server — the canonical llama.cpp inference server, CPU‑ and GPU‑friendly, zero cloud dependency
- Hardened OpenBSD host configuration — our production pf.conf ruleset, relayd TLS termination, and minimal attack surface
- Installation playbook — step‑by‑step from bare metal to a working llmcli session in under two hours
Why self‑hosted?
- Absolute data sovereignty — documents, emails, and conversations never leave your network.
- GDPR‑ready — no data processor agreement, no US‑based API, no third‑party risk.
- Air‑gap capable — law firms, clinics, and accountancy practices can run it completely offline.
- One‑time cost — no per‑token billing, no monthly API subscription.
Technical summary
| Agent | llmcli v100 |
| Inference backend | llama-server (llama.cpp) |
| Host OS | OpenBSD 7.6+ (hardened pf, relayd) |
| Deployment format | Downloadable ZIP — scripts, config templates, full documentation |
| Support | Community IRC (#sophia) + optional paid support retainer |
After purchase, you’ll receive a secure download link and a setup guide that takes you from a fresh server to chatting with your own llmcli agent in under two hours.
Private AI Setup - LLM100
- Product Code: LLM100
- Availability: Development
Tags: LLM100

