This Was Supposed to Be Simple

Install OpenClaw. Point it at a local model. Let it run.

Reality had other plans.

While simple chat interaction worked, my quest for a true Jarvis-like wingman turned into configuration whack-a-mole, constant version changes, sandboxing pains, model quirks, and performance ceilings — a reminder of one of my former boss's many sayings …

"Software doesn't work by effing magic!"

What Is OpenClaw (And Why Run It Locally)

OpenClaw is an autonomous agent framework that connects LLMs to real-world tools and communication channels. It gets things done by defining SKILLS that leverage modern LLMs' ability to call "tools" — e.g. read my email inbox, determine importance, and prune accordingly.

I chose to run it locally for three reasons:

Privacy: No API keys. No prompt data leaving my machine.

Cost: Agent loops can burn through paid API tokens fast. Local inference makes the token cost predictable: zero.

Learning: I didn't just want to use AI — I wanted to understand it.

But there's something bigger driving me. Years ago, I worked at iGovTT and saw how the default government position was to rely heavily on external infrastructure and expertise for all things tech. IBM, Microsoft, and Cisco all landed big, long-running contracts at the taxpayer's expense. Some of that is necessary, but there are loads of areas where we can do great things ourselves. One goal of this experiment was to prove that: a fully autonomous AI stack, running entirely on my hardware.

What Was Such a Pain?

1. Configuration Friction

Environment variables, permission allow-lists, model configs, context window sizes, structured output formatting — all required careful alignment. Cloud APIs smooth over these edges. Local stacks expose them.
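Much of that friction boils down to settings that must agree with each other before the agent even starts. As a minimal sketch (the keys here, like `context_window` and `max_output_tokens`, are illustrative and not OpenClaw's actual config schema), a pre-flight sanity check can surface misalignments up front instead of letting them fail mid-run:

```python
# Hypothetical pre-flight check for a local agent config.
# The config keys and model-limit table are placeholders, not
# OpenClaw's real schema -- the point is the alignment check itself.

def check_config(config: dict, model_limits: dict) -> list[str]:
    """Return human-readable problems; an empty list means the config looks sane."""
    problems = []
    model = config.get("model")
    if model not in model_limits:
        problems.append(f"unknown model: {model!r}")
        return problems  # nothing else is checkable without a limit
    limit = model_limits[model]
    ctx = config.get("context_window", 0)
    out = config.get("max_output_tokens", 0)
    if ctx > limit:
        problems.append(f"context_window {ctx} exceeds model limit {limit}")
    if out >= ctx:
        problems.append("max_output_tokens leaves no room for the prompt")
    return problems
```

Running something like this at startup turns a vague "the agent hangs" into a concrete "your context window is bigger than the model supports."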

2. Weaker Local Models

Smaller models sometimes struggle with multi-step reasoning compared to frontier cloud models. Structured outputs become brittle. Tool calls occasionally fail silently, leaving you chasing ghosts in the machine for hours.
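The antidote to silent failures is defensive parsing: never hand a model's raw output to a tool without validating it first, and always produce an error you can log (or feed back to the model as a retry prompt). A minimal sketch, with a hypothetical tool-call shape (a JSON object with `name` and `arguments`), not any framework's actual wire format:

```python
import json

def parse_tool_call(raw: str, known_tools: set[str]):
    """Defensively parse a model's tool-call output.

    Returns (call, error); exactly one is None. Instead of letting a
    malformed reply vanish silently, every failure mode yields a
    message you can log or send back to the model.
    """
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc}"
    if not isinstance(call, dict) or "name" not in call:
        return None, "missing 'name' field"
    if call["name"] not in known_tools:
        return None, f"unknown tool: {call['name']!r}"
    if not isinstance(call.get("arguments", {}), dict):
        return None, "'arguments' must be a JSON object"
    return call, None
```

With smaller local models this kind of check fires often enough that the error messages become your main debugging signal.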

3. An Ever-Evolving System

I wasn't integrating two stable systems — I was integrating several moving targets. Agent frameworks evolve weekly. Models are numerous, with some better suited for different scenarios (OCR, TTS, STT). Even small updates can drastically shift behavior.

What I Learned

Cloud LLMs feel magical. You pay, plug in, and most of the time they just work. They're more robust at handling messy tool responses — JSON that isn't quite right, XML with quirks — they work around it.

Local LLMs feel mechanical. And mechanical systems teach you more. Environment variables and config settings can change the way a model or even the entire inference engine behaves. Running locally forces you to confront hardware ceilings, model capability boundaries, and the limits of your own knowledge.

I walked away with better mental models — of inference, of structured prompting, and of the gap between expectation and reality. That understanding alone was worth the pain.

Advice If You're Attempting This

If you just want to be productive — don't. Pay your $20+/month for a frontier model, host it on a VPS, plug in your API keys, configure Telegram, and start talking.

If you absolutely want the full on-prem stack:

  • Start strong — use a capable model first before optimizing for size (gemma4 and qwen3.5 are great)
  • Use AI to debug AI — I use Claude Code from the terminal directly on my OpenClaw machine to read logs, diagnose issues, and write test harnesses
  • Expect instability — don't go in thinking it'll just work
  • Go bare-metal — install directly on a machine where possible, like that old server lying around
  • Resist the upgrade urge — OpenClaw moves at breakneck speed with multiple releases per month. Every update risks breaking your entire stack
  • Start small — get your TUI or web UI working, add Telegram or WhatsApp, then expand gradually
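One concrete way to resist the upgrade urge is to pin the framework to a known-good release instead of tracking the latest. A sketch only: the package name and version below are placeholders, not OpenClaw's real ones, and your install method may differ.

```shell
# Placeholder package name and version -- substitute whatever release
# actually works on your stack, and stay on it deliberately.
npm install -g openclaw@1.2.3

# Confirm the pinned version before restarting the agent.
npm ls -g openclaw
```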

Would I Do It Again?

The honest answer: it depends on what you're optimizing for.

If you want a polished AI assistant today, the big players have caught up to what OpenClaw represents. OpenAI hired OpenClaw's creator. Claude launched Co-Work. Perplexity has Computer. Google has Project Mariner. There are loads of alternatives now.

But if you value privacy, ownership, and understanding — yes, it's worth it. Fair warning though: there were times I felt more like DevOps than a user. I have real work I want the assistant to get done, and being your own tech support while trying to be productive gets old fast.

That said, OpenClaw was built to put control in your hands. If you're going that route, you have to get your hands dirty. And what you learn in the process — about AI, about systems, about your own limits — that's the real payoff.