Vitalik Buterin proposes a locally run AI architecture, emphasizing privacy, security, and self-sovereignty, and warning about the potential risks of AI agents.
On April 2, Ethereum founder Vitalik Buterin published a long post on his personal website describing the AI working environment he designed around privacy, security, and self-sovereignty: all LLM inference runs locally, all files are stored locally, and everything is fully sandboxed. The setup deliberately avoids cloud models and external APIs.
At the very beginning of the article, he warns: "Please do not directly copy the tools and techniques described in this article and assume they are safe. This is just a starting point, not a description of a finished product."
Vitalik points out that earlier this year, AI completed an important shift from "chatbots" to "agents": you are no longer just asking questions, but handing off tasks so the AI can think for a long time and call hundreds of tools to carry them out. He cites OpenClaw (currently the fastest-growing repository in GitHub history) and points to multiple security issues documented by researchers.
Vitalik emphasizes that his starting point on privacy differs from that of traditional cybersecurity researchers: "I come from a position of deep fear of feeding my entire personal life to a cloud AI. Just as end-to-end encryption and local-first software finally became mainstream, we may be taking ten steps back."
He laid out a clear framework of security goals.
Vitalik tested three local inference hardware setups, mainly using the Qwen3.5:35B model together with llama-server and llama-swap:
| Hardware | Qwen3.5 35B (tokens/sec) | Qwen3.5 122B (tokens/sec) |
|---|---|---|
| NVIDIA 5090 laptop (24GB VRAM) | 90 | cannot run |
| AMD Ryzen AI Max Pro (128GB unified memory, Vulkan) | 51 | 18 |
| DGX Spark (128GB) | 60 | 22 |
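As a rough illustration of this kind of setup, a minimal llama-swap configuration that spawns llama-server on demand might look like the following. The model names, file paths, and flags are hypothetical; the post does not include his actual config, so consult the llama-swap README for the real schema:

```yaml
# Hypothetical llama-swap config: one entry per local model.
# llama-swap starts the matching llama-server process on demand
# and swaps models when a request names a different one.
models:
  "qwen-35b":
    # ${PORT} is filled in by llama-swap at launch time
    cmd: llama-server --port ${PORT} -m /models/qwen3.5-35b.gguf -ngl 99
  "qwen-122b":
    cmd: llama-server --port ${PORT} -m /models/qwen3.5-122b.gguf
```

Tools then talk to llama-swap's single endpoint as if it were an OpenAI-compatible API, while everything stays on the local machine.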
His conclusion: below 50 tokens/sec is too slow, and 90 tokens/sec is ideal. The NVIDIA 5090 laptop gives the smoothest experience; AMD still has more edge-case issues but is expected to improve. High-end MacBooks are also a valid option, though he has not tried them himself.
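To put those thresholds in perspective, a back-of-the-envelope sketch of how long a long agent reply takes to stream at each rate. The 2000-token reply length is an assumption for illustration, not a figure from the post:

```python
# Rough feel for the tokens/sec thresholds: seconds to stream a
# reply of a given length at a steady decode rate.

def stream_seconds(tokens: int, tokens_per_sec: float) -> float:
    """Seconds to generate `tokens` at a constant decode rate."""
    return tokens / tokens_per_sec

reply = 2000  # assumed length of a long agentic response
for rate in (18, 50, 90):  # 122B on AMD, his "too slow" floor, his ideal
    print(f"{rate:3d} tok/s -> {stream_seconds(reply, rate):6.1f} s")
```

At 18 tokens/sec a long reply takes nearly two minutes, versus well under half a minute at 90, which makes the "below 50 is too slow" cutoff concrete.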
About the DGX Spark, he puts it bluntly: "It's described as a 'desktop AI supercomputer,' but in reality its tokens/sec is lower than a better laptop GPU, and you also have to deal with extra details like getting the network connection working. That's pretty bad." His advice: "If you can't afford a high-end laptop, you can pool with friends to buy a sufficiently powerful machine, place it somewhere with a fixed IP, and have everyone use remote connections."
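The "shared machine with a fixed IP" idea can be sketched with a standard SSH port forward, assuming llama-server listens on port 8080 on the shared box. The hostname, username, and port here are placeholders, and the post does not specify how the remote connection should be made:

```shell
# Forward local port 8080 to the shared inference machine, so tools
# pointed at http://localhost:8080 transparently reach it over SSH.
ssh -N -L 8080:localhost:8080 user@shared-inference-box
```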
Vitalik's article forms an interesting parallel with the Claude Code security discussion released the same day: as AI agents enter everyday developer workflows, security issues are moving from theoretical risk to real threat.
His core message is clear: as AI tools grow ever more powerful and gain broader access to your personal data and system permissions, "local-first, sandboxed, and minimal trust" is not paranoia but a rational starting point.