
Out of the browser: on the power of interfaces for agentic AI


When OpenAI launched ChatGPT on November 30, 2022, it was a text box in a browser. That conversational call-and-response interface was a big part of what made it a hit—the fastest-growing technology product in history, reaching 100 million users in two months. But that’s not what this post is about.

This post is about why “AI coding in the terminal” is such a big deal, and why it’s arguably more about interfaces than about models.

The browser as a security marvel

[Illustration: a ChatGPT conversation at chat.openai.com. User: “Can you edit my files?” ChatGPT: “I cannot access your files.”]

Modern web browsers are incredible feats of sandboxing engineering. When you visit a website, that site’s code runs in a carefully isolated environment. It can’t read your files. It can’t run programs on your computer. It can’t even reliably[1] visit other websites on your behalf. This isolation is achieved through a multi-process architecture where untrusted web content runs in restricted processes that can only communicate with the rest of your system through tightly controlled channels.

This sandboxing is why you can visit sketchy websites and (mostly) not worry about them stealing your passwords or deleting your files. What happens in the browser stays in the browser.

But here’s the thing: that same sandboxing that protects you also constrains what ChatGPT (or any browser-based AI) can actually do. It can generate text. It can show you images. It can even run some code (JavaScript) in that sandboxed environment. But it can’t create files on your computer, run your test suite, commit code to git, or do any of the thousand other things you do when you’re actually building software.

Enter the terminal

[Illustration: a terminal session. claude "fix the bug in auth.py" reads auth.py, finds the issue on line 42, edits the file, runs pytest until all tests pass, and commits the changes.]

The terminal is almost the anti-browser. It’s a text-only interface to doing everything on your computer. Creating, reading, editing, and deleting files (or entire hard drives). Running programs, installing software, accessing the network (and potentially exfiltrating sensitive documents). The terminal is powerful—and dangerous—precisely because it has no sandbox.[2]

This is why software developers use it. Building software involves a constant cycle of editing files, running compilers, executing tests, managing version control, and deploying to servers. All of these are terminal operations. And crucially, the people using the terminal are expected to know what they’re doing and to take responsibility for the commands they run.

Here’s the thing that makes the terminal particularly interesting for LLMs: it’s all text. Both the inputs (commands with a specific syntax) and the outputs (error messages, logs, success confirmations) are designed to be read by humans—specifically, programmers who need to understand what’s happening and debug when things go wrong.

LLMs are really good at text. They can understand error messages, reason about what went wrong, and generate the next command to try. And because the terminal runs commands quickly, you can put an LLM in a loop: try something, see the result, adjust, repeat. This is what Anthropic’s whitepaper on building effective agents calls tightening the feedback loop—letting the model iterate without waiting for a human to approve every step.
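
To make that loop concrete, here is a minimal sketch in Python. This is not how Claude Code or any other tool is actually implemented; next_command is a hypothetical stand-in for a real model call. It just shows the shape of the idea: run a command, capture the text, hand it back to the model, go again.

# A minimal sketch of "an LLM in a loop", not how any particular tool works.
# next_command() is a hypothetical stand-in for a real model API call: given
# the transcript so far, it returns the next shell command, or None when done.
import subprocess

def next_command(transcript: str) -> str | None:
    raise NotImplementedError("replace with a call to your model of choice")

def agent_loop(task: str, max_steps: int = 10) -> str:
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):          # cap the loop so it can't run forever
        command = next_command(transcript)
        if command is None:             # the model thinks it's finished
            break
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=120
        )
        # Everything the model needs to see next comes back as plain text:
        # stdout, stderr, compiler errors, failing tests, the lot.
        transcript += f"\n$ {command}\n{result.stdout}{result.stderr}"
    return transcript

The real tools layer a lot on top of this (permission prompts, richer tool definitions, context management), but the core is the same tight text-in, text-out cycle.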

The terminal-native AI boom

When Anthropic announced Claude Code in February 2025, the model update wasn’t a quantum leap—Claude 3.7 Sonnet came with it, but Sonnet 3.5 was already pretty capable. What they released was a new interface: an agentic command-line tool that could search and read code, edit files, write and run tests, and commit to git. Same models[3], different interface, dramatically more useful for actual software development. If you don’t believe me about how big a deal this is, check the socials.

And everyone else noticed. OpenAI launched Codex CLI in April 2025. Google followed with Gemini CLI. The open source community jumped in too: OpenCode now has over 95,000 GitHub stars, Pi (built by Armin Ronacher and Mario Zechner, powering OpenClaw) takes a minimal-agent approach, and Claudia wraps Claude Code in a GUI for those who want the power without the terminal aesthetic.

This is an innovation in interfaces, not just models.

The obvious risk

Of course, all this power comes with risk. An LLM running in your terminal can do anything you can do in your terminal. That includes stealing your SSH keys, reading your .env files, or—as Johann Rehberger points out—wiping your production database.

So far, the frontier models have been pretty well-behaved[4]. But Rehberger draws a sobering parallel to the Space Shuttle Challenger disaster and the concept of “normalisation of deviance”—when repeated exposure to risky behaviour without negative consequences leads people to accept that risk as normal.

As Simon Willison notes:

In the absence of any headline-grabbing examples of prompt injection vulnerabilities causing real economic harm, will anyone care?

The incentives for speed and automation are strong. The incentives for security are… well, they’re there in principle, but it’s easy to forget why the guardrails existed in the first place.

It’s not the models, it’s the interface

I’ve written before about how agentic AI is fundamentally about giving LLMs tools—stones to throw, in the “sticks and stones” sense. But the terminal-native AI wave has clarified something for me: the power of agentic AI comes from the interface, not just the tools themselves.

The browser sandbox was always a security feature, not a limitation of the underlying AI. ChatGPT could always tell you to run rm -rf /—it just couldn’t do it itself. By moving to the terminal, we haven’t made the models smarter; we’ve given them permission to actually do things.
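
Put in code terms, the difference between the two interfaces is roughly whether the model's output stays a string or gets executed. A deliberately simplified sketch, with suggestion standing in for whatever text the model produced:

# Deliberately simplified: the same model output, two different interfaces.
import subprocess

suggestion = "rm -rf build/"  # stand-in for whatever text the model produced

# Browser-style interface: the suggestion is just text on your screen.
print(f"The model suggests running: {suggestion}")

# Terminal-style interface: the suggestion becomes an action on your machine.
subprocess.run(suggestion, shell=True)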

That’s both the promise and the peril. The models can now iterate without asking and do real work in tight feedback loops. But they can also make real mistakes with real consequences, in ways that browser-based AI never could.

If you’re going to use these tools—and if you’re a software developer in 2026, you probably should—just remember that you’re not using a smarter AI. You’re using the same AI with the safety guards removed. Plan accordingly.


  1. CORS is a whole thing, but broadly speaking browsers are designed to prevent cross-origin resource shenanigans. ↩︎

  2. Yes, you can run things in dev containers or VMs to limit the blast radius. But in practice, most developers (myself included) run these tools directly on their machines. See the normalisation of deviance discussion below. ↩︎

  3. Ok, the models are getting better all the time. But I think the leap out of the browser is the main thing. ↩︎

  4. Even though many developers run these tools with automatic approval for all actions (colloquially known as “YOLO mode”). I’m one of them 🙃 ↩︎