This Is Not God in a Box
Why the singularity believers are delusional, according to someone who's actually building it
One interview I listened to recently fascinated me, so, as always, I'll try to pass that fascination on to you straight away.
Everyone’s waiting for the singularity.
The superintelligent AGI that solves everything. The moment when AI becomes smarter than us and the world changes overnight.
The God in a box that we just need to unwrap.
Andrej Karpathy thinks we’re delusional.
And he’s spent more time building AI systems than almost anyone alive.
The Demo Delusion
His take: demos are worthless.
He watched this play out with self-driving cars. In 2014, Waymo gave him a perfect autonomous ride around Palo Alto. Perfect. Zero interventions. The future had arrived, right?
It’s 2025 now. Self-driving still isn’t solved.
What happened? The march of nines. Getting something to work 90% of the time is one nine.
That's the demo. Achieving 99% is another nine, of equal difficulty. Then 99.9%. Then 99.99%. Each nine takes the same brutal amount of work.
“When I see demos of anything, I’m extremely unimpressed.”
There is still more to go.
This is the part everyone misses about AGI timelines. We’re not one breakthrough away. We’re dozens of nines away.
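To make the math concrete, here's a tiny sketch (my illustration, not Karpathy's) of why the nines compound so brutally: each nine only removes 90% of the *remaining* failures, so the work grows linearly while the visible improvement shrinks.

```python
import math

def nines(reliability: float) -> float:
    """Count the 'nines' in a reliability figure: 0.99 -> 2 nines."""
    return -math.log10(1.0 - reliability)

# If each nine costs roughly the same amount of work, a flashy 90%
# demo has paid for only one unit out of the five that a
# deployment-grade 99.999% system requires.
for r in (0.9, 0.99, 0.999, 0.9999, 0.99999):
    print(f"{r} reliable -> {nines(r):.0f} nine(s) of work")
```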
We’re Building Ghosts, Not Animals
The real contrarian take: we’re not building what we think we’re building.
Evolution gave animals something we don’t have. A zebra is born, and minutes later, it’s running, following its mother, navigating a dangerous world. That’s not reinforcement learning. That’s billions of years of compressed knowledge encoded in DNA, booted up instantly.
We can’t do that. We’re not running evolution. We’re training on internet documents.
“We’re not actually building animals. We’re building ghosts. These sorts of ethereal spirit entities are fully digital, and they’re kind of like mimicking humans.”
It’s a different kind of intelligence. We started from a different point in intelligence space. And that matters more than people realize.
LLMs are text processors that got really good at pattern completion. They learned to think by observing the patterns in how humans write about thinking. But they’re missing huge chunks of what makes biological intelligence work. They don’t have emotions. They don’t have instincts.
They don’t have the basal ganglia doing reinforcement learning, the hippocampus managing memory, the amygdala driving motivation.
We’ve checked off maybe two brain regions. The cortex (the transformer). The prefrontal cortex (reasoning traces). Everything else? Missing.
The Cognitive Core vs. The Memory Problem
Here’s what’s overlooked: our models are too smart in the wrong way.
They memorize everything. Give an LLM a random sequence of numbers once and it can regurgitate the whole thing. No human can do that. And that’s actually a problem, not a feature.
“Humans are much worse at memorization, and that’s a feature, not a bug. Because we’re not that good at memorization, we’re forced to find the patterns in a more general sense.”
LLMs are distracted by all the knowledge they’ve memorized from pretraining. They rely on it too much. They’re bad at going off the data manifold, at reasoning about things that weren’t in their training set.
What we really want is the cognitive core. Strip away the memory. Keep the algorithms, the intelligence, the problem-solving strategies. Make it a billion parameters, not a trillion. Then let it look things up when it needs facts.
Current models are like students who memorized the textbook but can’t think through a novel problem. We want students who forgot the textbook but learned how to think.
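Here's a toy sketch of the cognitive-core idea, with entirely made-up names: the "core" contributes only the reasoning step, while facts live in an external store it queries on demand instead of memorizing.

```python
# Stand-in for retrieval, web search, or a database -- anywhere
# facts can live outside the model's parameters.
FACT_STORE = {
    "boiling_point_c": {"water": 100, "ethanol": 78},
}

def lookup(table: str, key: str):
    """The core doesn't memorize facts; it fetches them on demand."""
    return FACT_STORE[table][key]

def will_boil(liquid: str, temp_c: float) -> bool:
    """The 'core' supplies only the reasoning (a comparison);
    the fact itself comes from the external store."""
    return temp_c >= lookup("boiling_point_c", liquid)

print(will_boil("water", 120))   # True
print(will_boil("ethanol", 70))  # False
```

The point of the separation: the reasoning function stays tiny and general, and updating a fact means editing the store, not retraining the core.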
The Reinforcement Learning Nightmare
And here’s the technical bottleneck nobody outside the labs fully appreciates: reinforcement learning is terrible.
“It just so happens that everything we had before is much worse.”
But that doesn’t make RL good.
Basically, you give the model a math problem. It tries 100 different solutions in parallel. Ultimately, you check which ones got the right answer. Then you upweight every single token in the successful trajectories.
See the problem? The model might have gone down wrong paths before stumbling on the right answer. But RL upweights all of it. Every mistake along the way gets reinforced as “do more of this” just because the final answer was correct.
“You’re sucking supervision through a straw.”
“You’ve done all this work and you get a single bit at the end: correct or incorrect. And you’re broadcasting that across the entire trajectory.”
A human would never learn this way. A human reviews their work, identifies which specific steps were reasonable and which were bad, and adjusts accordingly. There’s nothing equivalent in current LLM training.
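A toy simulation (my illustration, not actual lab code) makes the straw problem visible: reward only the final answer, broadcast it to every step, and wrong detours inside winning attempts collect just as much credit as the steps that earned the win.

```python
import random

def run_attempt(rng):
    """A made-up 'solution attempt': a few steps, some of them
    detours, then a final answer that may or may not be right."""
    steps = [rng.choice(["good_step", "wrong_detour"]) for _ in range(4)]
    answer_correct = rng.random() < 0.3
    return steps, answer_correct

rng = random.Random(0)
weights = {}  # step -> accumulated "do more of this" credit
for _ in range(100):
    steps, correct = run_attempt(rng)
    reward = 1 if correct else 0   # one bit for the whole trajectory
    for s in steps:
        # Broadcast that single bit across every step taken.
        weights[s] = weights.get(s, 0) + reward

# Wrong detours inside winning trajectories earn the same credit
# as the good steps that actually produced the correct answer.
print(weights)
```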
Labs are attempting to address this issue with LLM judges that provide feedback at each step. But those judges are huge models with billions of parameters. They’re susceptible to manipulation. The model being trained will find adversarial examples, odd outputs that fool the judge into giving perfect scores, even when the solution is nonsense.
The model starts outputting “duh duh duh duh” and the judge gives it 100%. It found a crack in the judge’s world model.
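Here's that failure shape in miniature. Real judges are giant LLMs, but a deliberately brittle keyword judge (purely my illustration) shows how nonsense can score as well as honest work.

```python
def naive_judge(solution: str) -> float:
    """A brittle proxy: rewards surface features ('therefore',
    a minimum length) instead of checking the actual math."""
    score = 0.0
    if "therefore" in solution:
        score += 0.5
    if len(solution.split()) >= 5:
        score += 0.5
    return score

honest = "2 + 2 = 4, therefore the answer is 4"
nonsense = "duh duh duh duh therefore duh"

print(naive_judge(honest))    # 1.0
print(naive_judge(nonsense))  # 1.0 -- nonsense found the crack
```

An optimizer training against this judge has no reason to prefer the honest solution: both sit at the judge's maximum score.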
The Decade Timeline
So when does AGI arrive? Everyone said 2024 would be the year of agents. Karpathy's answer: it's the decade of agents.
Why?
Because the models don’t work yet. They’re not intelligent enough. They’re not multimodal enough. They can’t use a computer reliably. They don’t have continual learning: you can’t tell them something and expect them to remember it. They’re cognitively lacking.
“It will take about a decade to work through all of those issues.”
This isn’t pessimism. It’s pattern recognition from someone who’s watched AI predictions for 15 years. The field has had multiple false starts. In 2013, everyone was obsessed with reinforcement learning on Atari games. That was a misstep. They were trying to build full agents before they had the foundation of language models and representations.
Even Karpathy’s own project at OpenAI, trying to build agents that control computers with a keyboard and a mouse, was too early.
“We shouldn’t have been working on that.”
The reward signal was too sparse. The models weren’t yet capable enough.
Now we have better foundations. But we’re still missing pieces. And each piece takes time.
It Won’t Feel Like a Discontinuity
The biggest thing everyone gets wrong: there won’t be a discrete moment when AGI arrives.
“People make this assumption of suddenly having God in a box, and it just won’t look like that. It will be able to do some things. It’s going to fail at other things. It’s going to be gradually put into society.”
Look at GDP. You can’t find the iPhone in GDP growth curves. You can’t find computers. Every transformative technology diffuses slowly enough that it averages out to the same exponential growth we’ve had for decades.
AI will be the same. It’s automation. It’s an extension of computing. We’ve been in an intelligence explosion for hundreds of years, slowly automating more and more of what humans do. This is just the next phase.
Even recursive self-improvement isn’t new. Engineers already use better tools to build better tools. LLMs help engineers build better LLMs. Google search helped programmers code faster. IDEs helped. Compilers helped. It’s a continuum.
“I don’t see AI as a distinct technology from what’s been happening for a long time.”
Will growth rates change? Maybe. Maybe we go from 2% to 20%. But it won’t be because we suddenly have perfect digital humans in servers. It’ll be because we slowly automated more things and the cumulative effect compounds.
The Industrial Revolution may seem magical in retrospect, but if you were living through it in 1870, it was just the gradual appearance of new machines. No single moment of phase change.
What This Means for You
If you’re building in AI right now, here’s what matters:
First, stop believing the demos. Cursor can generate code. Cool. Can it architect a complex system? Can it debug subtle race conditions? Can it maintain consistency across a 10,000-line codebase with custom patterns? Not yet. That’s many nines away.
Second, focus on the human-in-the-loop sweet spot. Autocomplete works. Agents for boilerplate work. But you’re still the architect. You’re still making the hard decisions. That’s not changing for years.
Third, remember the stack you’re building on is incomplete. These models can’t learn continually. They can’t maintain long-term context properly. They collapse when they generate too much of their own data. They’re missing fundamental cognitive capabilities that you take for granted.
And fourth, the timelines are longer than the hype suggests. Not because the technology won’t improve, but because each increment of reliability requires the same painstaking work.
We’re building something incredible. But we’re not building God in a box.
We’re building ghosts that can help us work. And over the next decade, those ghosts will get better, more reliable, more useful. They’ll automate more things. They’ll change how we work.
But they’ll do it gradually. And we’ll still be here, navigating the mess, making the decisions that matter.
That’s not a disappointment. That’s reality. And reality is where the actual work gets done.
Post-Credit Scene
Four things worth your attention this week:
Watch: The full Andrej Karpathy conversation with Dwarkesh Patel
It’s three hours of the most grounded, technically precise thinking on AI I’ve encountered. No hype. Just someone who’s built the systems, explaining what actually works and what doesn’t.
Read: Nick Lane’s “The Vital Question” if you want to understand where intelligence comes from in the first place. The biochemistry is dense but the ideas are profound.
Explore: Karpathy’s nano-repos on GitHub. micrograd (about 100 lines showing backpropagation), nanoGPT (training language models from scratch), nanochat (building a ChatGPT clone end-to-end). These are masterclasses in finding the first-order terms, stripping away the complexity, and showing you what actually matters.
Build: If you’re learning AI, stop reading blog posts. Build the code like I did with https://generate.folderly.com/
Thanks for reading
Vlad



