In Today's Issue:

🤖 What recursive self-improvement actually is, and what it is not

📜 A short history of an old and dangerous idea

🏢 Why Anthropic and OpenAI are suddenly talking about it

📊 The evidence: benchmarks, coding agents, and a research project run by AI

🛑 The limits: reward hacking, model collapse, and the missing ingredient

🧠 The philosophy: when a tool becomes a participant in its own making

Dear Readers,

A note from us: University students receive our Saturday Deepdive for free when they register with their university email address at: https://getsuperintel.com/plus-whitelist

On June 4, 2026, Anthropic published a piece with a title that reads like a line from a science fiction screenplay: "When AI builds itself." The claim inside was more sober, and in some ways more unsettling, than the headline. For most of the field's history, the company wrote, humans drove every step of AI's development cycle. That is no longer true. As of May 2026, more than 80% of the code merged into Anthropic's own codebase was written by Claude, up from low single digits before the first Claude Code preview shipped in February 2025 (Anthropic, 06/04/2026). The typical engineer now merges roughly eight times as much code per day as in 2024. Put plainly, the systems are now doing a large share of the work that builds the next systems.

That sentence is easy to misread. It does not mean a model woke up, inspected its own weights, and rewrote itself overnight. It means something quieter and more structural: the work of making AI better, the coding, the debugging, the experiments, the evaluations, is itself becoming automated, and the systems doing that work are the same systems being improved. Researchers have a name for the endpoint of this trend. They call it recursive self-improvement, and it describes a future in which an AI system can autonomously design, train, and refine its own successor. Anthropic was careful to add the obvious caveat. We are not there yet, the authors stressed, and it is not inevitable. But it could come sooner than most institutions are prepared for (Anthropic, 06/04/2026).

The idea is old. The British mathematician I. J. Good sketched it in 1965, when he imagined an "ultraintelligent machine" that could design ever better machines, triggering what he called an intelligence explosion (Good, 1965). For sixty years the concept lived mostly in philosophy seminars and science fiction. What changed in the last eighteen months is that it migrated out of speculation and into the quarterly metrics of the companies actually building these systems. OpenAI now tracks "AI Self-improvement" as one of three formal risk categories in its Preparedness Framework, alongside biological and cyber threats (OpenAI, 04/15/2025). When two of the leading labs start measuring the same thing and writing safety policies around it, the question stops being whether the idea is interesting and becomes whether it is real.

So that is the question worth sitting with. We are clearly somewhere on a path. The harder problem is figuring out where exactly, how steep the road ahead is, and what it would mean if the center of gravity in AI research drifted, slowly or suddenly, from the people who build these systems toward the systems themselves.

All the best,

Kim Isenberg

What Recursive Self-Improvement Actually Is, and What It Is Not

The three gradations of recursive self-improvement, from the broad version already in use to the hard version that has not been demonstrated. (Source: Superintelligence)

"This isn't the same thing as an AI system completely autonomously updating its own code, but nevertheless this is a larval version of recursive self-improvement." Sam Altman, "The Gentle Singularity," June 2025

The phrase "recursive self-improvement" gets thrown around loosely, so it helps to be precise. Recursive self-improvement is not the same as a model simply getting better. Almost everything in machine learning makes a model better in some sense. Training adjusts billions of parameters until predictions sharpen. Fine-tuning nudges a finished model toward a narrower task, like legal drafting or medical triage. Tool use lets a model call a calculator or a search engine. Agents chain those tools together to book a flight or refactor a file. None of that is recursive self-improvement, because none of it changes the process by which the next, more capable version of the model gets built.

The distinction is between improving the object and improving the thing that produces the object. A model that writes a sharper marketing email has improved an output. A model that writes a tool which makes future training runs faster, filters training data more cleverly, or catches a whole class of bugs in the codebase that trains the next model, has reached into the machinery of its own development. That second kind of improvement is the one that can compound, because each turn of the wheel makes the next turn easier. The "recursive" in the name points at exactly that: a system getting better at the act of getting better.
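For readers who think in code, the difference between improving the object and improving the process can be sketched as a toy simulation. This is an illustration only, not a model of any real lab's pipeline: the function name `simulate`, the parameters `output_gain` and `process_gain`, and every number below are assumptions chosen to make the shape of the curves visible.

```python
def simulate(generations, output_gain, process_gain):
    """Toy model of capability across model generations (illustrative only).

    output_gain:  hypothetical fixed amount added to capability each generation.
    process_gain: hypothetical fractional increase in that per-generation
                  amount, standing in for improvements to the development
                  process itself. Zero means the process never improves.
    """
    capability = 1.0
    step = output_gain
    history = [capability]
    for _ in range(generations):
        capability += step            # this generation's improvement
        step *= 1.0 + process_gain    # the next improvement gets easier
        history.append(capability)
    return history


# Same starting point and same initial gain; only the process differs.
linear = simulate(generations=10, output_gain=0.1, process_gain=0.0)
compounding = simulate(generations=10, output_gain=0.1, process_gain=0.2)

for gen, (a, b) in enumerate(zip(linear, compounding)):
    print(f"gen {gen:2d}  fixed process: {a:.2f}  improving process: {b:.2f}")
```

With `process_gain` set to zero, capability rises by the same amount every generation. With any positive value, each step is larger than the last, which is the compounding the paragraph above describes. The sketch says nothing about whether real systems behave this way; it only shows why the two kinds of improvement are worth keeping apart.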

It is also worth separating recursive self-improvement from AutoML, the now-mature practice of automating parts of the model pipeline, such as architecture search or hyperparameter tuning. AutoML automates a slice of the work within a fixed human-designed frame. The harder concept points at something broader: an AI system participating across the whole research and development loop, from forming a hypothesis to designing an experiment, running it, reading the results, and deciding what to try next. As IEEE Spectrum put it in a recent survey of the field, researchers have spent decades assembling the pieces, from evolutionary algorithms to AutoML to large language models that write the code for their own successors, and the elements are now converging (IEEE Spectrum, 05/2026).

Three gradations are useful to keep in mind for the rest of this piece. The broad version is any AI-assisted improvement of AI development, which is already everywhere. The middle version, and the one that matters most in 2026, is AI automating significant chunks of AI research and engineering, where that automation feeds the next model. That middle version has now been demonstrated in bounded engineering and outcome-gradable research settings, but not across the full frontier-model pipeline. The narrow, hard version is a system that designs and trains its successor with little human input. The hard version has not yet been publicly demonstrated, and most of the confusion in the current debate comes from treating all three as if they were the same thing.
