
Dear Readers,

Open source just landed a haymaker on the frontier labs. Moonshot AI's Kimi K2.6 is now outscoring GPT-5.4 and Claude Opus 4.6 on SWE-Bench Pro, and it is fully open source, which means the comfortable moat that closed-model companies have relied on is eroding faster than anyone predicted.

But the giants are not sitting still: Google has assembled a dedicated strike team inside DeepMind to close the coding gap with Anthropic, OpenAI is preparing a new image model that could blur the line between AI-generated and real photography, and Alibaba's Qwen team just dropped a preview that signals China's AI ambitions are accelerating on multiple fronts. Meanwhile, Anthropic published fascinating research into whether Claude actually "feels" something when it talks to you, and Jeff Bezos managed to land a reusable rocket booster on his third try, only to lose a satellite worth hundreds of millions in the same flight.

Today's issue covers the full spectrum, from the code wars reshaping the industry to the Moon race heating up in the Atlantic, so grab your coffee and scroll on.

All the best,

🤖 Qwen Preview Pushes Model Forward

The Qwen team has introduced Qwen3.6-Max-Preview, highlighting notable gains in agentic coding, world knowledge, and instruction-following over its predecessor, Qwen3.6-Plus. Early benchmarks suggest meaningful improvements across technical tasks and reasoning, though the model is still in active development, so expect more refinements ahead.

The preview is positioned as a hosted proprietary model via Alibaba Cloud Model Studio, with API access forthcoming, hinting at broader developer integration as capabilities mature.

The only caveat: I would have preferred to see a comparison to Opus 4.7 in the benchmarks instead of Opus 4.5.


🧠 Google Races to Fix Coding AI

Google has formed a dedicated strike team within Google DeepMind to improve its AI coding models, driven partly by pressure from Anthropic, whose tools are seen internally as more advanced. Leadership, including Sergey Brin, is pushing urgently toward “agentic” AI systems capable of handling complex, multi-step coding tasks and eventually automating AI research itself.

The effort reflects a broader industry shift, also involving OpenAI, toward treating code generation as a strategic frontier. Google is doubling down on internal tools and training data to close a gap: its AI writes about 50% of its code, compared to Anthropic’s near-total reliance on AI-generated output.


🖼️ OpenAI Sharpens Image AI Race

OpenAI is preparing a powerful new image model, reportedly called “gpt-image-2,” capable of producing highly photorealistic visuals and complex diagrams, a leap that could blur the line between AI-generated and real images. The move comes amid rising competition from Google and Anthropic, with OpenAI aiming to reignite user growth, which has stalled at roughly 920 million weekly users.

Beyond viral image trends, the model’s improved text rendering and design precision signal broader commercial potential, from advertising to education, hinting that the next phase of AI competition may hinge as much on visual capability as on coding prowess.


Anthropic found that Claude draws on emotion-like representations learned from training text to inhabit its assistant role, and these internal states meaningfully influence how it responds, writes code, and makes decisions.

Instead of a quote of the day, we have a meme of the day. Sometimes you just have to lift the mood :)


Open Source Just Beat the Giants


The Takeaway

👉 Kimi K2.6 outperforms GPT-5.4 and Claude Opus 4.6 on SWE-Bench Pro, making it the strongest open-source coding model available right now.

👉 The Agent Swarm architecture, scaling to 300 parallel sub-agents, enables autonomous multi-hour engineering sessions that most proprietary models cannot replicate at comparable cost.

👉 Vercel, Fireworks, and Ollama are already integrating K2.6, signaling real enterprise adoption rather than just benchmark hype.

👉 Moonshot AI's rapid release cadence, from K2 (July 2025) to K2.5 (January 2026) to K2.6 now, puts sustained pressure on labs relying on closed-model moats to justify premium pricing.

Moonshot AI just made the open-source crowd very loud. The Beijing-based lab released Kimi K2.6, an open-source model built on a trillion-parameter Mixture-of-Experts architecture that goes toe-to-toe with GPT-5.4 and Claude Opus 4.6 on coding and agentic benchmarks.

On SWE-Bench Pro, K2.6 scores 58.6%, edging past both GPT-5.4 (57.7%) and Claude Opus 4.6 (53.4%). Its Agent Swarm architecture now scales to 300 sub-agents running 4,000 coordinated steps in parallel, a 3x jump over its predecessor K2.5. In one demo, K2.6 autonomously optimized a financial matching engine over 13 hours, executing more than 1,000 tool calls and boosting throughput by 185%. The model also handles full-stack workflows, turning single prompts into production-ready websites with databases and authentication built in. Vercel reports an improvement of more than 50% on its Next.js benchmark.
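For readers who like to sanity-check the numbers, here is a small Python sketch that turns the figures quoted above into margins and multipliers. The helper names (`margin_points`, `multiplier_from_gain`) are our own, invented for illustration, and the only inputs are the scores and the 185% figure from this issue. Treat it as a back-of-the-envelope check, not as anything published by Moonshot AI.

```python
# SWE-Bench Pro scores quoted in this issue, in percent.
scores = {"Kimi K2.6": 58.6, "GPT-5.4": 57.7, "Claude Opus 4.6": 53.4}


def margin_points(leader: str, other: str) -> float:
    """Lead of one model over another, in percentage points (one decimal)."""
    return round(scores[leader] - scores[other], 1)


def multiplier_from_gain(percent_gain: float) -> float:
    """A gain of X% means the new value is (1 + X/100) times the old one."""
    return round(1 + percent_gain / 100, 2)


# K2.6's lead over each of the other two models.
margins = {
    name: margin_points("Kimi K2.6", name)
    for name in scores
    if name != "Kimi K2.6"
}

# "Boosting throughput by 185%" expressed as a multiple of the starting throughput.
throughput_multiplier = multiplier_from_gain(185)

for name, points in margins.items():
    print(f"K2.6 leads {name} by {points} percentage points")
print(f"Throughput after optimization: {throughput_multiplier}x the baseline")
```

The point of the exercise: the lead over GPT-5.4 is under one percentage point while the lead over Claude Opus 4.6 is several points, and a 185% boost means nearly triple the original throughput, not 1.85 times.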

With K3 already rumored to target 3 to 4 trillion parameters, Moonshot is showing that the gap between open-source and closed frontier models is closing fast. The question now: how long before open-source isn't just competitive, but maybe even the default?

Why it matters: Kimi K2.6 proves that open-source models can match or beat the best closed-source systems on real-world coding and agent tasks. This shifts leverage toward developers and enterprises who want frontier performance without vendor lock-in.

Sources:
🔗 https://www.kimi.com/blog/kimi-k2-6

IN 2 DAYS: Become a Head of AI in a Day | Free Virtual Event

Join Section on 4/23 for an afternoon of free AI transformation workshops designed to turn you into a winning AI leader. You'll get frameworks for driving org-wide AI adoption, proficiency, and workflow automation, and case studies from real enterprise leaders who’ve already gotten results.

Claim Your Free Spot

The Moon Race Just Got Real

Jeff Bezos just landed a used rocket booster on a floating platform in the Atlantic Ocean, and then lost a satellite worth hundreds of millions of dollars. Welcome to Blue Origin in 2026. On Sunday, the company's massive New Glenn rocket launched from Cape Canaveral for only its third flight ever, carrying a previously flown booster nicknamed "Never Tell Me the Odds."

The booster separated, fired its engines in reverse, and touched down on the droneship Jacklyn about ten minutes after liftoff. Elon Musk himself replied "Congrats" on X. But here's the twist nobody saw coming: the upper stage malfunctioned, dumping AST SpaceMobile's giant BlueBird 7 communications satellite into the wrong orbit. The satellite had to be deorbited. Gone.

What makes this more than just another rocket story is what New Glenn is actually built for. Blue Origin's lunar lander just finished extreme temperature testing and is now at Kennedy Space Center, preparing for an uncrewed Moon landing at the south pole later this year.

NASA has restructured its entire Artemis program, and Bezos is now a real contender to fly astronauts to the Moon by 2028. SpaceX needed 32 flights before reusing an orbital booster successfully. Blue Origin pulled it off in three. The rocket works. Now Bezos needs the rest of it to work, too, before his Moon window opens.


Hiring in 8 countries shouldn't require 8 different processes

This guide from Deel breaks down how to build one global hiring system. You’ll learn about assessment frameworks that scale, how to do headcount planning across regions, and even intake processes that work everywhere. As HR pros know, hiring in one country is hard enough. So let this free global hiring guide give you the tools you need to avoid global hiring headaches.

Download the free guide today

K2.6 Makes Open Source Scary (good)


Superintelligence.