
In Today’s Issue:
👁️ Alibaba drops a massive omnimodal model
🔐 Google introduces a responsible disclosure framework to tackle quantum-computing vulnerabilities
💻 Anthropic's Claude gains full end-to-end computer control
📉 OpenAI’s ChatGPT app store is stumbling six months after launch
✨ And more AI goodness…
Dear Readers,
Alibaba just dropped a model that watches your screen, listens to your voice, and writes code from what it sees, and most of Western tech media didn't even notice. Qwen3.5-Omni isn't another benchmark-chasing paper release; it's a fully omnimodal system that perceives text, images, audio, and video simultaneously, outperforming Gemini 3.1 Pro on audio-video understanding while covering 113 languages in speech recognition.
That alone would make today's issue worth your time, but we're just getting started. Google is quietly building a framework for responsible quantum-crypto disclosure, OpenAI's ChatGPT app store is stumbling six months in with developers frustrated and consumers unconvinced, and Anthropic's Claude just learned to control your entire computer end-to-end.
Meanwhile, robots had their biggest week in history, Zoox is expanding to Austin and Miami, Amazon acquired a stair-climbing delivery bot from ETH Zurich, and a German robotics startup is raising a billion euros backed by Tether. Oh, and you might want to check your axios dependency before you do anything else today. Let's dive in.
All the best,

Kim Isenberg



🔐 Google Tackles Crypto Quantum Risks
Google outlines a careful strategy for disclosing quantum-related vulnerabilities in cryptocurrencies, aiming to avoid both secrecy and unnecessary alarm. It backs its claims with verifiable zero-knowledge proofs while withholding sensitive technical details, and stresses that exaggerated quantum threat narratives can be just as damaging as real attacks.
The approach blends responsible disclosure with trust-building, highlighting ongoing progress toward post-quantum security without fueling fear or speculation.
🚀 Claude Gains Full Computer Control
Anthropic’s Claude now takes automation to the next level, writing code, compiling apps, interacting with GUIs, debugging issues, and verifying fixes all in one flow. This research preview (macOS, Pro/Max plans) unlocks true end-to-end development and testing without leaving the CLI, potentially saving hours of manual work.
This means faster iteration, real UI testing without extra tooling, and automation of previously “untouchable” GUI workflows - huge productivity gains if it scales reliably.
Anthropic's shipping speed is insane. A new release every day.

📉 ChatGPT App Store Struggles Early
Six months in, OpenAI’s push to build an app ecosystem inside ChatGPT is off to a slow start: its 300+ integrations are hard to find, and growth is hamstrung by partners reluctant to hand over payments and customer relationships. Developers report buggy tools, slow approvals, and almost no analytics, while companies like Booking Holdings and StubHub say the platform drives minimal traffic.
Consumers are still experimenting rather than committing, most continue to rely on search, apps, and websites, and over half remain cautious about sharing payment details with AI, underscoring how far ChatGPT is from becoming an everyday transaction hub.


axios may be compromised via a malicious new dependency. Pin your version, audit your lockfiles, and halt upgrades immediately; this could impact millions of installs.
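If you depend on axios, a quick triage along these lines can help. This is a sketch, not a confirmed advisory: the pinned version number below is illustrative, so substitute whatever version your own audit deems safe.

```shell
# List every axios version resolved anywhere in the dependency tree
npm ls axios
# Pin an exact version in package.json (no ^ or ~ range); 1.7.9 is illustrative
npm install axios@1.7.9 --save-exact
# Confirm the lockfile now resolves only the pinned version
grep -n '"axios"' package-lock.json
# Surface any published advisories for the current tree
npm audit
```

Pinning with `--save-exact` prevents a future `npm install` from silently pulling a newer, potentially poisoned release through a semver range.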


NVIDIA Nemotron Unpacked: Build, Fine-Tune, and Deploy Open Models From NVIDIA



China's AI Just Got Eyes, Ears, and a Voice
The Takeaway
👉 Trained on over 100 million hours of native multimodal audio-video data, the model handles more than 10 hours of audio input and supports a 256K token context window - making it serious infrastructure, not just a research release.
👉 Its ARIA technology dynamically aligns text and speech output to prevent dropped words and unclear numbers - the kind of mundane-but-critical fix that separates demo-ware from production-ready tools.
👉 On multilingual voice stability benchmarks, Qwen3.5-Omni-Plus beat ElevenLabs, GPT-Audio, and Minimax across 20 languages - putting it in direct competition with dedicated voice platforms, not just general-purpose models.
👉 The "Audio-Visual Vibe Coding" capability - generating functional code from video and audio input alone — represents a concrete step toward AI that operates inside workflows autonomously rather than waiting for text instructions.
Alibaba's Tongyi Lab dropped something remarkable yesterday, and it barely made a ripple in Western tech media, which is exactly why you should pay attention.
Qwen3.5-Omni is an omnimodal model that understands text, images, audio, and video, and can generate not only text but also speech, with audio and video understanding that surpasses Gemini 3.1 Pro. This is an AI that doesn't just read the world; it genuinely perceives it. One small caveat: it doesn't generate images or videos itself, it only interprets them.

The architecture underneath is clever. The model uses a "Thinker-Talker" design - the Thinker handles understanding, the Talker manages expression - with both components upgraded to a Hybrid-Attention Mixture of Experts, achieving state-of-the-art results on 215 benchmarks while maintaining strong performance across modalities.
What makes this practically exciting is "Audio-Visual Vibe Coding": the model can watch a screen recording or video of a coding task and write functional code based purely on what it sees and hears - no text prompt required. That's not a demo trick. That's a glimpse of how AI assistants might soon live inside your workflow.
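To make the workflow concrete, here is a minimal sketch of how a request mixing text and video might be composed for an omnimodal chat model. It follows the OpenAI-style "content parts" convention that many model providers mirror; the model identifier and part-type names are assumptions for illustration, not the confirmed Qwen3.5-Omni API.

```python
# Sketch: composing a multimodal chat request for an omnimodal model.
# Part types ("text", "video_url") and the model name are assumptions.

def build_omni_request(model, text, audio_url=None, video_url=None):
    """Build a chat-completions-style payload mixing text, audio, and video."""
    parts = [{"type": "text", "text": text}]
    if audio_url:
        parts.append({"type": "audio_url", "audio_url": {"url": audio_url}})
    if video_url:
        parts.append({"type": "video_url", "video_url": {"url": video_url}})
    return {"model": model, "messages": [{"role": "user", "content": parts}]}

req = build_omni_request(
    "qwen3.5-omni",  # assumed model identifier
    "Write the code shown in this screen recording.",
    video_url="https://example.com/session.mp4",
)
print(len(req["messages"][0]["content"]))  # two parts: text + video
```

The point of "vibe coding" is exactly this shape of request: the instruction can live in the video and audio streams, with little or no text prompt at all.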

Speech recognition now covers 113 languages and dialects, up from 19 in the predecessor, and speech generation expanded from 10 languages to 36. And for anyone who's ever fought with a voice assistant that stops mid-sentence because a dog barked: Qwen3.5-Omni adds semantic interruption, which distinguishes between a user genuinely wanting to interject and ambient background noise.
This is a signal. China's open-source AI ecosystem is building infrastructure that competes directly with the most capable closed systems in the West. The question isn't if this matters. It's whether the West is watching closely enough. Let’s wait for “Mythos” and “Spud” to see how they reply.
Why it matters: Qwen3.5-Omni shows that truly native omnimodal AI - one model handling text, vision, audio, and video simultaneously - is no longer a frontier research concept but a deployable product. For anyone building voice-first applications or agentic workflows, Alibaba just became a serious option.
Sources:
🔗 https://qwen.ai/blog?id=qwen3.5-omni


Protect online privacy from the very first click
Your digital footprint starts before you can even walk.
In today’s data economy, “free” inboxes from Google and Microsoft - Gmail and Outlook - are funded by data collection. Emails can be analyzed to personalize ads, train algorithms, and build long-term behavioral profiles that are sold to third-party data brokers.
From family updates, school registrations, and medical reports to financial-service emails, social media accounts, and job applications, a digital identity can take shape long before someone understands what privacy means.
Privacy shouldn’t begin when you’re old enough to manage your settings. It should be the default from the start.
Proton Mail takes a different approach: no ads, no tracking, no data profiling — just private communication by default. Because the next generation deserves technology that protects them, not profiles them.



daVinci-LLM shows that smarter data curation - not bigger models - can deliver major reasoning gains while matching larger models at half the size.


Robots Hit the Real World
March 2026 just delivered one of the most action-packed weeks in robotics history, and it's not even close. Three major moves happened almost simultaneously, each one reshaping how we think about machines in the real world.

First up: Amazon-owned Zoox announced its biggest expansion ever, rolling out its purpose-built robotaxis to Austin and Miami while quadrupling its San Francisco coverage area. The company has nearly 2 million autonomous miles under its belt and over 350,000 passengers transported. The NHTSA is expected to decide on commercial approval as early as April 2026, and if greenlit, paid rides could start almost immediately.

Meanwhile, Amazon quietly acquired RIVR, a Zurich-based robotics startup spun out of ETH Zurich known for building stair-climbing delivery robots - think a dog on roller skates that carries your packages to your front door. Amazon has now deployed over one million robots across its operations network, and RIVR's quadrupeds could become the last-mile game-changer nobody saw coming.
On the software side, Munich-based Agile Robots landed a strategic partnership with Google DeepMind, integrating Gemini Robotics foundation models into industrial robots for sectors like electronics manufacturing, automotive, and logistics. And in the funding arena, German startup Neura Robotics is reportedly raising about €1 billion backed by stablecoin issuer Tether, valuing the company at roughly €4 billion — the largest single robotics round in European history.

The convergence of massive capital, real-world deployments, and AI-robotics partnerships in a single month signals that physical AI has crossed the threshold from prototype to infrastructure. This isn't a future scenario anymore; it's the present unfolding at breakneck speed.
Sources:
🔗 https://www.therobotreport.com/zoox-sets-geographic-milestones-product-features-robotaxi/
🔗 https://www.cnbc.com/2026/03/24/amazon-zoox-robotaxi-rides-austin-miami.html
🔗 https://techcrunch.com/2026/03/19/amazon-acquires-rivr-maker-of-a-stair-climbing-delivery-robot/


Learn AI in 5 minutes a day
This is the easiest way for a busy person to learn AI in as little time as possible:
Sign up for The Rundown AI newsletter
They send you 5-minute email updates on the latest AI news and how to use it
You learn how to become 2x more productive by leveraging AI






