Dear friends,
The anti-AI coalition continues to maneuver to find arguments to slow down AI progress. If someone has a sincere concern about a specific effect of AI, for instance that it may lead to human extinction, I respect their intellectual honesty, even if I deeply disagree with their position. However, I am concerned about organizations that survey the public to find whatever messages will turn people against AI, and about how the public reacts as those messages are amplified by lobbyists, by politicians seeking to alarm constituents, by companies pursuing regulatory capture or seeking to promote the power of their technology, and by individuals seeking attention or profit by being provocative.
A large study (hat tip to the AI Panic blog) by a UK group tested different messages that are designed to raise alarm about AI. Their study found that saying AI will cause human extinction has largely failed. Doomsayers were pushing this argument a couple of years ago, and fortunately our community beat it back. But AI-enabled warfare and environmental concerns resonate better. We should be prepared for a flood of messages (which is already underway) arguing against AI on these grounds. Further, job loss and harm to children are messages that motivate people to act.
But when anti-AI propagandists take a one-sided view of complex issues to benefit their own organizations at the expense of the public at large — for instance, when big AI companies argue that AI is dangerous to block the free distribution of open source projects that compete with their offerings — then we all lose.
For example, public perception of data centers’ environmental impact is already far worse than the reality — data centers are incredibly efficient for the work they do, and hampering their buildout will hurt rather than help the environment. While job loss is a real problem, the “AI washing” of layoffs — in which businesses that had over-hired during the pandemic blame AI for recent layoffs, although AI hasn’t yet affected their operations — has led to overblown fears about the impact of AI on employment.
Unfortunately, this sort of propaganda easily leads to regulations that create worse outcomes for everyone. For example, oil companies worked for years to create fear of nuclear energy. The result is that overblown concerns about the safety of nuclear power plants have stifled nuclear power development, leading to millions of premature deaths from air pollution caused by other energy sources and a massive increase in CO2 emissions. Let’s make sure overblown concerns about AI do not lead to a similar fate for the many people who would benefit from faster AI development.
This week, the White House proposed a national legislative framework for AI. A key component is a federal preemption framework to prevent a patchwork of state regulations that hamper AI development. I support this.
After failing to gain traction at the federal level, a lot of anti-AI propaganda has shifted to the state level. If just one of the 50 states passes a law that limits AI in an unproductive way, it could stifle AI development across all the states and potentially across the globe. The White House proposal rightfully respects each state’s right to control its own zoning, how it enforces general laws to protect consumers, and how it uses AI. But if a state were to pass laws that limit AI development, federal rules would preempt the state law.
The White House proposal remains a proposal for now. However, if the U.S. Congress enacts it, it will clear the way for ongoing efforts to develop AI in beneficial ways.
Where do we go from here? Let’s support limiting applications — those that use AI, and those that don’t — that harm people. When the anti-AI coalition argues against AI, in addition to considering the merits of the argument, I consider whether their position is consistent and persuasive, or if they are just promoting whatever concerns they think will sway the public at a given moment. And, let’s also keep using a scientific approach to weighing AI’s benefits against likely harms, so we don’t end up with overblown concerns that limit the benefits that AI can bring everyone.
Keep building! Andrew
A MESSAGE FROM DEEPLEARNING.AI
“Agent Skills with Anthropic” shows you how to make agents more reliable by moving workflow logic out of prompts and into reusable skills. Learn how to design and apply skills across coding, data analysis, research, and other workflows. Sign up here!
News
Open-Source Speed Demon
Nvidia, the dominant supplier of AI chips, released a competitive open-source large language model whose speed tops its size class — the first open-weights leader to come from the United States since last year, when Meta delivered Llama 4.
What’s new: Nvidia released Nemotron 3 Super 120B-A12B, a large language model designed for agentic applications, including not only weights but also training datasets and recipes. It is the second in a planned family of three: Nvidia released Nemotron 3 Nano-39B-A3B in December 2025, and Nemotron 3 Ultra-500B-A50B is forthcoming.
How it works: Nemotron 3 Super’s hybrid architecture interleaves Mamba-2, attention, and modified MoE layers with multi-token prediction heads that generate multiple tokens per forward pass.
Performance: Nemotron 3 Super leads its size class in speed and processing long contexts, with competitive metrics in overall intelligence and agentic tasks.
Behind the news: Nvidia plans to invest $26 billion over five years to develop open-weights models — a substantial commitment. The announcement coincides with shifts in the open-weights landscape that could affect Nvidia’s business. Chinese companies, including Alibaba, Moonshot AI, and Z.ai, lately have built the most capable open-weights models, and they are building alternatives to Nvidia GPUs and CUDA software. For instance, DeepSeek has reportedly trained an upcoming model entirely on Huawei’s Ascend chips and CANN software.
Why it matters: Nemotron 3 Super gives developers a fast, fully open model for agentic applications, with training data, recipes, and tools alongside the weights. This openness also serves Nvidia’s business goals. Chinese open-weights models are growing more capable and increasingly streamlined to run on non-Nvidia chips, creating a risk that developers who previously relied on Nvidia will look elsewhere. Nemotron gives them a reason not to.
We’re thinking: Who better to optimize a model for GPUs than the company that designs the GPUs? From custom numerical formats to inference software, Nvidia can co-design hardware and software in ways that few model developers can match. Nvidia is betting that building models will help sell chips and vice versa.
OpenAI Tracks Agent States on AWS
OpenAI partnered with Amazon to build infrastructure for agents on the world’s largest cloud platform, a further sign that its close relationship with Microsoft is weakening.
What’s new: OpenAI and Amazon announced a “stateful runtime environment,” a forthcoming computing infrastructure designed for AI agents. The companies did not disclose the projected launch date. The partnership diversifies OpenAI’s cloud-computing resources beyond Microsoft Azure and lets Amazon use OpenAI models in its own products. As part of the deal, Amazon invested $15 billion in OpenAI with an additional $35 billion to come if certain undisclosed conditions are met, or if OpenAI offers it stock to the public prior to 2029, according to an analysis of related documents by GeekWire. Moreover, if the cloud partnership terminates, Amazon’s remaining $35 billion commitment will die with it. The investment was part of a gargantuan $110 billion funding round that included Nvidia and SoftBank and valued OpenAI at $730 billion. (Disclosure: Andrew Ng is a member of Amazon’s Board of Directors.)
How it works: Many developers interact with AI models through stateless APIs for which each request is independent. A developer sends a prompt, receives a response, and the model retains no memory of the exchange, so developers must pass all context into every request. The stateful runtime environment aims to handle that context, helping agents to execute long, multi-step workflows without losing track of where they are. In addition, customers will have access to customized versions of open-weights OpenAI models that run on AWS, The Information reported.
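The contrast can be sketched as follows. This is a hypothetical illustration only — the function and class names are invented for clarity and are not Amazon’s or OpenAI’s actual API.

```python
# Sketch of stateless vs. stateful agent calls.
# All names here are hypothetical, not a real API.

def call_model_stateless(history: list[str], new_message: str) -> str:
    """With a stateless API, the caller must resend the entire
    conversation (and any tool results) on every request."""
    prompt = "\n".join(history + [new_message])
    return fake_model(prompt)  # model sees only what we pass in

class StatefulSession:
    """A stateful runtime keeps the agent's context server-side;
    the caller sends only the new message each turn."""
    def __init__(self):
        self.history: list[str] = []  # persisted by the runtime

    def send(self, new_message: str) -> str:
        self.history.append(new_message)
        reply = fake_model("\n".join(self.history))
        self.history.append(reply)
        return reply

def fake_model(prompt: str) -> str:
    # Stand-in for a real model call.
    return f"[reply to {len(prompt)} chars of context]"
```

In the stateless case, a long-running agent's context grows with every turn and must cross the network each time; a stateful runtime moves that bookkeeping into the infrastructure.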
Behind the news: The partnership between OpenAI and Amazon marks the latest step in the dissolution of the tight cloud partnerships that defined the early generative AI era. In 2019, Microsoft invested $1 billion (which subsequently rose beyond $13 billion) in OpenAI and became its exclusive cloud provider. In 2023, Amazon invested up to $4 billion in Anthropic and became its primary cloud provider. Each deal paired an AI startup with a cloud giant. Both ties have since loosened.
Why it matters: Developers who build AI agents typically build their own state management, tool orchestration, and fault recovery on top of stateless APIs. A runtime environment that’s designed to handle these functions as infrastructure could lower the barrier to deploying AI agents. On the flip side, depending on exactly what state is stored and how portable it is, it may increase the cost to switch to a different cloud vendor. That it will run on AWS, the largest cloud provider by market share, will make it available to a wide swath of the developer community.
We’re thinking: Distinguishing between stateless and stateful may be clever legal engineering, but it also reflects a real technical shift. As AI applications move toward autonomy, the infrastructure behind agents may matter as much as the models.
Learn More About AI With Data Points!
AI is moving faster than ever. Data Points helps you make sense of it just as fast. Data Points arrives in your inbox twice a week with six brief news stories. This week, we covered MiniMax’s shift to a proprietary reasoning model and OpenAI’s shutdown of Sora. Subscribe today!
xAI’s Cost-Effective Video Generator
xAI launched a video generator that topped an independent quality ranking at a fraction of competitors’ prices.
What’s new: Grok Imagine 1.0 takes text, optionally accompanied by images and/or video, and produces video clips that can include dialogue, sound effects, and music.
Performance: Grok Imagine 1.0 debuted at the top of the Artificial Analysis Video Arena, a blind, head-to-head test of preferences judged by human viewers. It’s slower than some competitors but generally less expensive. (Disclosure: Andrew Ng has a personal investment in Artificial Analysis.)
Behind the news: Unlike video generators from Google, OpenAI, and Runway, which are available as standalone products and/or via APIs, Grok Imagine 1.0 is integrated with the X social network. This enables X users to generate and share video directly on X, where they have caused controversy. In late 2025, X users exploited Grok to produce nonconsensual sexualized images of real people, including children, resulting in investigations and bans in several countries. The phenomenon persisted after xAI promised to address it, Reuters reported.
Why it matters: Generating a video that matches your vision typically requires many iterations of adjusting prompts, regenerating, and comparing results. xAI says that early partners told the company that quality alone was not useful if latency and cost made iteration untenable. Third-party benchmarks show Grok Imagine 1.0 matches or exceeds leading models on quality at a lower cost than premium competitors, a combination that lowers the cost of experimentation.
We’re thinking: Image generation went from novelty to table stakes in roughly two years. Video generation is following a similar path. The seven-fold price gap between Grok Imagine 1.0 and the now-shuttered OpenAI Sora 2 Pro suggests that prices still have plenty of room to fall.
Context As An External Variable
When processing long contexts, large language models often lose track of details or devolve into nonsense. Researchers reduced these effects by managing context externally.
What’s new: MIT’s Alex L. Zhang, Tim Kraska, and Omar Khattab developed Recursive Language Models (RLMs) that process long prompts encountered in books, web searches, and codebases by offloading prompts to an external environment and managing them programmatically.
Key insight: A language model can process long inputs, including inputs larger than its context window, by treating input text as a persistent variable in an external programming environment. The model can write code to fetch only the necessary chunks of text. For example, it can look for keywords and retrieve the paragraphs that surround them. Writing code iteratively enables the model to break down long-context tasks into sub-tasks before approaching the tasks as a whole.
How it works: RLMs read and manipulate tasks (a user’s prompt and associated documents) by executing Python code in a simple read-eval-print loop (REPL) environment. The tasks involve analyzing, understanding, or retrieving details from long documents. The model generates a program that invokes new instances of itself, or submodels, to handle each subtask and feeds each instance’s output back into the root model.
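A minimal sketch of the idea follows. Everything here is a simplified assumption: `llm` is a stand-in for a real model call, and the fixed chunking and keyword filter stand in for code the model would write itself inside the REPL.

```python
# Simplified sketch of a Recursive Language Model (RLM) loop.
# `llm` is a placeholder; in the real system, the model itself
# writes code like this in a Python REPL.

def llm(prompt: str) -> str:
    # Stand-in for an actual model call.
    return f"summary({len(prompt)} chars)"

def rlm_answer(question: str, document: str, chunk_size: int = 10_000) -> str:
    # The long document lives as a plain Python variable in the
    # environment; it never enters the model's context all at once.
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]

    # The root model might write code like this to keep only
    # chunks that mention terms from the question.
    keywords = [w for w in question.lower().split() if len(w) > 3]
    relevant = [c for c in chunks
                if any(k in c.lower() for k in keywords)] or chunks[:1]

    # Each relevant chunk goes to a sub-call (a fresh model
    # instance); the root model reasons over the sub-calls' outputs.
    partials = [llm(f"Q: {question}\nText: {c}") for c in relevant]
    return llm(f"Q: {question}\nNotes:\n" + "\n".join(partials))
```

The key property is that each model call sees only one chunk or the collected sub-call outputs, so the total input can far exceed any single context window.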
Results: The authors compared RLMs based on Qwen3-8B, GPT-5 with medium reasoning, and Qwen3-Coder-480B to the original models using benchmarks that involve retrieval and reasoning over documents up to 1 million tokens long. They also compared the RLMs to CodeAct agents with retrieval tools and custom agents that compacted or summarized context. The RLMs significantly outperformed both the stock models and other agentic strategies on tasks that require understanding of multiple documents, up to 11 million tokens total.
Why it matters: Earlier approaches often handle long contexts by using retrieval or summarization, which can lose critical details. By decomposing tasks into recursive sub-calls, a model can maintain high precision across more tokens. This method provides a blueprint for building agents that can reason coherently over numbers of tokens that far exceed a model’s input limit.
We’re thinking: An RLM pays attention only to the parts of the context it needs at any given moment. This approach seems akin to the human method of processing long documents one section at a time.
Work With Andrew Ng
Join the teams that are bringing AI to the world! Check out job openings at DeepLearning.AI, AI Fund, and Landing AI.
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. To keep our newsletter out of your spam folder, add our email address to your contacts list.