Dear friends,
We’re organizing a new event called Buildathon: The Rapid Engineering Competition, to be held in the San Francisco Bay Area on Saturday, August 16, 2025! You can learn more and apply to participate here.
AI-assisted coding is speeding up software engineering more than most people appreciate. We’re inviting the best builders from Silicon Valley and around the world to compete in person on rapidly engineering software.
Code autocompletion by GitHub Copilot was cutting-edge 2 years ago, but it’s nowhere near what is possible now! For example, my team at AI Fund routinely goes from a product idea to a basic working product or prototype in hours. This is why overcoming the Product Management Bottleneck — deciding what to build rather than the actual building — occupies a growing part of our effort.
DeepLearning.AI and AI Fund are organizing this Buildathon competition to see how quickly the best developers can build products. We’ll provide a loose product spec, say, for a Real-Time Multiplayer Code Editor or a Personal Finance Tracker. Historically, products like these may have taken a team of 2 or 3 engineers weeks or months to build. But we hope participants will be able to build them in closer to 60 minutes. You can read more about the competition format here.
Keep building! Andrew
A MESSAGE FROM DEEPLEARNING.AI
In our course Retrieval Augmented Generation, you’ll build RAG systems that connect AI models to trusted, external data sources. This hands-on course covers techniques for retrieval, prompting, and evaluation to improve the quality of your applications’ output. Get started now
News
Powers Realign in AI-Assisted Coding
A $3 billion bid by OpenAI to acquire Windsurf, maker of the AI-assisted integrated development environment of the same name, collapsed at the 11th hour, setting off a tumultuous few days of corporate maneuvering.
What’s new: Google licensed Windsurf’s technology for $2.4 billion and hired CEO Varun Mohan, co-founder Douglas Chen, and an unknown number of key engineers. Cognition AI, maker of the Devin agentic coding system, purchased what remained for an undisclosed sum. OpenAI was left empty-handed.
How it works: AI-assisted coding tools are boosting software engineering productivity, accelerating development cycles, and finding bugs and security vulnerabilities. As a leader in the field, Windsurf became a target for acquisition.
Behind the news: Google’s hiring of Windsurf’s leadership and access to its technology in return for a large licensing fee mirrors its earlier arrangement with Character.AI. Such deals between AI leaders and startups have become increasingly common as AI companies seek quick advantages without the risk that regulators might delay or quash an outright acquisition, while AI startups seek infusions of cash to support the building of cutting-edge models. Other deals of this sort have involved Meta and Scale AI, Amazon and Adept, and Microsoft and Inflection.
Why it matters: AI-assisted coding is hot! Google recently launched Gemini Code Assist and Gemini CLI, competing with Amazon Kiro, Anthropic Claude Code, Microsoft’s GitHub Copilot, Replit Ghostwriter, and others. Expertise and technology from Windsurf may help it pull ahead. Meanwhile, Cognition’s 2024 release of Devin pioneered agentic coding, but since then competitors have taken the spotlight. Cash from Google gives the company a chance to regroup. As for OpenAI, there are other great makers of AI-assisted tools to negotiate with.
We’re thinking: Windsurf’s Anshul Ramachandran teaches a short course on agentic coding. Check it out for a peek at the technology Google deemed worth $2.4 billion.
Born to Be Agentic
An agent’s performance depends not only on an effective workflow but also on a large language model that excels at agentic activities. A new open-weights model focuses on those capabilities.
What’s new: Beijing-based Moonshot AI released the Kimi K2 family of 1 trillion-parameter large language models (LLMs). The family includes the pretrained Kimi-K2-Base and Kimi-K2-Instruct, which is fine-tuned for core agentic tasks, notably tool use. Bucking the recent trend in LLMs, Kimi K2 models are not trained for chain-of-thought reasoning.
How it works: Moonshot pretrained the models on 15.5 trillion tokens from undisclosed sources. It fine-tuned Kimi-K2-Instruct via reinforcement learning using a proprietary dataset.
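To make the tool-use focus concrete, here is a minimal, hypothetical sketch of calling a tool-capable model through an OpenAI-compatible chat endpoint. The base URL, model identifier, and weather tool below are illustrative assumptions, not Moonshot’s documented values.

# Minimal sketch of tool use via an OpenAI-compatible client.
# The endpoint URL, model name, and tool schema are illustrative
# assumptions, not Moonshot's documented values.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",  # hypothetical OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="kimi-k2-instruct",  # placeholder model identifier
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# A model fine-tuned for tool use should respond with a structured
# tool call rather than free-form text when a registered tool fits.
print(response.choices[0].message.tool_calls)

Returning a structured tool call instead of free-form prose whenever a registered tool fits the request is what makes a model like this dependable inside agentic workflows.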
Results: Moonshot compared Kimi-K2-Instruct to two open-weights, non-reasoning models (DeepSeek-V3 and Qwen3-235B-A22B with reasoning switched off) and four closed, non-reasoning models.
Behind the news: Third-party vendors have been quick to implement Kimi-K2-Instruct.
Why it matters: Demand is growing for LLMs that carry out agentic workflows accurately, as these workflows lead to better performance. Kimi-K2-Instruct gives developers a strong option for fine-tuning models for their own agentic tasks.
We’re thinking: Early LLMs were built to generate output for human consumption. But the rise of agentic workflows means that more and more LLM output is consumed by computers, so it makes good sense to put more research and training effort into building LLMs that generate output for computers. A leading LLM optimized for agentic workflows is a boon to developers!
Learn More About AI With Data Points!
AI is moving faster than ever. Data Points helps you make sense of it just as fast. Data Points arrives in your inbox twice a week with six brief news stories. This week, we covered how an autonomous OpenAI system placed second in an international coding contest, finishing just behind a top human programmer. Subscribe today!
How to Comply With the EU’s AI Act
The European Union published guidelines to help builders of AI models comply with the AI Act, which was enacted last year.
What’s new: The General Purpose AI Code of Practice outlines voluntary procedures to comply with provisions of the AI Act that govern general-purpose models. Companies that follow the guidelines will benefit from simplified compliance, greater legal certainty, and potentially lower administrative costs, according to EU officials. Those that don’t must comply with the law nonetheless, which may prove more costly. While Microsoft, Mistral, and OpenAI said they would follow the guidelines, Meta declined, saying that Europe is “heading down the wrong path on AI.”
How it works: The code focuses on “general-purpose AI models” that are capable of performing a wide range of tasks.
Behind the news: The AI Act is the product of years of debate and lobbying among scores of stakeholders. EU technology official Henna Virkkunen called the AI Act “an important step” in making cutting-edge models “not only innovative but also safe and transparent.” However, companies and governments on both sides of the Atlantic have asserted that the law goes too far. In May, the EU moved to relax some provisions, including language that would have allowed users to sue AI companies for damages caused by their systems. Earlier this month, 44 chief executives at top European companies asked European Commission President Ursula von der Leyen to postpone the AI Act’s rules that govern general-purpose models for two years.
Why it matters: The AI Act is the most comprehensive and far-reaching set of AI regulations enacted to date, yet it remains highly contentious and in flux. The commitments by Microsoft, Mistral, and OpenAI to follow the code mark a significant step in the act’s circuitous path to implementation, but also an increase in bureaucracy and potential for regulatory capture. Their endorsement could persuade other big companies to sign on and weaken further efforts to loosen the act’s requirements.
We’re thinking: From a regulatory point of view, the notion of systemic risk is misguided. Limiting the inherent risk of AI models is as helpful as limiting the inherent risk of electric motors, which would result only in relatively useless motors. We hope for further revisions in the AI Act that relieve burdens on builders of foundation models, especially open source projects, and address practical risks of specific applications rather than theoretical risks of their underlying technology.
Agentic System for Harder Problems
LLMs can struggle with difficult algorithmic or scientific challenges when asked to solve them in a single attempt. An agentic workflow improved one-shot performance on hard problems both theoretical and practical.
What’s new: Alexander Novikov, Ngân Vũ, Marvin Eisenberger, and colleagues at Google built AlphaEvolve, an agentic system that used LLMs to generate code in an evolutionary process. AlphaEvolve solved longstanding math problems and helped to reduce the training time for one of Google’s Gemini large language models.
Key insight: When using an LLM to solve a difficult problem, it’s often more effective to start with a working version and gradually improve it than to generate a solution in one shot. By making small, targeted modifications and keeping only those that perform best under automated evaluation, this iterative process can solve problems that LLMs often can’t solve directly. Google used this idea in its earlier FunSearch, which used an LLM to evolve individual Python functions. The approach has become more powerful as LLMs have improved, and today it can tackle more difficult problems.
How it works: AlphaEvolve implemented an evolutionary loop: Given initial code and evaluation code, Gemini 2.0 Flash and Gemini 2.0 Pro suggested changes; the system stored each revised program in a database, evaluated it, prompted the models for further changes, and repeated the process.
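This loop lends itself to a compact sketch. Below is a minimal, hypothetical rendering in Python: llm_propose_patch stands in for the Gemini calls, evaluate stands in for the task-specific scoring code, and the tournament-style parent selection is our own simplification, not a detail from the paper.

# Minimal sketch of an LLM-driven evolutionary loop, assuming
# hypothetical helpers: `llm_propose_patch` stands in for the Gemini
# calls and `evaluate` for task-specific scoring. This is not the
# published AlphaEvolve code.
import random

def evolve(initial_program, evaluate, llm_propose_patch, steps=100):
    # Program database: (score, program) pairs, seeded with the starting code.
    population = [(evaluate(initial_program), initial_program)]
    for _ in range(steps):
        # Tournament selection: sample a few entries, mutate the best one.
        candidates = random.sample(population, k=min(3, len(population)))
        parent = max(candidates, key=lambda pair: pair[0])[1]
        # Ask the LLM for a small, targeted modification to the parent.
        child = llm_propose_patch(parent)
        # Keep every scored variant so future rounds can build on it.
        population.append((evaluate(child), child))
    # Return the program that scored best under automated evaluation.
    return max(population, key=lambda pair: pair[0])[1]

The essential ingredients are the ones the authors emphasize: an automated evaluator that scores each variant, and a database of past programs that lets strong candidates seed further mutations.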
Results: AlphaEvolve achieved breakthroughs in both math and software engineering.
Why it matters: AlphaEvolve proposes thousands of candidate ideas — some bad, some brilliant — to evolve better programs. The authors show that this approach can improve algorithms that have stood for decades as well as computing infrastructure designed by Google engineers. Thus, AlphaEvolve adds to the growing evidence that LLMs can act as collaborators in cutting-edge research, exploring broad problem spaces and finding novel solutions. Other examples include Co-Scientist and SWE-agent.
We’re thinking: Relatively simple evaluations enabled the authors’ agentic evolutionary system to gradually improve. More broadly, evaluations are proving to be important to a wide variety of agentic workflows.
Work With Andrew Ng
Join the teams that are bringing AI to the world! Check out job openings at DeepLearning.AI, AI Fund, and Landing AI.
Subscribe and view previous issues here.
Thoughts, suggestions, feedback? Please send to thebatch@deeplearning.ai. To keep our newsletter out of your spam folder, please add our email address to your contacts list.