On Every

Our New Incubation Raised $3.6 Million to Teach AIs to Play Games

General Catalyst and Inovia partnered with us to fund Good Start Labs

Midjourney/Good Start Labs/Every illustration.

TL;DR: Good Start Labs is spinning out of Every as a separate company to teach AI models to play games and generate high-quality reinforcement learning data for frontier labs. Alex Duffy, our head of AI training, built it starting with AI Diplomacy—where models battled for world domination—and will run the company.


Today, I’m excited to announce a new Every incubation: Good Start Labs, run by our head of AI training Alex Duffy. It’s spinning out of Every as a separate company with $3.6 million in funding from General Catalyst, Inovia, Every, and a group of angel investors from top-tier AI labs like DeepMind.

Good Start teaches AI models to play games in order to generate high-quality reinforcement learning data for frontier labs. The company builds custom game environments and partners with popular existing games, allowing millions of players to naturally evaluate and train AI models just by playing.

It started as AI Diplomacy, a project where we taught frontier models to battle each other for world domination in the game Diplomacy. Alex and his co-founder Tyler Marques spent months building it, and when we launched it on Every in June, it went viral. Over 60,000 people read our story about it, and almost 50,000 people watched the models play live on Twitch. Researchers from top AI labs and academia immediately realized how valuable it was. See Exhibit A:

Researcher Andrej Karpathy (whose work inspired AI Diplomacy) on the project’s launch day. (Source: X/Dan Shipper.)

Many also reached out to get access to the data. That’s when Alex knew there was a valuable and important business to be built.

Why game data is the next frontier for training data

Games are dynamic, multiplayer environments, which makes them a rich and valuable source of training data.

AI Diplomacy, for example, asks models to plan over short and long time horizons, lie, cheat, and betray each other to win the game. This behavior makes it an ideal source of training data on model alignment and deception—much more so than static data sets.

And AI Diplomacy data is just the beginning: Teaching models to play games against each other and against humans can generate data to help them in every frontier task, like coding, writing, or learning. As ex-DeepMind researcher Kelly Clancy writes, “Play is to intelligence what mutation is to evolution.”

What’s next for Good Start Labs

With this new funding, Good Start Labs is launching its first major partnership with Bad Cards, a Discord-based game in the style of Cards Against Humanity with millions of monthly players. In it, players fill in absurd prompts with outrageous answers, and a rotating judge picks the funniest one.

By teaching frontier models to play Bad Cards, they’re using player feedback about what is and isn’t funny to build LOL Arena: a leaderboard of the funniest frontier models. With this leaderboard they’ll be able to generate high-quality, labeled training data to make future models more humorous and better aligned. This is important for use cases that require AI to be funny—for example, new content formats like Sora. Effective humor is a complex act that requires intelligence, emotional intelligence, and world knowledge, all of which can bolster model capabilities beyond joke writing.

Good Start Labs is also doubling down on Diplomacy with Diplomacy Arena, their AI Diplomacy environment (also launched in June). It is one of the best pure LLM environments for teaching long- and short-horizon reasoning.

They’re publishing three leaderboards based on the data generated from Diplomacy Arena: Performance (which models win), Betrayal (which ones backstab allies to achieve victory), and Prompt Impact (how much a prompt changes their behavior).

You should expect more games and partnerships with big labs to come soon.

What comes next for Every

Usually when we incubate a product, it stays within the company and becomes part of the Every subscription. Every once in a while, though, it makes sense to spin it out as a separate venture-backed company. We’re remaining involved as investor, advisor, and shareholder.

This is the second venture-backed incubation we’ve done—Lex was the first, which spun out and raised $2.75 million from True Ventures. I’m extremely proud that we can create companies like these inside of Every’s ecosystem. Much of the thinking that became AI Diplomacy and then Good Start was published on Every—and it’s one of my favorite examples of how great thinking can turn into great writing, which can turn into valuable businesses.

According to Sequoia, great founders should be judged by slope, not intercept, and that may as well have been written about Alex. He’s one of the hardest-working and fastest-learning founders I’ve encountered, and I’m extremely excited to partner with him on Good Start.

Dan Shipper is the cofounder and CEO of Every, where he writes the Chain of Thought column and hosts the podcast AI & I. You can follow him on X at @danshipper and on LinkedIn, and Every on X at @every and on LinkedIn.
