Some people just inherently understand their priorities in life, and now that they can code, are unleashing true beauty into the world: |
|
Happy Memorial Day Weekend to everyone who celebrates! Gonna keep it light today. |
Here's what happened in AI today: |
- OpenAI's model solved an 80-year math problem.
- OpenAI and Anthropic's revenue race got weird fast.
- Trump delayed an AI order as California prepared workers.
- Qwen 3.7 Max ran an agent for 35 hours.
- Use Codex /goal mode for long tasks.
|
Hey: Want to reach 700,000+ AI-hungry readers? Advertise with us! |
P.S: Love robots? We're starting a new robotics newsletter! Sign up early here. |
|
OpenAI says its model solved an 80-year math problem by... disproving it. |
Math has one perk for AI watchers: eventually, somebody checks the work. |
That makes OpenAI's new claim worth paying attention to. The company says an internal reasoning model disproved the Erdős unit distance conjecture, a discrete geometry problem from 1946. If that all made you go "Huh?", keep scrolling. |
Here's the basic explanation of the problem: if you place n points on a flat plane, how many pairs can sit exactly one unit apart? For decades, many mathematicians believed square-grid style patterns were basically the best possible answer. |
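To make the counting concrete, here is a minimal Python sketch of the classical grid idea. It is a toy illustration only, not OpenAI's construction, and the helper names `square_grid` and `count_pairs_at_squared_distance` are made up for this example. It counts pairs in a 10x10 integer grid at distance 1 and at distance 5. Shrinking the grid by a factor of 5 would turn the distance-5 pairs into unit-distance pairs, which is why a well-chosen scale beats the obvious one.

```python
from itertools import combinations


def square_grid(k):
    """Return the k*k integer lattice points (x, y) with 0 <= x, y < k."""
    return [(x, y) for x in range(k) for y in range(k)]


def count_pairs_at_squared_distance(points, d2):
    """Count unordered pairs of points whose squared distance equals d2.

    Working with squared integer distances keeps the comparison exact,
    so no floating-point tolerance is needed.
    """
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == d2
    )


grid = square_grid(10)  # n = 100 points

# Distance 1: only horizontal and vertical neighbors qualify.
pairs_d1 = count_pairs_at_squared_distance(grid, 1)

# Distance 5: straight steps of 5 plus the 3-4-5 diagonal steps qualify.
pairs_d25 = count_pairs_at_squared_distance(grid, 25)

print(f"pairs at distance 1: {pairs_d1}")
print(f"pairs at distance 5: {pairs_d25}")
```

With 100 points, the distance-5 count comes out higher than the distance-1 count, because 25 can be written as a sum of two squares in more than one way (0 + 25 and 9 + 16).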
And yet, OpenAI's unreleased reasoning model apparently found a counterexample: a new infinite family of point arrangements that creates more unit-distance pairs than the old grid-based belief allowed. That means the model "solved" the problem by proving the conjecture was false. |
Here's what happened: |
- OpenAI said the original proof came from a general-purpose reasoning model, rather than a system specially trained, scaffolded, or targeted for this problem.
- The proof shows infinitely many point sets with at least n^(1+δ) unit-distance pairs, for some fixed δ > 0. That beats Erdős's old n^(1+o(1)) conjecture, which roughly meant "only a tiny bit better than linear."
- External mathematicians published companion remarks verifying and explaining the result.
- Princeton mathematician Will Sawin sharpened it, showing more than n^1.014 unit-distance pairs for arbitrarily large point sets.
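For readers who want the bounds side by side, here is a rough LaTeX summary of the claims above. The symbol u(n), for the largest number of unit-distance pairs among n points in the plane, is notation assumed for this sketch and does not come from OpenAI's write-up. The block assumes the amsmath package.

```latex
% u(n) is notation assumed here for illustration.
\[
  u(n) \;=\; \max_{\substack{P \subset \mathbb{R}^2 \\ |P| = n}}
  \#\bigl\{ \{p, q\} \subseteq P : \lVert p - q \rVert = 1 \bigr\}
\]
\begin{align*}
  \text{Conjecture (Erd\H{o}s, 1946):}\quad
    & u(n) \le n^{1 + o(1)} \\
  \text{Reported disproof:}\quad
    & u(n) \ge n^{1 + \delta}
      \text{ for infinitely many } n,\ \text{some fixed } \delta > 0 \\
  \text{Sawin's sharpening:}\quad
    & u(n) > n^{1.014}
      \text{ for arbitrarily large } n
\end{align*}
```

The gap is in the exponent: an o(1) term shrinks toward zero as n grows, while a fixed δ does not, so the new lower bound eventually outgrows anything the conjecture allowed.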
|
Why this matters: This is a cleaner test of AI reasoning than a benchmark (a standardized model test). Benchmarks can reward lucky guesses. A proof has to survive expert review, line by line. |
The proof used algebraic number theory (math about number systems), including class field towers and Golod-Shafarevich theory, to crack a geometry problem that sounds simple. |
TechCrunch noted an earlier OpenAI Erdős claim fell apart after the model surfaced existing results. This time, outside mathematicians signed the companion remarks, including some critics of that previous episode. OpenAI turned their haters into benchmarks, basically. |
Elliot Glazer added an interesting POV on this too: AI may surface answers humans could have found, but didn't have the time (or the will) to go after because they didn't seem worth finding. Only so many experts can spend years hunting for a counterexample the field doubts exists in the first place. |
Our take: Think about the loop here: the model found the weird route, humans checked the work, Codex helped clean up the write-up, and Princeton's Will Sawin showed the construction's edge compounds at huge scale, which is why the result matters beyond "AI found a math trick." |
Math is unusually AI friendly because proofs can be checked. Biology, medicine, and business strategy have messier feedback loops. Greg Kamradt of ARC Prize recently shared a nice breakdown of the 7 levels of verifiability, a spectrum that ranks tasks by how long it takes to get "feedback" on whether your actions led to the outcome you wanted. Read our deep dive on the topic here. |
|
|
|
|
The Vanta Agent is the sharpest GRC engineer you've never had to hire, working tirelessly across the platform to draft policies, complete questionnaires, and flag issues before they escalate. |
Fast-moving companies like Ramp and Cursor use Vanta to get and stay compliant, simplify their audit process, and unblock deals, so teams can get back to building. |
Ready to learn more? Watch the on-demand demo to see how Vanta works. |
|
|
Long agent tasks fail when the AI forgets what "done" means. Codex's /goal mode fixes that by giving it a persistent objective it can keep checking as it works. |
Use this for tasks with many steps: migrations, refactors, audits, bug sweeps, or report generation. The trick is to write the goal like a mini contract: outcome, constraints, and tests. If /goal does not appear, OpenAI says you can enable features.goals in config.toml or run codex features enable goals. |
Try this: |
/goal
Audit this project for newsletter draft readiness.
Definition of done:
1. Every section has the required header.
2. Every hyperlink is attached to a short, natural anchor.
3. No Treats or Around the Horn bullets use bold text.
4. Every technical term has a plain-English parenthetical on first use.
5. Return a short report with pass/fail status and exact fixes made.
Before editing, make a checklist. After editing, run the checklist again and show me what changed.
|
Total AI beginner? Start here (goes with this video). |
Have a specific skill you want to learn? Request it here. |
|
|
|
|
|
|
 | Click the image above to watch on YouTube! |
|
ICYMI: Ben Cherry of LiveKit joined us on The Neuron's weekly livestream to show us how to build real-time voice agents that can listen, interrupt, call tools, and run in production. And guess what? It's so easy, even an agent can do it! You can literally grab the transcript from YouTube (click "... more" under the vid description, scroll down to click "Show Transcript", then copy the transcript and give it to your Codex / Claude to set up for you). |
It's a super fun episode; Ben shows how to launch an agent via LiveKit (and what code repos to use if that kinda thing doesn't intimidate you), edits his agent live with Claude Code, and even clones his own voice for us live. Click here to watch. |
|
Around the Horn |
- OpenAI reportedly generated about $5.7B in Q1, nearly $1B ahead of Anthropic, while Anthropic is projected to more than double to $10.9B in Q2.
- California signed a first-in-the-nation order to prepare workers and small businesses for AI disruption.
- xAI's Grok reportedly flopped with U.S. government buyers, with Reuters finding only three identified federal use cases.
- Intuit planned to lay off 3,000+ workers, about 17% of its workforce, to simplify the company and refocus on AI products.
- Remember Monday's story? Samsung will reportedly distribute about $26.6B in bonuses to chip workers, averaging roughly $340K per employee. That's how important chips are!
- NVIDIA CEO Jensen Huang said CPUs built for AI agents could become a new $200B market for the company.
- Taiwan sought to detain three people accused of forging documents to smuggle NVIDIA AI chips to China, Hong Kong, and Macau.
|
|
|
Your next great hire lives in Slack. |
|
Viktor is an AI coworker that connects to your tools and ships real work. Ask Viktor to pull a report, build a client dashboard, or source 200 leads matching your ICP. Most teams hand over half their ops within a week. |
Add Viktor to Slack for free. |
|
Intelligent Insights |
- Data filtering: this research paper argues larger models can benefit from messy data that smaller models cannot use well (basically scaling laws for data).
- AI oversight: the U.K. AI Security Institute warned that today's oversight methods may degrade as models get more capable.
- Arena's frontier: Arena.ai found GPT-4-level model quality is now roughly 500x cheaper than it was in 2023 (and 4 other insights you might like!).
- Accessible for AI: a sharp TechPolicy Press critique of llms.txt, MCP servers, and other machine-readable web infrastructure that may leave disabled users behind.
- If you read anything today, read this: After Automation from Dan Shipper argued AI creates more work for humans by flooding the world with generic output and raising the value of taste, context, and judgment.

I'll end the newsletter with this insight:
|
 | Source: Dan Shipper @ Every; image linked to the article |
|
|
|
 | Don't say we never highlight negative feedback! |
|
|
| That's all for now. | | What'd you think of today's email? | |
|
|
P.P.S: Love the newsletter, but only want to get it once per week? Don't unsubscribe; update your preferences here. |