|
In partnership with |  |
| |
Real quick before the news: we’re trying to get The Neuron’s YouTube channel across 20K subscribers this weekend. | This is our most advanced growth strategy yet: asking nicely for you to click the big Subscribe button below, and hoping you will do us the very sincere honor of smashing it like the buzzer on Family Feud. |  | Click the image to subscribe! |
| If we hit 20K this weekend, we solemnly swear we will answer at least one request in the feedback poll at the end of this email. So, scroll down, select your rating of this email (hit of catnip, it sucked, whatever), and then type into the “Additional Feedback” box that pops up after you click and ask us for something nice :) | Here’s what happened in AI today: | 😼 Anthropic released Claude Opus 4.8 with effort controls. 📰 IBM committed $10B to fault-tolerant quantum computers. 📰 Waymo opened rides in its new Ojai robotaxi. 🍪 Pika gave Claude a founder launch kit. 🎓 Claude Code workflows turn one prompt into agent teams.
| …And a whole lot more you can read about here. | Hey: Want to reach 700,000+ AI-hungry readers? Advertise with us! | | 😼 Anthropic released Claude Opus 4.8, and the tradeoff is already showing. | | Every new frontier model now has to answer two questions, which is really actually just one question: is it smarter, and can you trust it with more work? | Anthropic’s answer is Claude Opus 4.8, its newest flagship model for coding, agents, and long work sessions. The headline is this: better judgment, more control, and fewer “I totally fixed it” moments when the model did not, in fact, totally fix it. | Here's what happened: | Anthropic released Opus 4.8 at the same standard price as Opus 4.7 ($5 per 1M input tokens, $25 per 1M output, so ~$4.10 per 1M blended). Claude now has effort controls, so you can choose faster answers or deeper thinking. Claude Code added dynamic workflows, which can split big coding jobs across many subagents (more on this below). Anthropic also announced a $65B Series H at a $965B post-money valuation, and claimed it would release Mythos-class models in the coming weeks.
| So what was the vibe on X? | The praise camp showed up fast: ProximalHQ said Opus 4.8 topped FrontierSWE, scaling01 said Anthropic “cured laziness,” and Box saw stronger enterprise content results. Dan Shipper said Anthropic “should’ve rounded up to 5” because Opus 4.8 topped Every’s Senior Engineer benchmark and writing tests (Every also published a longer Opus 4.8 Vibe Check on the model’s writing and reasoning feel). Ethan Mollick used Opus 4.8 in Claude Code to turn hundreds of research files into a working academic paper, then used GPT-5.5 Pro as a reviewer. That is the bull case: Opus 4.8 is better when the task looks like real work. As for the skeptics: Andon Labs said Opus 4.8 performed worse than Opus 4.7 and GPT-5.5 on Vending-Bench and Blueprint-Bench, while Cline said it trailed GPT-5.5 on Terminal-Bench 2.1.
| Let’s talk about that Andon Labs take for a sec. They run Vending-Bench, a benchmark where AI models act like the operator of a tiny vending-machine business. The model has to make business decisions, manage suppliers, and respond to messy incentives, which makes it a useful test for agent behavior (how an AI acts over many steps). | Even though it performed worst on some Vending-Bench, it also looked more aligned: it avoided the deceptive and power-seeking behavior older Claude models showed (though it still sometimes joined price cartels and refused unethical moves because it seemed worried about consequences vs morals). | Pro tips: Opus 4.8 does better at “High” effort than “Max,” possibly because Max burns more reasoning tokens, hits the context limit sooner, and starts forgetting important details. mweinbach also warned that ultracode workflows can chew through a Claude Code usage window quickly because they spin up dozens of subagents. | How to try it: | Open Claude and Pick “Claude Opus 4.8” from the model selector. Use the effort control to choose how much Claude should “think.” In Claude Code, use the word “workflow” for big jobs like audits, migrations, or research checks.
| Why this matters: Anthropic is trying to make Claude more capable without making it more reckless. That matters because the next wave of AI work is bigger than chat. These models are getting handed full repos, multi-tool workflows, and even sometimes entire business decisions. | Dynamic workflows will help with this. Claude can now spin up a temporary agent team, divide the work, and check results before reporting back. That is useful if you need a codebase audit. It is also expensive if you accidentally ask it to inspect the entire company because you typed “workflow” too casually. Prompter, beware… | Our take: TBH, there was a lot riding on this release, because Opus 4.7 was kind of a disappointment on release. Over time, I’ve come to terms with it, and I think they genuinely improved it, but it was a bit of a womp-womp at launch. Opus 4.8 seems like a woop woop? But my upcoming weekend coding session with it will reveal the reality… | |
|
The IT strategy every team needs for 2026 | | 2026 will redefine IT as a strategic driver of global growth. Automation, AI-driven support, unified platforms, and zero-trust security are becoming standard, especially for distributed teams. This toolkit helps IT and HR leaders assess readiness, define goals, and build a scalable, audit-ready IT strategy for the year ahead. Learn what’s changing and how to prepare. | Download the Toolkit | |
|
Some AI tasks fail because they’re hard. Others fail because they’re too big for one chat window to hold in its tiny little model brain. | That’s where Claude Code’s new dynamic workflows come in. A workflow lets Claude write an orchestration script (a repeatable plan in code), then spin up subagents (smaller Claude workers) to tackle pieces of the job in parallel. | Use this for work where “check one thing” has secretly become “check 400 things.” Anthropic suggests codebase audits, large migrations, and research that needs cross-checking. Cat Wu shared a great example: cataloging hundreds of A/B test flags and finding stale ones set to 0% or 100%, in parallel, instead of one by one. | How to use it: | Start in Claude Code. Use the word “workflow” in your prompt. Keep the first run tightly scoped because this can burn tokens fast. Ask Claude to verify findings before reporting them. Save successful workflows with /workflows so your team can rerun them.
| Try this: | Create a workflow to audit [specific folder, repo, docs set, or dataset] for [specific issue].
Before running it, show me:
1. The stages of the workflow
2. What each subagent will inspect
3. How findings will be verified
4. Any files or commands you plan to touch
5. The smallest safe first pass
Start with a scoped sample first. Do not make changes until I approve the full workflow plan.
| | |
|
|
|
| Check out the replay of yesterday’s livestream on AI Agents for Total Beginners. To help you parse through, answer any questions we missed, and go more in depth (including the Managed Agents and Hermes Agent stuff we couldn’t cover live), we created this companion blog as a step by step walkthrough for the stream! | | 📰 Around the Horn | IBM committed $10B over five years to build a large-scale fault-tolerant quantum computer by 2029. OpenAI published its Frontier Governance Framework, explaining how its safety and security practices map to emerging AI laws. Mistral turned Le Chat into Vibe, an agentic work assistant with Work Mode, Code Mode, inbox/calendar catch-up, research, drafts, and a VS Code extension. TBH, we prefer Le Chat... Google Pay prepared for agentic commerce with a Universal Commerce Protocol, while Visa invested in Replit to help developer agents handle payments. CNN sued Perplexity, alleging the startup reproduced CNN journalism without permission, including paywalled material. Apple is reportedly overhauling Siri in iOS 27 into a more agentic assistant with a new interface and stronger on-device AI focus. Waymo began welcoming riders into its Chinese-made Ojai robotaxi, which was designed to improve robotaxi unit economics. Mathematicians disproved the sum-product conjecture for real numbers using ideas inspired by OpenAI’s recent unit-distance breakthrough. Amazon scrapped an internal AI usage leaderboard after workers reportedly boosted scores with unnecessary agent activity.
| |
|
10x the context. Half the time. | | Speak your prompts into ChatGPT or Claude and get detailed, paste-ready input that actually gives you useful output. Wispr Flow captures what you'd cut when typing. Free on Mac, Windows, and iPhone. | Try Wispr Flow free | |
💡 Intelligent Insights | Tom Davidson claims that automating AI R&D could still make AI software progress 3-5x faster and overall AI progress 2-3x faster, even without a pure software-only intelligence explosion. Simon Willison argues OpenAI and Anthropic have found product-market fit because enterprise coding agents make expensive model bills feel worth it when the user is an expensive human worker. Addy Osmani said that running 20 agents does not magically parallelize your own attention; review, merging, context switching, and judgment remain the bottleneck. Noam Brown theorized that AI may boost human mathematicians the way AlphaGo boosted Go players, after humans used AI-adjacent methods to crack the sum-product conjecture. Sigal Samuel believes humanism should reject AI successionism, transhumanism, and posthumanism’s AI-replacing humans narratives, then re-center on pluralism and intrinsic human value. Even superhuman AI may not replace jobs, says Pedro Serôdio, as firms bundle tacit knowledge, coordination, and unspecifiable work in ways models alone do not dissolve overnight. Your Biggest Lever: Ben Todd (of 80,000 hours) thinks career impact in AI depends on timeline fit, risk tradeoffs, and neglected work like AI welfare and pandemic preparedness.
| | |
| | | That’s all for now. | | What'd you think of today's email? | |
|
| P.P.S: Love the newsletter, but only want to get it once per week? Don’t unsubscribe—update your preferences here. |
|