Also True for Humans

I Interviewed an AI Version of GitHub’s COO—Then Spoke to the Real One

Here’s what my AI simulated interview taught me about getting the most out of a conversation

I’ve attended many tech conferences as a participant and a speaker, but this year’s Microsoft Build, the company’s flagship developer event, was my first as a member of the press.

To quell the imposter syndrome, I tried an experiment before I sat down with GitHub chief operating officer Kyle Daigle, a true GitHub veteran who joined the company as a developer 13 years ago. I built a simulated version of Kyle—an AI persona distilled from his public writing, talks, and interviews—and asked the AI Kyle the same questions I planned to ask the real one.

I expected the output to be either eerily accurate or useless. It was neither, which is precisely what made it valuable.

Out of 12 questions, two responses were strong matches, four were partial matches, and six were material misses. To the simulation’s credit, when it lacked evidence, it said so instead of inventing something. Those holes were the most useful prep—they showed me what information wasn’t available on the public record, and therefore where I should spend my time in the live interview.

I’ve spent a lot of time talking to AI personas. My last startup, Ask Rally, was a virtual focus group tool. We found that AI is no substitute for the real thing, but in high-stakes scenarios, roleplay can help you get out of your own head, build confidence in your strategy, and avoid costly mistakes. We’re more predictable than we think, with some studies showing 85 percent accuracy in AI personas replicating real human responses.

What follows is the actual interview, with notes on what the simulation got right, what it missed, and where the comparison is interesting. We also went back to human Kyle—and his take surprised us more than the AI answers.

1. Expanding the definition of a developer

Mike: The demographics of the customer are changing. A lot of people who may never have used GitHub or developer products before are now using them. How has that changed the way you decide the product roadmap?

Kyle Daigle: GitHub has always had an expansive view of what a developer is. I started as a developer before I would have called myself a dev. I was writing code for myself, and I did not go to school for computer science. I was going to art school and wrote code to pay for art school.

That journey is important: I can create tools with a team and deliver them to people who want to build an app for themselves, their family, a startup, or a business. GitHub has serious developer tools used by the largest businesses, but when I look at something like the GitHub Copilot app, I see both developers running multiple projects and agent sessions and people on our legal or finance teams using it. Customers tell us the same thing. People the industry might call knowledge workers, or non-developers by trade, are using these tools to build little apps or assets.

Our focus is still very much on developers, but we want to make it easier for people to try writing code. There should always be an on-ramp into creating software, including through tools like the GitHub Copilot app.

Simulation note: A partial match. The AI Kyle correctly predicted the real Kyle’s thesis, that AI is expanding who gets to build software, and even offered a framework for testing it that the real Kyle plausibly could have mentioned: “The design test I keep coming back to is ‘no net new behavior.’ New capabilities should fit into the places where software work already happens.” But it couldn’t produce the art-school story or the legal-and-finance-teams example that made the real answer compelling.

2. Helping maintainers handle a flood of pull requests

Mike: How do you help developers deal with the burden of all the extra pull requests? Open source maintainers I talk to are drowning. What needs to happen to help them?

Kyle Daigle: For developers generally, we are building tools like Copilot code review. It is now agentic, so it finds more novel vulnerabilities, and you can comment and have the agent implement a change. Code review is an overlooked way to get pull requests into a state where they are much easier to review.

Agentic merge is another example. A pull request can be almost ready, but there are still manual steps to finish processing it. Instead, I can define what GitHub Copilot is allowed to do and tell it to merge the pull request, wait for CI, and wait for policies.

Open source has a unique set of needs because maintainers do not control who sends changes. We are focused on giving maintainers more control: whether they want to accept pull requests, who they want to accept them from, and how much work a contributor needs to do to demonstrate that a contribution will be meaningful. Every community is choosing a slightly different approach. GitHub wants to provide the building blocks and leave maintainers in control. If a standard practice emerges, we can cement a system around it, but we do not want to impose one first.

Simulation note: The AI got the governing principle right and the substance wrong. Its best line—“The system should give maintainers explicit rules and guardrails, not just a larger inbox”—is something the real Kyle could have said. But it named zero products. Copilot code review, agentic merge, contributor acceptance controls: all invisible to a persona built from public material, because they weren’t publicized until the event.

3. Growth in agent-generated activity

Mike: You have a front-row seat to this new agent economy. You said publicly that you have had more pull requests submitted in a month than in all of last year. How are those stats exploding?

Kyle Daigle: We are seeing much more activity on GitHub. Last October at GitHub Universe, we shared that there had been one billion commits on GitHub for the full year. We are on track for 14 billion if growth is linear this year, which it will not be. In March, 17 million pull requests were created by agents alone.

There is much more code being created. Sometimes people dismiss it as slop: code pushed up that nobody cares about. That is not really true. We are leaving the super-early-adoption stage. We are not at the peak, but we are climbing the hill and learning what we can build when it is not just Kyle building, but Kyle plus one, two, or N agents using my skills, resources, and context.

We are investing heavily in preparing for the next wave of growth because this does not seem to be growing and then plateauing. No matter where people build or what tools they use, the code ends up on GitHub for sharing and collaboration. We need to support everyone’s agent moment, not just GitHub Copilot.

Simulation note: Miss. The AI recited last year’s public numbers—it can only re-serve the stats you already have.

4. Business models for always-on agents

Mike: How does the business model change? Freemium makes sense in a human-centered world where we go to bed, but agents are still working while we are asleep. Does that move things toward usage-based pricing?

Kyle Daigle: I do not think we know yet. Right now, Kyle has a license or uses GitHub.com for free, and we have always had API rate limits. That is usually where people see agent backpressure.

If you want to do much more, with something like 150 agents doing things at once, we want to enable that. At the same time, I want you to have a great core GitHub experience, with some amount of agent usage as a necessary part of it. It is similar to how GitHub evolved from free public repositories but no free private repositories to giving individuals free private repositories because people reasonably had code they did not want to put out into the world.

GitHub evolves as the industry and community evolve. We focus on making sure developers have what they need to be successful, then work with enterprises to make sure they have what they need at scale. Those needs are usually somewhat different.

Simulation note: Both Kyles opened with the same thought: Nobody knows yet. But the real Kyle’s answer was far more illustrative because of the examples he used to explain his point, from a hypothetical 150 agents running at once to GitHub’s earlier move to free private repositories.

5. A dual role across GitHub and Microsoft

Mike: Pricing leads back into the wider Microsoft orbit. You now have a dual role, with partial responsibility for the wider marketing organization. How has that changed your work, and how do you prioritize between the two roles?

Kyle Daigle: I have been at GitHub for 13 years, as a developer myself and leading engineering teams for much of that time. What has always been unique about GitHub is that we focus intensely on the developer. Enterprises buying the tools is great, but we are not building for the buyers; we are building for the users.

That has been my focus as GitHub COO, which I continue to do. As Microsoft’s chief marketing officer of developer, my goal is to look across Microsoft’s developer tools and technology and make sure we bring holistic solutions that feel authentic to developer experiences.

At events like Build, we have taken a different approach this year. We are in San Francisco, and the vibe is different from a traditional conference-hall setup. The focus is: Can I go to a session? Can I use the thing? I do not want to be pitched a thing; I have to be able to use it. The goal is to bring GitHub’s expertise, care, and focus on the developer to a broader impact across Microsoft.

Simulation note: A clean miss. The AI Kyle questioned my facts: “The available persona... does not document a dual role with partial responsibility for a wider marketing organization. I would verify that premise before treating it as established.” Proof that simulations based on public materials have an expiry date.

6. Community speakers at Build

Mike: Did I hear you say that this is the first Build conference to have external contributors and speakers?

Kyle Daigle: It is the first Build where, by intention, we have focused on speakers from the community in primary sessions. The keynote included people such as Peter, and there were sessions from Svelte and others.

Software development is a team sport. It would be silly to think that one company or one group, including GitHub or Microsoft, could answer every question. We all use open source and build on major open source projects. We should invite people in to tell their part of the story together. That is what developers want, and feedback shows that people are excited to see perspectives from Microsoft, GitHub, and outside the company at the same event.

Simulation note: The AI declined to answer: “I cannot confirm that from the available persona.” But that’s what you want from a simulation—for it to admit when it doesn’t know.

7. Differentiating in a competitive market

Mike: This is a very competitive market, and the pace of change is quick. How do you differentiate?

Kyle Daigle: We continue to focus on our roots: developer choice, building for builders, and enabling builders. We went from an era of many APIs and broad access into a somewhat unintentional walled-garden setup, where people develop an affinity for one tool and then realize that trying something interesting elsewhere means learning a new tool or opening a new account.

We want developers building with GitHub to use other tools, and we will partner with everyone to make that as simple as possible. Other companies do similar things, but our ability to support choice across the entirety of building software, not only code generation or collaboration and review, is a real strength.

We will invest in our own technology, including new Microsoft AI models, while continuing to partner with Anthropic, OpenAI, Google, and anyone bringing a model or coding agent to market. We will let developers bring those tools to GitHub or use them through GitHub and GitHub Copilot. Choice is core. If we back down on that, developers will still choose; they will just be stuck in another walled garden.

Simulation note: This answer was the strongest match of the 12 to human Kyle’s response. The AI Kyle said: “The differentiator is not having one more agent in a crowded market...That means choice across providers rather than lock-in to a single model.” The real Kyle even reached for the same image: walled gardens. This was the part of the interview where I least needed the simulation.

8. Dogfooding while staying open to other tools

Mike: There was a recent news cycle about Claude Code licenses being canceled. How do you make the trade-off between dogfooding your own products, such as your new models or the GitHub Copilot desktop app, and letting developers experiment with other tools?

Kyle Daigle: We all use a variety of tools because otherwise you lose track and become too focused on your own work. I have been a daily MacBook user for years, and I use Windows PCs when I play video games. When I got this role, I set up my Mac, a PC, and an AlmaLinux box. I code most Saturdays, and I swap between the boxes because I want to understand each experience. I use the GitHub Copilot app only on Windows because developers on Windows deserve great apps too; it is not only about the Mac audience.

That is true across our teams. We look at coding agents, harnesses, desktop apps, memory management, and everything else. Everyone is building and using these tools. We put most of our energy into our own tools, but it is a blind spot to focus so narrowly that you lose perspective. When something new comes out, I want to know why people are having a great experience with it. That helps me understand where to focus and why a developer would pick a particular tool. The same goes for our teams.

Simulation note: This was the second strongest match. The AI nailed the concept in a line the real Kyle would happily sign his name to: “Dogfooding should sharpen the product. It should not become a reason to ignore the tools developers find useful.” It could not have known, however, that the COO of GitHub spends Saturdays rotating between a Mac, a PC, and an AlmaLinux box.

9. Filtering short-lived ideas from longer-term bets

Mike: A lot of these ideas are relatively short-lived, while enterprise product-development cycles are longer-lived. How do you filter ideas and decide what to pursue?

Kyle Daigle: In the short term, we are focused on supporting a multitude of agent sessions because that seems clear: everyone is doing it, so how can we cement it? Over the longer term, models will continue to improve, token economics will become a bigger factor in what people use, and I believe we are not far from having serious ability to use something above a small language model on a local device for some work.

If there is that much optionality around tokens, the consistent truth since ChatGPT and GitHub Copilot emerged is personalization, context, fine-tuning with context, and memory. There have been experiments across the industry, but not a long-term vision. Supporting many agents is important because someone using agents will not sit and stare at one agent working. But that alone will not produce a great long-term experience. A great experience is using an agent that feels like it is completing a thought for you without forcing you to codify that thought yourself.

The agent should be able to intuit that, or use post-training, fine-tuning, or frontier tuning to deeply understand how I work. Sometimes we work on the short term, and sometimes we take repeated attempts at the long term until something tangible helps us move forward.

Simulation note: This is where the simulation’s favorite crutch broke down. The AI Kyle reached, yet again, for human Kyle’s best-documented framework: “I use a constraint-first hierarchy: MUST, SHOULD, and COULD.” It used that framework in its answers to five of my 12 questions, and the real Kyle never mentioned it once. When a persona runs out of evidence, it over-applies the pattern it knows best. That’s a subtler failure than hallucination, but a failure nonetheless. As models get better, they should become less fixated on individual pieces of context and relax rules the way humans do.

10. Hill climbing as a product-development loop

Mike: I heard the term “hill climbing” 100 times yesterday. Can you talk about how that became such a big focus?

Kyle Daigle: Satya [Nadella], Mustafa [Suleyman], and Jacob [Andreou] leading the Copilot group talk about it often. The biggest thing we have learned is that using the tools has to be a core way to improve the underlying experience of the models. The evals are essential. Thumbs-up and thumbs-down data, whether people accept a suggestion, and how much they accept all contribute to creating a useful experience for everyone.

Every week we talk about hill-climbing results. We look at hard measures and soft measures because sometimes evals and rubrics show improvement while user sentiment crashes, even with the same latency and performance. You can overfit.

The goal is to run that loop quickly, then give everyone a hill-climbing machine without forcing them to do it the hard way. In an enterprise using M365, there is potentially rich data in assets, documents, and chats. Turning on something like frontier tuning and using an MAI Phi-3 model as a base can show real results without extra work. At first, that sounded like a magic trick that could not be real. But sometimes the opportunity is where something feels too simple to work.

That is why we say hill climbing so much. It is not a moonshot. It is climb, improve, add an eval, improve, add new data, and improve again. That is how we reached a point where we can launch models for ourselves and allow customers to use similar tooling.

Simulation note: The AI declined to bluff, saying, “The available persona does not document me using ‘hill climbing’ as a specific organizational term, so I would not manufacture an origin story for it,” then offered some abstract explanations of fast iteration. The real answer included new information that the model could never have had access to.
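The gate Kyle describes, where a change has to hold up on both hard and soft measures, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how GitHub or Microsoft actually run the loop: the `Measurement` fields, the `sentiment_tolerance` threshold, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    eval_score: float      # hard measure (assumed): eval or rubric pass rate, 0..1
    user_sentiment: float  # soft measure (assumed): share of thumbs-up feedback, 0..1


def accept_candidate(current: Measurement, candidate: Measurement,
                     sentiment_tolerance: float = 0.02) -> bool:
    """Keep a change only if evals improve and sentiment does not crash."""
    evals_improved = candidate.eval_score > current.eval_score
    sentiment_held = (
        candidate.user_sentiment >= current.user_sentiment - sentiment_tolerance
    )
    return evals_improved and sentiment_held


def hill_climb(baseline: Measurement, candidates: list[Measurement]) -> Measurement:
    """Walk through candidates in order, keeping each one that passes the gate."""
    best = baseline
    for candidate in candidates:
        if accept_candidate(best, candidate):
            best = candidate
    return best
```

In this sketch, a candidate that raises the eval score while user sentiment drops sharply is rejected, which is the overfitting case Kyle warns about.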

11. Keeping AI subscriptions affordable

Mike: Is hill climbing the answer to stopping a $200 subscription from becoming a $2,000 subscription?

Kyle Daigle: Frontier tuning models so they know you better is part of the answer. Another important part is helping developers automatically choose models.

Mike: Like the model router in GitHub Copilot?

Kyle Daigle: Exactly. GitHub has an automatic model router with task intent, and Microsoft Foundry has a model router at the API level. The more we can let people tell us where their bars are, such as “This is an incredibly hard problem and I am willing to go all the way to the top,” or, “I want to stay here,” the more we can help choose the model.

Tokens are often expensive because people choose the model of the day, week, or hour, and those models can be expensive. But a train of thought moves between hard problems and simple ones. I might ask an agent to do an enormous amount of work, then have a final step that is just changing names. I probably will not manually switch from an expensive model down to a smaller one just to save tokens for that step, but the tools could. That will help enterprises, individual developers, and people building automations with the Copilot SDK.

Simulation note: The AI got the gist of the problem right, but it couldn’t solve it. It framed the problem, noting that a model doing more work doesn’t automatically mean a cheaper bill, while the real Kyle named a tool for solving it: the automatic model router.
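A rough Python sketch of the routing idea: send each step to the cheapest model that clears its difficulty bar, capped by the ceiling the user sets. The model names, capability tiers, and prices below are invented for illustration, and the sketch says nothing about how the GitHub Copilot or Microsoft Foundry routers are actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    capability: int            # 1 = simple edits, 3 = hardest problems (assumed scale)
    cost_per_1k_tokens: float  # made-up price, for illustration only


# Hypothetical catalog; none of these are real products or prices.
CATALOG = [
    Model("small", 1, 0.10),
    Model("medium", 2, 1.00),
    Model("frontier", 3, 10.00),
]


def route(difficulty: int, ceiling: int = 3) -> Model:
    """Pick the cheapest model that can handle the step, within the user's ceiling."""
    allowed = [m for m in CATALOG if m.capability <= ceiling]
    if not allowed:
        raise ValueError("ceiling excludes every model in the catalog")
    capable = [m for m in allowed if m.capability >= difficulty]
    if capable:
        return min(capable, key=lambda m: m.cost_per_1k_tokens)
    # Nothing under the ceiling is strong enough: use the best model allowed.
    return max(allowed, key=lambda m: m.capability)


def plan_cost(steps: list[tuple[int, int]], ceiling: int = 3) -> float:
    """Total cost of a list of (difficulty, token_count) steps after routing."""
    return sum(
        tokens / 1000 * route(difficulty, ceiling).cost_per_1k_tokens
        for difficulty, tokens in steps
    )
```

With these made-up prices, a hard 50,000-token step followed by a 2,000-token rename costs less when routed than when both steps run on the top model, because the rename drops to the smallest model without anyone switching by hand.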

12. Unusual personal uses for agents

Mike: I made an AI version of you to practice this interview and found it immensely useful. What other unusual things are you seeing people do with agents internally or externally?

Kyle Daigle: I do something similar. I have one setup through the app and another Claude instance that cannot talk to work systems, so there is separation of state. I spend a lot of time having it read what I write. This interview will ultimately get fed into it. Every day I receive a communications report that says things like, “Kyle, you keep saying this,” or, “This is not super clear.” Because I write and speak in a particular way and like to use metaphors, it gives me examples of metaphors that are clear.

The self-improvement loop for a human is incredibly powerful. We used to talk about this with Hubot and ChatOps at GitHub: Humans are much more willing to take critical feedback from robots than from other humans. When my Claude instance tells me how poorly I did at something, I feel better asking it to explain why and using that feedback when I write emails, write a script, or review details.

A lot of my agent loop is about me rather than software. It looks backward: read Kyle’s emails and Slack messages from the last seven days, give feedback, then look back at the advice and check whether Kyle acted on it. That loop is extremely powerful. It is the kind of personal consumer AI experience I want.

Simulation note: Funny that the question about simulated interviews was also a miss. Asked to name unusual agent uses, the AI Kyle admitted it didn’t have much to offer and fell back on vague talk about removing toil. The real Kyle’s answer revealed a concrete, personal workflow that’s truly interesting and helpful.

The best interview preparation is knowing where to dig deeper

Every managing editor Eleanor Warnock wrote recently about what she calls Socrates-as-a-service: the human skill of pulling out ideas people haven’t yet put into words, like the anecdote that becomes a front-page story or the detail that crystallizes a philosophy into something a reader remembers.

That’s exactly the gap this experiment helped me see. The simulation knew about Kyle’s positions on developer choice and walled gardens because he’s made them clear in public for years. It couldn’t know that he codes on Saturdays and rotates between three machines to stay honest about other people’s tools.

Asked for comment on the experiment after the real interview, Daigle said, “I thought the simulated interview was pretty good! Mostly, it overindexed on my written work rather than my spoken interviews and podcasts. Without access to everything that I’ve written internally, it went harder on topics I’ve spoken about on my blog, etc. than I normally speak about.”

This is the real argument for doing an AI-simulated test run before an interview. It will show you where the gaps in the public record are, so you can spend the interview filling in those gaps and extracting truly original, scarce knowledge for your readers and the world.

Daigle found a use for the simulation’s answers, too. “Even reading the AI responses, I found it clarifying my thinking and the sharpness of my answers, so it helped me, too.”

Then he added: “I actually did a similar thing for Mike. What questions I expected him to ask. I didn’t save the output—I garbage-collect a lot of it—so it’s funny how we’re all operating.”

Maybe the joke was on me all along.

The simulated Kyle Daigle was built from public material current through May 31, 2026, using a modified version of the open-source library SynthTeam, and scored against the real transcript afterward: two strong matches, four partial, six misses. Full methodology and the unabridged synthetic interview are available on request.
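For anyone who wants to repeat the scoring, the tally is simple to reproduce in Python. The sketch below is not the SynthTeam scoring code. Each label comes from the simulation notes above, and where a note does not state a label outright (questions 2, 4, 6, 9, 10, and 11), the label is this sketch's own reading of it.

```python
from collections import Counter

# One label per interview question. Questions 1, 3, 5, 7, 8, and 12 are labeled
# explicitly in the simulation notes; the others are inferred (an assumption).
LABELS = {
    1: "partial", 2: "partial", 3: "miss", 4: "partial",
    5: "miss", 6: "miss", 7: "strong", 8: "strong",
    9: "miss", 10: "miss", 11: "partial", 12: "miss",
}


def tally(labels: dict[int, str]) -> Counter:
    """Count how many questions fall under each label."""
    return Counter(labels.values())


def gaps(labels: dict[int, str]) -> list[int]:
    """Questions where the public record fell short: the ones to dig into live."""
    return sorted(q for q, label in labels.items() if label == "miss")
```

Under that reading, `tally(LABELS)` reproduces the two strong matches, four partial matches, and six misses reported here, and `gaps(LABELS)` lists the questions worth the most time in the live interview.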

Mike Taylor is the head of tech consulting at Every and a co-author of Prompt Engineering for Generative AI (O’Reilly). To read more essays like this, subscribe to Every, and follow us on X at @every and on LinkedIn.