The Human in the Loop: Why the Creator of the World's Most Famous AI Coder Is Begging You Not to Fire Your Developers

On a sweltering Thursday in late May, Scott Wu did something that, by the logic of the current AI hype cycle, should be economically irrational: he told the world that his own product shouldn't replace human beings. Wu, the three-time International Olympiad in Informatics gold medalist and co-founder of Cognition AI, is the architect of Devin—arguably the most successful AI coding agent on the market, a tool that has terrified junior developers and thrilled venture capitalists in equal measure. Yet in an interview published May 29, Wu explicitly stated that Devin "isn't designed to supplant human programmers" [1]. This is not false modesty. It is not a founder trying to downplay expectations ahead of a down round. Cognition had just announced a $1 billion fundraising round at a $25 billion pre-money valuation two days prior, with an annualized revenue run rate of $492 million [2]. Wu speaks from a position of immense leverage. And his message cuts directly against the grain of an industry that has spent the last eighteen months convincing itself that software engineers are about to go the way of the switchboard operator.

The timing of this message is everything. We are living through a peculiar moment in enterprise software—a moment when the people building the tools are also the people telling you not to use them too aggressively. To understand why Wu is taking this stance, and why it matters far beyond Cognition's Palo Alto offices, we need to examine the technical realities of AI coding agents, the financial incentives of the companies that sell them, and the human cost of treating software engineering as a commodity to be automated away.

The $25 Billion Contradiction

Let's start with the numbers, because they are genuinely staggering. Cognition AI, a company that barely existed in the public consciousness two years ago, has reached a $25 billion pre-money valuation on the back of a product that—by its own creator's admission—is not designed to replace the very people who would buy it [1][2]. The $1 billion raise is not a sign of desperation; it signals market conviction. With $492 million in annualized revenue run rate, Cognition has more than doubled its valuation in eight months [2]. These are not the numbers of a company struggling to find product-market fit. They belong to a company that has found something enterprises are willing to pay for, and pay for handsomely.

But here is the contradiction the market has not fully processed: if Devin genuinely replaces human programmers, then the total addressable market for the product is, by definition, shrinking. Every engineer that Devin replaces is one less engineer who can advocate for, configure, or expand Devin within an organization. The economics of automation have always had this self-limiting property—the more you automate, the fewer humans remain to manage the automation. Wu seems to understand this intuitively. By positioning Devin as a collaborative tool rather than a replacement, he protects his own market. A world in which Devin augments every developer is a world in which Cognition sells a license for every developer. A world in which Devin replaces developers is a world in which Cognition sells a license for a shrinking number of humans, and then eventually sells nothing at all because no one remains to buy it.

This is not merely cynical positioning. A genuine technical argument exists here, one that Wu has been making consistently. Devin is an AI coding agent—it can write code, debug issues, deploy applications, and even manage pull requests. But it operates within constraints invisible to the casual observer. It does not understand business context. It cannot navigate office politics. It cannot make judgment calls that require understanding why a particular feature matters to a particular customer at a particular moment. These are not bugs that can be patched in the next release. They are fundamental limitations of the current architecture of large language models.

The Return of the Native: What New Mothers See That Executives Don't

To understand what Wu is actually protecting, it helps to look at a demographic rarely centered in conversations about AI productivity: new mothers returning to software engineering jobs after parental leave. A Wired report published May 28 paints a stark picture of women coming back to "an AI-pilled workplace they barely recognize" [4]. These are not junior developers afraid of being automated. These are experienced engineers who left for a few months and returned to find that the entire nature of their job had transformed in their absence.

The Wired piece captures something that valuation analysts at venture capital firms tend to miss: the human cost of rapid AI adoption is not evenly distributed. New mothers, who already face significant career penalties for taking time off, now return to codebases partially rewritten by AI agents, workflows restructured around AI copilots, and performance metrics that assume everyone uses AI tools effectively [4]. The cognitive load of re-entering a workplace that has been "radically reshaped" is immense, and it falls disproportionately on people already navigating the challenges of new parenthood.

This is the hidden externality of the AI coding boom. When executives talk about "10x productivity gains" from tools like Devin, they implicitly assume that the humans in the system can adapt instantly to the new paradigm. But adaptation takes time, and time is not free. The Wired article suggests that the return-to-office mandates many tech companies have implemented are colliding with the AI transformation in ways that are particularly punishing for women [4]. You cannot simultaneously demand that employees return to the office five days a week and expect them to seamlessly integrate AI tools that were not part of their workflow when they left. The friction is real, and it is being absorbed by the most vulnerable members of the engineering workforce.

This is where Wu's message becomes more than just a PR strategy. If Devin is truly designed to augment rather than replace, then integrating it into a team should not require everyone to relearn their job from scratch. But evidence from the field suggests this is not happening. The Wired piece documents cases where entire codebases have been refactored by AI agents, leaving returning engineers to reverse-engineer the logic behind the changes [4]. That is not augmentation. That is replacement by another name.

The Polymarket Problem: When Engineers Know Too Much

The tension between human judgment and automated systems is playing out in other domains as well, and the results are not always flattering to the humans involved. On May 28, the FBI charged a Google software engineer named Michele Spagnuolo with insider trading after he allegedly used internal search data to win $1.2 million on Polymarket bets [3]. Spagnuolo, an Italian citizen living in Switzerland, was arrested and brought before a federal judge in New York [3]. The charges stem from bets related to which public figures would top Google's rankings for the most searched names in 2025.

This case is relevant to the Devin conversation for reasons that may not be immediately obvious. Spagnuolo was a software engineer at one of the most data-rich companies in the world. He had access to internal search data the public could not see. He used that access to make informed bets on a prediction market. The FBI's case is straightforward: that constitutes insider trading because the information was material and non-public [3].

But think about what this means for the future of AI coding agents. If a human engineer can face insider trading charges for using internal data to make predictions, what happens when an AI agent does the same thing? Devin, like all coding agents, operates on the codebase it is given access to. If that codebase contains proprietary information about upcoming products, user data, or financial projections, the agent's outputs could inadvertently violate securities laws. The agent does not know it should ignore certain data. It has no concept of material non-public information. It simply processes whatever it receives.

This is not a hypothetical concern. As AI coding agents become more autonomous, they will inevitably gain access to more sensitive data. The Spagnuolo case demonstrates that the legal system is already thinking about how to police the boundary between legitimate access and improper use of information [3]. But the law is designed for human actors who can form intent. An AI agent cannot form intent. It cannot know it is doing something wrong. This creates a liability gap that companies like Cognition are only beginning to grapple with.

Wu's insistence that Devin should not replace humans takes on a different dimension in this context. If a human is always in the loop, then the human can be held responsible for the agent's actions. The human can review the agent's outputs for compliance with securities laws, data privacy regulations, and company policies. Remove the human, and you remove the accountability structure. The agent becomes a black box that can generate legally actionable outputs without any human being able to say, "I reviewed that and approved it."
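To make the accountability argument concrete, here is a minimal sketch of a human approval gate, written in Python. It is an illustration only: the names AgentChange and ReviewGate are hypothetical, and nothing here reflects Devin's actual interface or any Cognition API. The sketch assumes a simple rule, that no agent-generated change ships until a named person has signed off, and that every sign-off is recorded.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class AgentChange:
    """A change proposed by a coding agent (hypothetical structure)."""
    summary: str
    diff: str
    approved_by: Optional[str] = None  # stays None until a human signs off


@dataclass
class ReviewGate:
    """Refuses to deploy any change that lacks a named human approver."""
    audit_log: List[Tuple[str, str]] = field(default_factory=list)

    def approve(self, change: AgentChange, reviewer: str) -> None:
        # Record who took responsibility for the change.
        change.approved_by = reviewer
        self.audit_log.append((change.summary, reviewer))

    def deploy(self, change: AgentChange) -> str:
        # Without an approver there is no one accountable, so stop here.
        if change.approved_by is None:
            raise PermissionError("no human reviewer has approved this change")
        return f"deployed '{change.summary}' (approved by {change.approved_by})"
```

In this toy version, calling deploy on an unreviewed change raises an error, while an approved change leaves an audit entry naming the reviewer. That entry is the "I reviewed that and approved it" record that disappears once the human is removed from the loop.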

The Architecture of Augmentation: What Devin Actually Does

To understand why Wu is taking this position, we need to look at what Devin actually does and, more importantly, what it does not do. Devin is an AI coding agent—a system that can take a high-level task description and autonomously write, test, and deploy code to accomplish that task. It is built on top of large language models, but it is not simply a chatbot that generates code snippets. Devin has its own development environment, its own terminal, its own browser. It can plan multi-step software engineering projects, debug its own errors, and even manage pull requests.

But here is the critical detail that gets lost in the hype: Devin is not creative in the human sense. It cannot invent a new architecture for a problem that has never been solved before. It cannot look at a messy business requirement and infer the unstated assumptions behind it. It cannot navigate the social dynamics of a team to understand why a particular approach was rejected in favor of another. These are not limitations that can be solved by scaling up the model. They are fundamental to the nature of the technology.

The table-transformer-structure-recognition model on Hugging Face, downloaded over 1.3 million times, is a good analogy for what Devin can and cannot do. That model excels at recognizing the structure of tables in documents: it can identify headers, rows, columns, and relationships between data points. But it cannot tell you whether the data in the table is accurate. It cannot tell you whether the table is relevant to your business question. It cannot tell you whether the table should exist at all. It is a tool for a specific task, and it is very good at that task, but it has no understanding of the context in which the task is being performed.

Devin is similar. It writes code that conforms to established patterns with remarkable reliability. It can implement a REST API endpoint, write unit tests, and deploy to a cloud environment. But it cannot tell you whether the endpoint should exist in the first place. It cannot tell you whether the business logic is correct. It cannot tell you whether the feature you are building will actually solve the customer's problem. Those judgments require human understanding of the business context, the competitive landscape, and the user's actual needs.

The $492 Million Question: Who Is Buying, and Why?

The $492 million in annualized revenue run rate that Cognition has achieved is not coming from companies that have replaced their entire engineering workforce with AI agents [2]. It comes from companies that have purchased Devin as a productivity tool for their existing engineers. The buyers are CTOs and VPs of Engineering looking to squeeze more output from their current teams without hiring additional headcount. They are not looking to fire everyone and let the AI run the show.

This is an important distinction the market is only beginning to understand. The enterprise software buying cycle is driven by budget holders measured on team output, not team size. A CTO who can deliver 20% more features with the same headcount is a hero. A CTO who fires half the team and hopes the AI can pick up the slack is a liability. The risk of catastrophic failure (a production outage, a security breach, a compliance violation) is simply too high to entrust critical systems to an autonomous agent without human oversight.

This is why Wu's message is actually good for business. By telling customers that Devin is not a replacement tool, he gives them permission to buy it without fear. He signals that Cognition understands the enterprise risk profile and builds for augmentation, not automation. The $25 billion valuation reflects the market's belief that this approach is correct [2]. If Cognition had positioned Devin as a replacement tool, the valuation would likely be lower, because the total addressable market would be perceived as smaller and riskier.

But a tension exists here that cannot be resolved by messaging alone, because the technology is advancing rapidly. Consider the face_recognition library on GitHub. Written in Python, with 56,190 stars and 13,704 forks, it describes itself as "the world's simplest facial recognition api for Python and the command line," and it has been downloaded millions of times. It shows that even relatively simple AI tools can achieve widespread adoption when they are easy to use and solve a clear problem. Devin follows the same trajectory. As it gets better, the temptation to use it more autonomously will grow. The line between augmentation and replacement is not fixed. It moves as the technology improves.

The Hidden Risk: What the Mainstream Media Is Missing

The mainstream coverage of AI coding agents has focused on two narratives: the utopian vision of 10x productivity gains and the dystopian vision of mass unemployment. Both narratives miss the real story, which is about the changing nature of software engineering as a profession.

The Wired piece about new mothers returning to AI-reshaped workplaces captures a piece of this, but the full picture is broader [4]. Software engineering has always rewarded continuous learning. The half-life of technical knowledge in this field is measured in years, not decades. But the pace of change has accelerated dramatically with the introduction of AI coding agents. Engineers who took six months of parental leave return to find the tools they mastered are obsolete. Engineers who spent years building expertise in debugging find that the AI can debug faster than they can. Engineers who prided themselves on writing clean, efficient code find that the AI can write code cleaner and more efficient than theirs.

This is not replacement in the literal sense of firing people and hiring AI. It is replacement in the more insidious sense of making human expertise less valuable. When the AI can do 80% of what a senior engineer can do, the premium that senior engineer can command for their remaining 20% of unique value compresses. The engineer is not fired, but their bargaining power diminishes. Their salary growth slows. Their career trajectory flattens.

The Spagnuolo case adds another dimension [3]. If engineers use internal data to make personal bets, that suggests a level of disengagement worrying for employers. But it also suggests something else: engineers are bored. When the AI handles routine work, the human engineer is left with either high-level strategic thinking or nothing at all. Some engineers will thrive in this environment. Others will find ways to occupy their time that do not align with their employer's interests.

The Editorial Take: Why Wu Is Right, But Not for the Reasons He Thinks

Scott Wu is correct that AI coding agents should not replace humans. But the reasons he gives—that Devin is designed for augmentation, that human judgment is irreplaceable, that the technology has fundamental limitations—are only part of the story. The deeper reason is that replacement is a bad business model for the companies building these tools.

Consider the economics. If Devin replaces a human engineer, Cognition loses a potential customer. That human engineer is no longer there to advocate for the tool, to configure it, to train new users on it. The organization that replaced the engineer now has fewer people who understand how to get value from Devin. Over time, the tool becomes underutilized, and the subscription is not renewed. This is the opposite of what a SaaS company wants. A SaaS company wants to maximize the number of users per account, not minimize it.

This is the insight the market is missing. The $25 billion valuation of Cognition is not a bet on automation. It is a bet on augmentation [2]. It is a bet that enterprises will continue to employ human engineers and will pay for tools that make those engineers more productive. It is a bet that the human engineer is not going away, but that the nature of their work will change.

The question is whether this bet is correct. Evidence from the field is mixed. The Wired piece suggests the transition is painful, especially for people already marginalized in the tech industry [4]. The Spagnuolo case suggests some engineers are responding to the changing nature of their work in legally problematic ways [3]. The technical limitations of current AI systems suggest full automation is still a distant prospect, but the rate of improvement is accelerating.

What is clear is that the conversation about AI and employment needs to move beyond the binary of replacement versus augmentation. The reality is more complex. Some tasks will be automated. Some jobs will be eliminated. But new tasks will emerge, and new jobs will be created. The challenge is managing the transition in a way that does not leave people behind.

Wu's message, whether he intends it or not, recognizes this complexity. He is not saying AI coding agents will never replace humans. He is saying they should not, and that the companies building these tools have a responsibility to design them in a way that preserves human agency. It is a noble sentiment, and it is also good business. The question is whether it will survive contact with the market's relentless demand for efficiency.

The answer, as always, will depend on the choices executives make in the coming years. They can use tools like Devin to augment their teams, investing in training and support to ensure no one is left behind. Or they can use them to squeeze every last ounce of productivity out of their workforce, treating human engineers as expensive components to be minimized. The technology is the same. The outcome is not.

Scott Wu has made his choice clear. The rest of the industry will have to make theirs.

References

[1] TechCrunch. Cognition's Scott Wu says AI coding agents shouldn't replace humans. https://techcrunch.com/2026/05/29/cognitions-scott-wu-says-ai-coding-agents-shouldnt-replace-humans/

[2] TechCrunch. AI coding startup Cognition raises $1B at $25B pre-money valuation. https://techcrunch.com/2026/05/27/ai-coding-startup-cognition-raises-1b-at-25b-pre-money-valuation/

[3] Ars Technica. FBI says Google engineer used internal search data to win $1.2M on Polymarket. https://arstechnica.com/tech-policy/2026/05/fbi-says-google-engineer-used-internal-search-data-to-win-1-2m-on-polymarket/

[4] Wired. New Moms Are Returning to Coding Jobs Radically Reshaped by AI. https://www.wired.com/story/women-parental-leave-return-office-ai/
