AI Is Writing More Code Than Humans Can Review — and That’s Becoming a Big Problem

Engineering has always meant two things at once: Someone with a name is accountable, and the work can be verified. The second half is being deleted.

By Prince Kohli | edited by Chelsea Brown | Jun 03, 2026

Opinions expressed by Entrepreneur contributors are their own.

Key Takeaways

  • AI agents ship code faster than humans can review it. This has created an accountability gap where vulnerabilities can ship without a clear owner responsible for the outcome.
  • Engineering has always meant two things at the same time: Someone with a name is accountable, and the decisions that person made are verifiable by anyone willing to check.
  • That principle needs to follow the new performers (agents), not be discarded because of them.

On March 3, a security researcher told Lovable, the $6.6 billion AI coding platform, that its apps were leaking customer data. HackerOne, the disclosure platform, closed the report as a duplicate. The flaw sat open for 48 days. On April 20, the researcher went public. Among the exposed accounts were employees of Nvidia, Microsoft, Uber and Spotify.

Lovable’s first response was a denial. Its second blamed the documentation. Its third blamed HackerOne. None of those answered the question this industry needs to start answering out loud: When an AI ships software that leaks customer data, who is responsible, and how do we know?

The second half of that question is the one engineering actually knows how to solve. We have solved versions of it before. Engineering, as a discipline, has never just been “build the thing.” It has always meant two things at the same time: Someone with a name is accountable, and the decisions that person made are verifiable by anyone willing to check.

A bridge engineer signs off on the load calculations because someone has to be on the hook, but also because the math itself is reproducible. A pharmaceutical batch is signed by a person, but the batch is traceable because the data exists. The two halves work together. Take either one out, and you don’t have engineering anymore. You have manufacturing on a hope and a prayer.

That coupling, responsibility plus reproducible evidence, is what every grown-up industry figured out over a century of failure. It is also the quiet thing software is at risk of losing right now, not because anyone has decided to lose it, but because the rate at which agents now ship code has outrun the systems we use to verify what they shipped.

Consider the volume. At Stripe, internal AI agents called Minions ship more than 1,000 pull requests a week. Anthropic launched a dedicated code-review product because, as its head of product put it, “Claude Code is putting up a bunch of pull requests.” Cognition’s Devin agent has merged hundreds of thousands of PRs across thousands of customer companies in 18 months. Former GitHub CEO Thomas Dohmke says the quiet part out loud: “Soon, developers won’t look at the code anymore, as agents will write way more than humans can review.” And of course, YOLO mode in Claude Code is gaining popularity.

That, by itself, is not the problem. Agents writing code is the new normal, and we at Sauce Labs have been building towards it for years. The problem is that the systems used to verify what code does — test coverage, security scans, the chain of custody from commit to deploy — were designed around the assumption that a human had already looked at it. Take the human out, and the verification doesn’t automatically shift to fill the gap. It quietly disappears.

The most recent Veracode benchmark, two years and a hundred-plus model releases later, found that the security pass rate of AI-generated code has moved from approximately 55% to… approximately 55%. The syntax got better. The judgment did not. That is not a model problem. It is a systems problem. Humans used to be the part of the pipeline that asked “Is this safe to ship?” and that part of the pipeline has not been replaced. It has been deleted.

Other industries are figuring out how to update their principles for the new performers faster than software is. In late April, California’s DMV adopted new rules, taking effect July 1, that finally let police ticket the company that runs a driverless car. State traffic laws were built on the assumption of a human driver, so until now, an officer who watched a Waymo execute an illegal U-turn could only call the company. Under the new framework, when the autonomous vehicle commits a moving violation, the company is treated as the driver. Seventy-two hours to file a report; twenty-four if there is a collision. Repeated violations can cost the company its fleet.

That is a small regulatory adjustment. It is also the exact shape of what every industry has to do when a new kind of performer enters the work. You do not abandon the principle that someone is accountable. You move the principle to fit the new performer. The driver was once a person; now the driver is a software stack with a parent company. The principle — that there is a driver and the driver answers — survives.

Software has not yet made that move. Lovable, on April 20, could not name a driver. It blamed its documentation. It blamed a vulnerability classification. It blamed the researcher who reported it. The agents that wrote the code did not get fired. Nobody did. Nobody could.

The insurance industry, as it tends to, has already started pricing the gap in. As of January, the largest U.S. commercial-policy framework formally excludes coverage for harms tied to generative-AI outputs. The European Commission, drafting AI liability rules it has since shelved, used a sentence that reads like it was lifted from a postmortem: AI’s complexity, autonomy and opacity, it wrote, “make it difficult or prohibitively expensive for victims to identify the liable person.” That sentence is the entire problem in 18 words. The driver is opaque. The principle that there must be a driver hasn’t moved with it.

So when I hear that “developers won’t look at the code anymore,” I don’t hear a technological claim. I hear an industry deciding, without saying so, that one half of engineering, the part where someone with a name answers for the work, has become optional. It hasn’t. It can’t.

Engineering, as a word, has always meant something more specific than “writing code.” It has meant: I built this, the decisions are reproducible, and if it fails, that’s on me, and you can check my work. That definition does not become obsolete because agents now do most of the typing. If anything, it becomes more important. The agent is the new performer. The principle — that someone with a name is accountable for what the agent shipped and that the data exists to verify what was shipped — has to stay current. Not as nostalgia for an older way of building software, but as the foundation any new way has to be built on.

The breach is going to come, probably more than one. The agents will not get fired. The only question worth asking is whether, by the time it happens, this industry will have done the unglamorous work of moving its first principle into the era of the new performers. Put a name next to every commit. Make that name mean something. Build the verification systems that let that name actually answer when the agent gets it wrong.
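What "a name next to every commit" could look like in practice is not complicated. The following is a minimal sketch, not a description of any existing tool: it assumes a hypothetical "Accountable-By:" trailer in the commit message (not a Git or industry standard), and the Commit class, function names and sample commits are invented for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical trailer format, assumed for this sketch:
#   Accountable-By: Full Name <email@example.com>
TRAILER = re.compile(
    r"^Accountable-By:\s*(.+?)\s*<([^<>\s]+@[^<>\s]+)>\s*$",
    re.MULTILINE,
)


@dataclass
class Commit:
    """Illustrative stand-in for a commit record; not tied to any Git library."""
    sha: str
    author: str
    message: str


def accountable_owner(commit):
    """Return (name, email) from the trailer, or None if no one signed."""
    match = TRAILER.search(commit.message)
    return (match.group(1), match.group(2)) if match else None


def unowned_commits(commits):
    """List the commits that no named person has taken responsibility for."""
    return [c for c in commits if accountable_owner(c) is None]


# Invented sample data: two agent-authored commits, only one with a named owner.
commits = [
    Commit(
        "a1b2c3",
        "agent-bot",
        "Fix login redirect\n\nAccountable-By: Dana Example <dana@example.com>",
    ),
    Commit("d4e5f6", "agent-bot", "Refactor data export"),
]

blocked = unowned_commits(commits)
for c in blocked:
    print(f"{c.sha}: no accountable owner, do not ship")
```

A check like this only establishes that a name is present. Making that name mean something still depends on the test results, scan reports and deploy records that let the named person verify what the agent shipped.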

That is the version of engineering worth defending. Not the old one. The current one.

Prince Kohli, CEO of Sauce Labs

Entrepreneur Leadership Network® Contributor
Prince Kohli is CEO of Sauce Labs with nearly 30 years building AI-driven enterprise solutions....