Why AI Isn't Making Developers as Productive as You Think

Developers think they're 20% faster with AI code generation tools. Recent research shows they're actually 19% slower. But I've also worked with teams that are genuinely twice as productive with AI. I spent months consulting for DX (recently acquired by Atlassian) building a code attribution system that could accurately detect whether code was written by AI, copy-pasted, or typed by humans. What I discovered explains both results perfectly: AI code generation is only part of the picture.

The industry is obsessed with how fast AI can write code. The productivity gains are real - sometimes massive - but they're highly context-dependent. Here's what nobody's talking about: in many organisations, coding is only about 30% of a developer's actual time. Even if you make coding 80% faster, you've only improved overall productivity by roughly 10-15%. For startups with minimal process overhead, the gains can be transformative. For larger organisations with extensive review processes, the benefits get diluted across the entire software development lifecycle.

The real question isn't whether AI makes coding faster - it obviously does. It's whether faster coding translates to faster shipping, and that depends entirely on how your organisation works.

The Measurement Problem Nobody's Solving

Building the code attribution system for DX was a fascinating technical challenge. The core detection mechanism relied on measuring the speed of code insertion - when a large block of code appears almost instantaneously, that's a strong signal it's AI-generated rather than human-typed. By tracking file changes frequently enough, we could determine with remarkable accuracy whether code was written by AI, copy-pasted from elsewhere, or genuinely hand-crafted.
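
To make that concrete, here's a minimal sketch of the heuristic - illustrative only, not DX's actual implementation, and the typing-speed threshold is an assumption you'd tune against real keystroke data:

```python
# Sketch of the insertion-speed heuristic (illustrative, not DX's code).
# Poll a file's line count; a large block appearing between two nearby
# snapshots is a strong signal the code wasn't typed by hand.
from dataclasses import dataclass

@dataclass
class Snapshot:
    timestamp: float   # seconds since epoch
    line_count: int

def classify_change(prev: Snapshot, curr: Snapshot,
                    typing_limit_lps: float = 1.0) -> str:
    """Classify the change between two snapshots of the same file.

    typing_limit_lps is an assumed ceiling on sustained human typing
    speed, in lines per second.
    """
    elapsed = curr.timestamp - prev.timestamp
    added = curr.line_count - prev.line_count
    if added <= 0 or elapsed <= 0:
        return "no-insertion"
    # Fifty lines materialising in under a second is not a human typing.
    if added / elapsed > typing_limit_lps and added > 20:
        return "machine-inserted"   # AI-generated or pasted
    return "hand-typed"
```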

The copy-paste detection was simple - essentially SQL queries against previously stored files to identify duplicated code blocks. The really difficult part wasn't the AI detection at all. It was reliably detecting Git events across different workflows, handling massive pull requests with thousands of files, and maintaining performance whilst processing all this data in real-time.
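
The duplicate lookup itself can be as simple as hashing sliding windows of lines and checking them against previously stored files - a sketch, with an assumed schema (the real system's tables will differ):

```python
# Copy-paste detection via hashed line windows (schema is illustrative).
import hashlib
import sqlite3

WINDOW = 8  # lines per window; longer windows mean fewer false positives

def shingles(source: str):
    """Yield (line_no, hash) for each WINDOW-line sliding window."""
    lines = source.splitlines()
    for i in range(len(lines) - WINDOW + 1):
        chunk = "\n".join(lines[i:i + WINDOW])
        yield i + 1, hashlib.sha256(chunk.encode()).hexdigest()

def find_duplicates(db: sqlite3.Connection, new_source: str):
    """Return windows of new_source already seen in stored files."""
    hits = []
    for line_no, digest in shingles(new_source):
        # Assumed table: shingles(hash TEXT, file_path TEXT)
        row = db.execute(
            "SELECT file_path FROM shingles WHERE hash = ?", (digest,)
        ).fetchone()
        if row:
            hits.append((line_no, row[0]))  # likely pasted from row[0]
    return hits
```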

What made this work valuable wasn't just the technical achievement. It was that companies finally had actual data rather than developer feelings. You could roll up metrics by individual, by team, see which types of changes took longer to review, and crucially, start to understand the actual impact of AI code generation on delivery speed.

Some engineers were using AI constantly. Others barely touched it. But here's what surprised me most: the relationship between AI-generated code and actual delivery speed was far more complex than anyone expected. In some contexts - particularly smaller teams or startups with streamlined processes - developers were genuinely flying. In others, particularly larger organisations with more rigorous review processes, the coding speed gains were being eaten up elsewhere in the cycle.

The 30% Problem: Where Developers Actually Spend Time

Here's where things get interesting. Let me break down where a typical developer's time actually goes in many organisations:

  • Standups and sync meetings: 10%
  • Planning and refinement: 15%
  • Pull request reviews: 20%
  • Implementation discussions and architecture: 15%
  • Actual coding: 30%
  • Context switching and other overhead: 10%

Now do the maths. "Eighty per cent faster" means 1.8x coding speed, so if AI makes you 80% faster at writing code - which is generous - the 30% of your time spent coding shrinks to about 17% of the schedule. Total delivery time drops by roughly 13%, which works out to an overall productivity gain of about 15%. And that's the generous case; in practice you're looking at 10-15% if you're lucky.

But here's where context matters enormously. A three-person startup with minimal process overhead might spend 60% of their time actually coding. For them, AI code generation is genuinely transformative - they can be twice as productive. A 500-person enterprise with extensive review processes, compliance requirements, and coordination overhead might spend 20% of time coding. For them, the same AI tools barely move the needle on overall delivery speed.
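
If you want to sanity-check those numbers, the whole argument is just Amdahl's law applied to the development cycle. A quick sketch using the fractions above (the labels and fractions are the illustrative ones from this section, not measured data):

```python
# The arithmetic behind both paragraphs - Amdahl's law applied to the
# development cycle. f = fraction of time spent coding, s = coding speedup.
def overall_speedup(f: float, s: float) -> float:
    # Non-coding time (1 - f) is untouched; coding time shrinks to f / s.
    return 1 / ((1 - f) + f / s)

for label, f in [("enterprise", 0.20), ("typical", 0.30), ("startup", 0.60)]:
    gain = (overall_speedup(f, 1.8) - 1) * 100  # 1.8x = "80% faster"
    print(f"{label}: coding is {f:.0%} of time -> ~{gain:.0f}% overall gain")

# enterprise: coding is 20% of time -> ~10% overall gain
# typical: coding is 30% of time -> ~15% overall gain
# startup: coding is 60% of time -> ~36% overall gain
```

Notice the startup case: even at 60% coding time, 1.8x coding speed only buys about 36% overall. Doubling overall output at that fraction implies roughly a 6x coding speedup - the territory of the 5-15x gains I mention later, and only reachable when the rest of the process stays out of the way.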

This explains why research shows such contradictory results. GitHub's studies show massive productivity gains because they measure coding tasks in isolation - perfect for understanding one part of the picture. METR's research showed developers were 19% slower with AI because they measured complete feature delivery in complex projects with experienced developers.

Neither study is wrong. They're measuring different things in different contexts. The real insight is that AI code generation is only part of the software development lifecycle. Whether it's a small part or a large part depends entirely on how your organisation works.

The Large PR Problem AI Can Create

Here's an interesting side effect I've noticed: AI code generation can make certain bottlenecks worse, but only in specific contexts. When you can write code incredibly quickly, it's easy to get carried away. I've done it myself. You ask Claude or Cursor to implement something, it generates a beautiful solution, you're impressed, so you ask it to do the next thing, and the next thing, and before you know it you've got a 2,000-line pull request touching 47 files.

If you're in a small team that ships to production multiple times a day with minimal review, this isn't a problem. You merge and move on. But in organisations with more rigorous review processes - which includes most mid-size and large companies - that 20% of time spent on code review just became 30%. The PR sits there for days because nobody wants to review something that massive. When they finally do review it, they're skimming rather than carefully examining the logic because it's overwhelming.

This is particularly problematic with AI-generated code because it often looks good at first glance - it's well-formatted, follows conventions, has error handling. But AI can introduce subtle bugs or make questionable architectural decisions that only become apparent during careful review. With a massive PR, those issues slip through.

Security review becomes even more critical. AI will cheerfully log passwords in plain text, expose secrets in error messages, or implement authentication with glaring vulnerabilities - all whilst looking like perfectly reasonable code. In a 200-line PR, you spot this. In a 2,000-line PR, you might not. I've written about avoiding critical security mistakes in AI-generated code based on shipping AI code to production for two years.

The Missing Pieces of AI-Assisted Development

The real opportunity goes beyond making coding faster. It's optimising the entire software development lifecycle for an AI-assisted world. Here's what that actually looks like:

Pre-Push AI Code Review

Instead of pushing AI-generated code directly to your repository, imagine a review tool that runs locally before you even create the pull request. Using something like the Claude Agent SDK, you could hook into your git commit or pre-push workflow. Run a single command - say, tool review - and it:

  • Analyses all changed code for potential bugs
  • Scans for accidentally committed secrets or sensitive data
  • Checks code quality and suggests improvements
  • Most importantly, determines if this PR should be split into smaller, more reviewable chunks

This isn't theoretical. The agent capabilities exist today to build this. You'd provide the tool with your full git diff, and it could intelligently group related changes, identify logical boundaries, and suggest how to split one massive PR into three or four coherent, reviewable pieces.
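
As a sketch of what that command could look like - using the plain Anthropic Python SDK in place of the Agent SDK, with the prompt, model choice, and hook wiring all assumptions rather than a finished tool:

```python
# Minimal pre-push review sketch. Wire it up via .git/hooks/pre-push
# or expose it as a "tool review" CLI entry point.
import subprocess
import anthropic

REVIEW_PROMPT = """You are a pre-push code reviewer. For the diff below:
1. Flag potential bugs.
2. Flag committed secrets or sensitive data.
3. Suggest code quality improvements.
4. Say whether this should be split into smaller PRs, and how.

{diff}"""

def review_pending_changes() -> str:
    # Everything committed locally but not yet pushed upstream.
    diff = subprocess.run(
        ["git", "diff", "@{upstream}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    if not diff.strip():
        return "Nothing to review."
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env
    response = client.messages.create(
        model="claude-sonnet-4-5",  # any capable model works here
        max_tokens=2000,
        messages=[{"role": "user",
                   "content": REVIEW_PROMPT.format(diff=diff)}],
    )
    return response.content[0].text

if __name__ == "__main__":
    print(review_pending_changes())
```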

Intelligent PR Splitting

The splitting feature is crucial because it addresses the natural workflow of AI-assisted development. You want to work in larger blocks to maintain context and momentum. AI is genuinely faster when you can give it a bigger scope. But reviewers need smaller, focused PRs to do their job effectively.

Automated PR splitting solves this perfectly. You do your work, AI helps you be productive, then before pushing you automatically restructure it into review-friendly chunks. Each PR has a clear purpose, touches related files, and can be reviewed in 15-20 minutes instead of requiring an afternoon. This is particularly important when using AI for coding, where the temptation to keep generating more code is strong.
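
A real tool would let the model choose the groupings from the full diff; here's a deliberately naive sketch of the mechanical half, grouping changed files by top-level directory as a crude stand-in for those logical boundaries:

```python
# Naive PR-splitting sketch (illustrative): group changed files by
# top-level directory and print candidate PRs.
import subprocess
from collections import defaultdict
from pathlib import PurePosixPath

def changed_files(base: str = "@{upstream}") -> list[str]:
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]

def propose_split() -> dict[str, list[str]]:
    groups: dict[str, list[str]] = defaultdict(list)
    for path in changed_files():
        top = PurePosixPath(path).parts[0]  # crude "logical boundary"
        groups[top].append(path)
    return dict(groups)

if __name__ == "__main__":
    for i, (area, files) in enumerate(propose_split().items(), start=1):
        print(f"PR {i}: {area} ({len(files)} files)")
```

Each proposed group would then become its own branch and PR - for instance a fresh branch per group with just that group's files checked out onto it.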

Smarter Code Review Assistance

This part is already happening. GitHub Copilot and Claude both offer PR review bots that can catch common issues, suggest improvements, and provide initial feedback. Add tools like GitPilotAI for automating commit messages, and pre-push review, intelligent splitting, and assisted reviews combine into a workflow where:

  1. You write code quickly with AI assistance
  2. AI pre-reviews and splits it locally
  3. You push multiple focused PRs
  4. Teammates get AI-assisted summaries and initial reviews
  5. Human reviewers focus on architecture and business logic

The entire cycle becomes faster, not just the coding part.

What to Measure Instead

If you're trying to understand AI's impact on your team, stop measuring lines of code written or AI acceptance rates. Those metrics are useless. Here's what actually matters:

Time to production: How long from first commit to deployed code? This captures the entire SDLC, not just coding speed.

PR review time: Are PRs sitting longer waiting for review? If yes, you might have a large PR problem from AI-generated code.

Code churn rate: How often is code modified or reverted shortly after being written? High churn suggests AI is generating code that needs significant rework.
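
Churn is also one of the easier metrics to approximate yourself. Here's a crude proxy - the share of file edits landing within 14 days of a previous edit to the same file - where the window and the definition itself are assumptions to tune for your team:

```python
# Rough churn proxy: fraction of file edits that land within 14 days of
# a previous edit to the same file. High values suggest code is being
# reworked shortly after it was written.
import subprocess
from collections import defaultdict

REWORK_WINDOW = 14 * 86400  # seconds

def churn_rate(repo: str = ".") -> float:
    # One timestamp line (%ct) per commit, followed by its file names.
    log = subprocess.run(
        ["git", "-C", repo, "log", "--name-only", "--pretty=format:%ct"],
        capture_output=True, text=True, check=True,
    ).stdout
    edits: dict[str, list[int]] = defaultdict(list)
    timestamp = None
    for line in log.splitlines():
        if line.strip().isdigit():   # (misreads all-digit filenames; fine for a sketch)
            timestamp = int(line)
        elif line.strip() and timestamp is not None:
            edits[line.strip()].append(timestamp)
    rework = total = 0
    for times in edits.values():
        times.sort()
        for prev, curr in zip(times, times[1:]):
            total += 1
            if curr - prev <= REWORK_WINDOW:
                rework += 1
    return rework / total if total else 0.0

if __name__ == "__main__":
    print(f"churn rate: {churn_rate():.1%}")
```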

Team throughput: How many features or stories are you completing per sprint? This removes individual variation and shows actual delivery capability.

Quality metrics: Production bugs, security issues, and technical debt accumulation. Fast code generation means nothing if quality suffers.

The research on AI code quality shows concerning trends - copy-paste code increasing, code churn doubling. These second-order effects matter more than raw coding speed.

The Education Problem

There's another factor nobody talks about: most developers don't know how to code effectively with AI yet. It's not just about writing better prompts - though that matters. It's about understanding how to structure work for AI assistance, when to use it versus writing code yourself, and how to review AI-generated code critically. I've explored this shift in how AI has transformed my development workflow - development that's 5-15x faster, but demands a completely different approach.

I've seen developers accept AI suggestions that introduce unnecessary complexity because the suggestion "looked smart". I've seen others waste time arguing with AI to generate code in a specific style when writing it themselves would be faster. The learning curve is real, and most organisations aren't investing in AI-assisted development training.

When you combine inexperienced AI usage with inadequate review processes, you get exactly what the research shows: developers who feel faster but ship features more slowly.

Reshaping the Development Cycle

The future of software development isn't just about writing code faster. It's about reimagining how we work when coding itself becomes almost instantaneous. That means:

  • Spending more time on architecture and design upfront
  • Investing in better review processes and tools
  • Building automation around AI-assisted workflows (like I've done with managing multiple Claude Code sessions)
  • Measuring outcomes, not activity
  • Training developers to be effective AI collaborators

The organisations that figure this out will see genuine productivity gains. Those that just add Copilot licenses and call it innovation will wonder why they're not seeing results.

What This Means Practically

If you're leading an engineering team, here's what to focus on:

  1. Understand your actual bottlenecks: Track time from idea to production. Is it really coding, or is it review, planning, or coordination?
  2. Match tools to your context: A three-person startup needs different processes than a 50-person team
  3. Invest where it matters: If coding is your bottleneck, AI is transformative. If review is your bottleneck, focus there instead
  4. Set appropriate constraints: PR size limits help larger teams but might slow down smaller ones
  5. Watch for quality shifts: More code doesn't mean better code, but the impact varies by team size and process

If you're an individual developer:

  1. Learn to prompt effectively: Getting good AI output requires practice, regardless of team size
  2. Review AI suggestions critically: Just because it compiles doesn't mean it's correct
  3. Match your PR size to your team: Small teams can handle larger PRs; bigger teams need smaller ones
  4. Use tools that complement AI: Building your own productivity tools that work with AI can multiply effectiveness
  5. Understand your full cycle: Know where time actually goes in your specific context

The Real AI Revolution

The AI revolution in software development isn't about replacing developers or even making coding infinitely faster. It's about understanding where AI fits in your specific development context and reshaping your processes accordingly.

For small teams with minimal overhead, AI code generation can genuinely double productivity. Ship it. For larger organisations, the gains exist but require more work to realise - you need to optimise the entire cycle, not just the coding step.

The tools for pre-push review, intelligent PR splitting, and comprehensive SDLC automation are within reach. Some already exist. Others just need someone to build them. The technical challenges aren't insurmountable - I know because I've built some of these systems.

What's holding us back is treating AI code generation as a silver bullet rather than one tool in a larger system. We need to be honest about where the bottlenecks actually are in our specific organisations, then build or adopt tools that address those bottlenecks.

The companies and developers who figure this out first will have a genuine competitive advantage. Not because they write code faster - everyone will have that soon enough - but because they've optimised their entire development cycle for an AI-assisted world. That optimisation looks different for a three-person startup than for a 500-person engineering organisation, and that's fine. The important thing is understanding your context and optimising for it.


Need help with your business?

Enjoyed this post? I help companies navigate AI implementation, fintech architecture, and technical strategy. Whether you're scaling engineering teams or building AI-powered products, I'd love to discuss your challenges.

Learn more about how I can support you.
