I did not believe AI coding tools would change much. I had tried GitHub Copilot when it launched, used ChatGPT for coding questions, experimented with Cursor. They were all useful in small ways -- autocomplete on steroids, a better Stack Overflow. But they did not change how we fundamentally worked. We still designed the same way, debugged the same way, reviewed code the same way.
Then we started using Claude Code across our entire team at CODERCOPS. And after 90 days, I have to admit: I was wrong. But not in the way you would expect. Claude Code did not replace any of our engineers. It did not even reduce the amount of thinking required. What it did was compress the time between "I know what to build" and "it is built and tested." And that compression changed everything about how we operate.
Let me walk you through the actual data -- the good, the bad, and the stuff nobody talks about in the breathless "AI will replace developers" discourse. This is a real report from a real 7-person engineering team at a real agency serving real clients.
The Setup: Who We Are and What We Measured
CODERCOPS is an AI-first tech agency based in India. We have 7 engineers. We work on a mix of client projects: web applications (mostly Astro, Next.js, and React), AI integrations (RAG systems, agent frameworks, custom ML pipelines), and infrastructure work (Supabase, AWS, Vercel deployments).
We adopted Claude Code -- Anthropic's CLI-based AI coding tool -- across all projects on December 1, 2025. Before adoption, we established baseline metrics for November 2025. We then tracked the same metrics for January and February 2026 (skipping December as the adoption month).
What we measured:
- Sprint velocity (story points completed per 2-week sprint)
- PR merge time (time from PR opened to merged)
- Bug resolution time (time from bug reported to fix deployed)
- Test coverage (percentage of codebase covered by tests)
- Lines of code per PR (as a complexity indicator, not a productivity metric)
- API costs (what we spent on Claude Code)
- Developer satisfaction (anonymous weekly survey, 1-10 scale)
The Numbers: Before and After
Here are the raw numbers. I am not going to cherry-pick.
Sprint Velocity
| Period | Avg Story Points / Sprint | Change |
|---|---|---|
| November 2025 (baseline) | 47 | -- |
| January 2026 | 62 | +32% |
| February 2026 | 68 | +45% |
My honest take: The 45% increase is real, but context matters. Part of this is the team getting better at using Claude Code (January was lower than February as people learned). Part of it is that Claude Code is particularly good at the kind of work we do -- a lot of CRUD operations, API integrations, and content-driven sites. A team doing low-level systems programming would see different numbers.
PR Merge Time
| Period | Avg Time to Merge | Change |
|---|---|---|
| November 2025 | 6.2 hours | -- |
| January 2026 | 3.8 hours | -39% |
| February 2026 | 3.1 hours | -50% |
Why this matters: Faster PR merges mean shorter feedback loops. The biggest contributor was not faster code writing -- it was Claude Code generating much more complete PRs. Before, a developer might submit a PR and get 3 rounds of review comments. Now, Claude Code catches many of those issues before the PR is even opened, because we put our coding standards in CLAUDE.md files and Claude enforces them automatically.
Bug Resolution Time
| Period | Avg Resolution Time | Change |
|---|---|---|
| November 2025 | 4.1 hours | -- |
| January 2026 | 2.3 hours | -44% |
| February 2026 | 1.8 hours | -56% |
This is the metric I find most impressive. Debugging is where Claude Code shines brightest. You point it at a bug report, it reads the relevant code, forms a hypothesis, checks it, and often finds the root cause in minutes. A bug that would take an engineer an hour of reading the stack trace, searching the codebase, adding console.log statements, reproducing, and narrowing down takes Claude Code 2-5 minutes.
Test Coverage
| Period | Test Coverage | Change |
|---|---|---|
| November 2025 | 34% | -- |
| January 2026 | 51% | +17 points |
| February 2026 | 62% | +28 points |
This is the sleeper hit. Before Claude Code, writing tests was the task everyone deprioritized. "We'll add tests later" was our unofficial motto. With Claude Code, generating comprehensive test suites is so fast that there is no excuse to skip them. We now include test generation in our definition of done for every PR. Claude Code writes the test skeleton, the developer reviews and adjusts, and we ship with tests. It takes 10 minutes instead of an hour.
What Claude Code Is Surprisingly Good At
Understanding Large Codebases
This is Claude Code's killer feature and the one that surprised me most. You can point it at a codebase with hundreds of files and ask "how does the authentication flow work?" and it will read the relevant files, trace the flow, and give you an accurate explanation.
For client projects where we are taking over an existing codebase, this capability cuts onboarding time by 60-70%. Instead of spending a week reading code and asking the previous team questions, a new engineer can pair with Claude Code to map out the architecture in a day.
Real example: We inherited a Next.js application with 340 files, no documentation, and the previous team was not available for questions. Our senior engineer paired with Claude Code and had a complete architecture map, identified the 3 major technical debt items, and started the first refactoring PR within 2 days. Previously, this would have taken 1-2 weeks.
Multi-File Refactors
Renaming a function that is used in 47 files. Changing an API response format that affects every component consuming it. Migrating from one database client to another. These are the tasks that take hours of careful, tedious work and are perfect for Claude Code.
We recently migrated a client project from the Astro content collections API to a Supabase-backed content system. It touched 30+ files -- content utility functions, page components, API routes, type definitions. Claude Code did the migration across all files in one session, maintaining type safety throughout. It took about 45 minutes of engineer time (mostly reviewing the changes) instead of the 6-8 hours we had estimated.
Writing Tests
I mentioned this in the metrics section, but it deserves its own callout. Claude Code is exceptionally good at reading a function and generating comprehensive tests -- including edge cases that a human might miss.
We give Claude Code a function like this:

```typescript
export function parseSubscriptionTier(
  input: string | null | undefined
): SubscriptionTier {
  // ... implementation
}
```

And it generates tests covering:

- Valid inputs ("free", "pro", "enterprise")
- Case insensitivity ("FREE", "Pro", "ENTERPRISE")
- Null and undefined inputs
- Empty string
- Invalid strings ("premium", "gold")
- Whitespace handling (" pro ")
- Type coercion edge cases

The test quality is genuinely good. Not perfect -- about 80% of generated tests are correct and useful. The other 20% need adjustment, usually because Claude Code does not fully understand the business context behind edge case behavior.
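For illustration, here is a sketch of the kinds of cases such a generated suite exercises. The implementation of `parseSubscriptionTier` below is my own guess at the behavior described (normalize, then fall back to "free"), not the actual project code:

```typescript
// Hypothetical implementation matching the behavior described above --
// the real project code may differ.
type SubscriptionTier = "free" | "pro" | "enterprise";

function parseSubscriptionTier(
  input: string | null | undefined
): SubscriptionTier {
  const normalized = (input ?? "").trim().toLowerCase();
  if (normalized === "pro" || normalized === "enterprise") return normalized;
  // Unknown, empty, null, and undefined values fall back to the free tier.
  return "free";
}

// The edge cases a generated suite covers, expressed as plain assertions:
const cases: Array<[string | null | undefined, SubscriptionTier]> = [
  ["free", "free"],             // valid input
  ["ENTERPRISE", "enterprise"], // case insensitivity
  [null, "free"],               // null input
  [undefined, "free"],          // undefined input
  ["", "free"],                 // empty string
  ["premium", "free"],          // invalid string falls back
  [" pro ", "pro"],             // whitespace handling
];

for (const [input, expected] of cases) {
  const actual = parseSubscriptionTier(input);
  if (actual !== expected) {
    throw new Error(
      `parseSubscriptionTier(${JSON.stringify(input)}) -> ${actual}, expected ${expected}`
    );
  }
}
```

The valuable part is not any single assertion; it is that the full table of edge cases gets written at all, instead of being skipped under deadline pressure.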
Explaining Legacy Code
"What does this function do? Why does it exist? What would break if I changed it?" These questions used to require finding someone who wrote the code (and hoping they remember). Now we ask Claude Code, and it gives remarkably accurate explanations, including identifying the likely bugs and design compromises in the original implementation.
What Claude Code Is Bad At (The Honest Part)
Complex Architectural Decisions
Claude Code can implement any architecture you describe. But it should not choose the architecture. When we asked it "should we use a monorepo or separate repos for this microservices project?" it gave a perfectly balanced, non-committal answer that was technically correct but useless for making a decision.
Architectural decisions require understanding business context, team capabilities, timeline constraints, and future roadmap -- things that Claude Code does not have visibility into. We use it to implement architectural decisions, not make them.
Creative UI/UX Work
For building forms, data tables, and standard UI patterns, Claude Code is excellent. For creating a unique, creative user experience that differentiates a product? Not so much. It generates competent, derivative designs. It is the engineering equivalent of "corporate art style" -- technically fine, creatively bland.
We still have our designers create the creative direction and UI concepts. Claude Code then helps implement those designs accurately and quickly.
Understanding Business Context Without CLAUDE.md
This was our biggest early frustration. Claude Code would write technically correct code that was wrong for the business context. For example, it would implement a standard "delete" operation when our client's compliance requirements meant we needed "soft delete with 90-day retention." It would use standard error messages when the client had specific error messaging guidelines.
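For concreteness, here is what the compliance-correct version looks like as a sketch. The types, field names, and retention logic are illustrative, not the client's actual schema:

```typescript
// Illustrative soft-delete helpers; field and table names are hypothetical.
interface UserRecord {
  id: string;
  deleted_at: string | null; // ISO timestamp when soft-deleted, null if active
}

const RETENTION_DAYS = 90;

// "Deleting" a user only stamps deleted_at; the row stays in the database.
function softDelete(user: UserRecord, now: Date = new Date()): UserRecord {
  return { ...user, deleted_at: now.toISOString() };
}

// A record becomes eligible for permanent purge only after the retention window.
function isPurgeEligible(user: UserRecord, now: Date = new Date()): boolean {
  if (user.deleted_at === null) return false;
  const ageMs = now.getTime() - new Date(user.deleted_at).getTime();
  return ageMs >= RETENTION_DAYS * 24 * 60 * 60 * 1000;
}
```

Without the business context, a hard `DELETE` is the "obvious" implementation, which is exactly why this rule has to be written down somewhere the tool can see it.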
The fix was the CLAUDE.md pattern, which I will cover in the next section.
Debugging Across System Boundaries
Claude Code is great at debugging within a single codebase. But when the bug involves the interaction between your frontend, your API, a third-party service, and a database, it struggles. It cannot see the full picture across system boundaries, so it often suggests fixes that are correct for the code it can see but miss the actual root cause in a system it cannot observe.
For cross-system debugging, we still rely on human engineers with observability tools (logs, traces, metrics dashboards).
The CLAUDE.md Pattern: Our Biggest Unlock
If I had to pick one thing that made the biggest difference in our Claude Code adoption, it is the CLAUDE.md file. This is a project-level file that gives Claude Code persistent instructions about your project.
Here is a simplified version of what we include in every project:
```markdown
# CLAUDE.md

## Project Overview
This is a Next.js 15 e-commerce platform for [Client].
The app uses the App Router, Server Components, and Supabase for the database.

## Architecture Decisions
- All database queries go through the data access layer in `src/lib/data/`
- Never query the database directly from components or API routes
- Use Server Components by default; Client Components only when interactivity is needed
- All client-side state uses Zustand stores in `src/stores/`

## Coding Standards
- TypeScript strict mode -- no `any` types
- All functions must have JSDoc comments
- Error handling: use the custom `AppError` class from `src/lib/errors.ts`
- API responses use the `ApiResponse<T>` type from `src/types/api.ts`
- Tests: Vitest for unit tests, Playwright for e2e
- File naming: kebab-case for files, PascalCase for components

## Business Rules (IMPORTANT)
- Users are never hard-deleted. Always use soft delete via `deleted_at` timestamp
- Prices are stored in cents (integer), displayed in dollars (formatted)
- All user-facing error messages must come from the `ERROR_MESSAGES` constant in `src/constants/errors.ts`
- PII must never be logged. Use the `sanitizeForLog()` utility

## Common Patterns
When creating a new API route:
1. Add the route in `src/app/api/[resource]/route.ts`
2. Add the data access function in `src/lib/data/[resource].ts`
3. Add the Zod schema in `src/schemas/[resource].ts`
4. Add tests in `src/tests/api/[resource].test.ts`

## Known Issues
- The payment webhook handler has a race condition (tracked in issue #142) -- do not modify without discussing with the team
- The image upload component has a memory leak on unmount -- use the `useImageUpload` hook instead of implementing directly
```

The impact was dramatic. Before CLAUDE.md, about 30% of Claude Code's output needed correction for project-specific conventions. After implementing CLAUDE.md files across all projects, that dropped to under 10%.
Key insight: Claude Code is not psychic. It does not know your team's conventions, your client's requirements, or your architectural decisions unless you tell it. The CLAUDE.md file is how you tell it. Invest time in making it thorough and keep it updated.
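Rules like "prices in cents, displayed in dollars" are easiest to enforce when the CLAUDE.md points at a single utility rather than describing a convention in prose. A sketch of what such a helper might look like (the name and location are hypothetical, not from our actual codebase):

```typescript
// Hypothetical helper backing the "prices are stored in cents" rule.
// Storing integer cents avoids floating-point drift; formatting happens
// in exactly one place, which the CLAUDE.md can point to.
function formatPrice(cents: number): string {
  if (!Number.isInteger(cents) || cents < 0) {
    throw new Error(`Price must be a non-negative integer of cents, got ${cents}`);
  }
  const dollars = Math.floor(cents / 100);
  const remainder = cents % 100;
  return `$${dollars}.${String(remainder).padStart(2, "0")}`;
}
```

With a utility like this in place, the CLAUDE.md rule collapses to one line: never format prices inline, always call the helper.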
Workflow Integration: Hooks, MCP Servers, and Custom Skills
Beyond the CLAUDE.md file, we have built several integrations that make Claude Code more effective:
Git Hooks
We use Claude Code hooks to enforce quality before commits:
```json
// .claude/settings.json
{
  "hooks": {
    "pre-commit": {
      "command": "npm run lint && npm run typecheck",
      "description": "Run linting and type checking before commits"
    }
  }
}
```

This catches issues that Claude Code itself might introduce -- TypeScript errors, lint violations, formatting inconsistencies.
MCP Servers
We connect Claude Code to our project management and monitoring tools via MCP (Model Context Protocol):
- Supabase MCP server: Claude Code can read database schemas, query data, and apply migrations directly. This is a game-changer for data-layer work.
- Linear MCP server: Claude Code can read issue descriptions, update ticket status, and link PRs to issues. When debugging, it can read the full bug report and related tickets for context.
- Sentry MCP server: For debugging production issues, Claude Code can pull error traces and affected user sessions directly.
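Claude Code picks up project-scoped MCP servers from a `.mcp.json` file at the project root. A sketch of what a configuration along these lines could look like -- the server package names and URLs here are illustrative assumptions, so check each provider's documentation for the current ones:

```json
{
  "mcpServers": {
    "supabase": {
      "command": "npx",
      "args": ["-y", "@supabase/mcp-server-supabase"],
      "env": { "SUPABASE_ACCESS_TOKEN": "${SUPABASE_ACCESS_TOKEN}" }
    },
    "linear": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://mcp.linear.app/sse"]
    }
  }
}
```

Checking this file into the repository means every engineer (and every Claude Code session) gets the same tool access without per-machine setup.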
The "Read Issue, Fix Bug, Submit PR" Workflow
Here is our most productive workflow. It takes about 15 minutes for bugs that used to take 1-2 hours:
1. Developer sees a bug ticket in Linear
2. Opens Claude Code in the project directory
3. Says: "Read Linear issue ABC-123 and fix the bug described"
4. Claude Code reads the issue via MCP, reads the relevant code, identifies the bug, implements the fix, writes a test
5. Developer reviews the changes, adjusts if needed
6. Claude Code creates the commit and PR with the Linear issue linked
This is not magic or exaggeration. It works about 70% of the time for well-described bugs. The other 30% need more human intervention, usually because the bug involves external systems or the issue description is vague.
Team Adoption: The Human Side
Not everyone on the team embraced Claude Code immediately. Here is what happened:
The Enthusiasts (3 engineers)
Jumped in immediately, started using Claude Code for everything, saw productivity gains within the first week. These were generally our more senior engineers who knew exactly what they wanted built and could review Claude Code's output effectively.
The Cautious Adopters (2 engineers)
Used Claude Code for specific tasks -- writing tests, generating boilerplate, explaining unfamiliar code -- but continued doing core development work manually. Over 6-8 weeks, they gradually expanded their usage as they built trust in the tool.
The Skeptics (2 engineers)
Resisted adoption for the first month. Their concerns were legitimate: "What if I become dependent on it and lose my skills?" and "I need to understand the code deeply, not just review AI output."
We addressed this by reframing Claude Code as a tool, not a replacement. Like how a carpenter uses a power drill instead of a manual screwdriver -- you still need to know where to drill and why, but the tool makes the actual drilling faster.
By month 3, both skeptics were using Claude Code regularly, though less extensively than the enthusiasts. And their caution actually made them better reviewers of Claude Code output.
The "Over-Reliance" Problem
This is real and we have to actively manage it. Around week 6, we noticed some engineers were accepting Claude Code output without thorough review. The code worked and passed tests, but it was not always the best approach.
Our rules now:
- Every Claude Code output must be reviewed as carefully as you would review a junior developer's PR
- If you cannot explain why the code works, you have not reviewed it properly
- Architects and senior engineers make design decisions before Claude Code implements them
- We do weekly code review sessions where the team discusses Claude Code's output patterns -- both good and bad
The Cost Analysis
Let's talk money. Claude Code is not free.
API Costs
| Month | API Spend | Engineers Using | Cost Per Engineer |
|---|---|---|---|
| December 2025 (adoption) | $420 | 5 | $84 |
| January 2026 | $680 | 7 | $97 |
| February 2026 | $890 | 7 | $127 |
The increasing cost reflects increasing usage as the team got more comfortable and found more use cases. Our February run rate is about $10,700/year in API costs.
ROI Calculation
Average fully-loaded engineer cost at CODERCOPS: ~$35/hour (India rates).
Hours saved per engineer per week (estimated based on velocity increase): 6-8 hours.
Monthly time savings: 7 engineers x 7 hours x 4 weeks = 196 hours.
Monthly value of saved time: 196 x $35 = $6,860.
Monthly Claude Code cost: $890.
ROI: 7.7x return. For every dollar we spend on Claude Code, we get approximately $7.70 in productive time back.
Even if you halve our estimate of time saved (to account for optimism bias), the ROI is still 3.8x. The tool pays for itself many times over.
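The arithmetic above is simple enough to sanity-check in a few lines:

```typescript
// Sanity-checking the ROI figures from the tables above.
const engineers = 7;
const hoursSavedPerEngineerPerWeek = 7; // midpoint of the 6-8 hour estimate
const weeksPerMonth = 4;
const costPerHour = 35;      // fully-loaded engineer cost (USD, India rates)
const monthlyApiSpend = 890; // February 2026 API spend (USD)

const hoursSaved = engineers * hoursSavedPerEngineerPerWeek * weeksPerMonth; // 196
const valueOfSavedTime = hoursSaved * costPerHour; // $6,860
const roi = valueOfSavedTime / monthlyApiSpend;    // ~7.7x

console.log(
  `Hours saved: ${hoursSaved}, value: $${valueOfSavedTime}, ROI: ${roi.toFixed(1)}x`
);
```

The point of writing it out is that every input is visible and arguable: halve the hours-saved estimate, double the API spend, and the conclusion still holds.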
The Hidden Cost: Context Switching
One cost that does not show up in API bills: Claude Code occasionally sends you down the wrong path. You spend 20 minutes implementing something based on Claude Code's suggestion, realize it is wrong, and have to back up. This happens maybe once or twice a week. Factor in perhaps 1-2 hours per week per engineer in wasted effort from following incorrect suggestions.
After accounting for this, our net time savings per engineer drops from 7 hours to about 5-6 hours per week. Still very positive, but honesty requires acknowledging it.
The Verdict After 90 Days
Here is my straight assessment:
Claude Code is not a replacement for engineers. Anyone who tells you otherwise is selling something. It cannot design systems, make product decisions, understand user needs, navigate organizational politics, or handle the thousand other non-coding tasks that software engineering involves.
Claude Code is a multiplier. And like any multiplier, it amplifies what is already there. A strong engineer becomes more productive. A weak engineer produces more code, but not necessarily better software. The best engineers on our team benefit the most because they have the judgment to direct Claude Code effectively and the expertise to review its output critically.
The biggest impact is not on coding speed. It is on the tasks that used to be tedious enough to skip: writing tests, documenting code, doing thorough refactors, investigating edge cases. These tasks now get done because they are fast enough that there is no excuse to skip them.
The team is better off. Not because Claude Code is brilliant (it is not -- it is a very capable but imperfect tool), but because it removed the friction that prevented us from doing the things we already knew we should be doing.
My Recommendations for Teams Considering Adoption
Start with CLAUDE.md. Invest 2-3 hours writing a comprehensive project context file before anyone touches Claude Code. This is the single highest-ROI action you can take.
Set review standards early. Make it clear that AI-generated code gets the same (or more) scrutiny as human-written code.
Let adoption happen naturally. Do not force it. Let enthusiasts pioneer, let others observe and join when ready. Forced adoption breeds resentment and sloppy usage.
Track metrics. You cannot improve what you do not measure. Establish baselines before adoption and track honestly afterward.
Budget for API costs. At current pricing, expect $80-150 per engineer per month for meaningful usage. Build this into your project costs.
Pair it with good tooling. Claude Code plus MCP servers plus git hooks plus CI/CD is much more powerful than Claude Code alone.
What Is Next for Us
We are now exploring:
- Custom MCP servers for our specific client workflows
- Claude Code in CI pipelines for automated code review on every PR
- Training the team on advanced prompting -- there is a meaningful skill gap between "use Claude Code" and "use Claude Code effectively"
- Benchmarking against other tools -- we plan to do a head-to-head comparison with Cursor and GitHub Copilot Workspace in Q2 2026
If you are an engineering team curious about AI coding tools, or you are a business wondering whether this technology actually delivers on its promises, I hope this report gives you real data to make decisions with.
Want Help Integrating AI Into Your Engineering Workflow?
At CODERCOPS, we do not just use AI tools -- we help other teams adopt them effectively. Whether you need help setting up Claude Code for your team, building custom MCP integrations, or designing an AI-augmented engineering workflow, we can help.
We also build AI-powered applications for clients who want to bring these capabilities to their users. Check out our blog for more deep dives on AI engineering, or get in touch to discuss your project.