Test-Driven Development With AI Coding Assistants: Does TDD Still Make Sense in 2026?

AI tools write code fast. TDD asks you to slow down and write tests first. These two impulses seem to be in tension. Here's how they actually work together.

Anurag Verma

Test-driven development has always been a practice more people say they do than actually do. The red-green-refactor loop is well-understood. The benefits are real: better-designed code, fast feedback, a test suite you can trust. But writing the test first, before you’re sure what the implementation will look like, requires discipline that erodes under deadline pressure.

AI coding assistants changed this dynamic in a way that’s not obvious at first. On one hand, they generate code so quickly that TDD’s “slow down to think” value proposition seems weaker. On the other hand, they turn writing tests into something fast enough that the overhead excuse disappears.

The result is that TDD works differently with AI assistance and, for most codebases, works better.

Why TDD Got Harder to Skip

Before AI assistants, the excuse for skipping TDD was usually time. Writing a test before implementation felt like extra work on a task you already understood. You knew what the code needed to do; writing a test first added 10 minutes to something that would take 20.

With AI-generated code, the incentive flipped. You can now generate a function in 10 seconds. The problem: AI-generated code is often plausible-looking but subtly wrong. It handles the happy path and misses edge cases. It passes the obvious cases and fails on real data. And because you generated it quickly, you’re tempted to ship it quickly.

TDD with AI assistance looks like this:

  1. You write the test first, describing the behavior you need
  2. You ask the AI to implement the code that makes the test pass
  3. The AI generates something; you run the tests
  4. If it fails, you have a specific failure to show the AI
  5. You iterate until tests pass, then look for refactoring opportunities

The test is now a specification for the AI, not just a verification. An assistant tends to generate better code when you give it a test to satisfy than when you describe behavior in prose, because the test pins down inputs, outputs, and error behavior exactly.

The Practical Workflow

Here’s what this looks like in practice for a backend function. Suppose you’re building a function to calculate tiered pricing for a SaaS product.

Step 1: Write the test that defines the contract

// src/billing/pricing.test.ts
// Vitest imports shown; omit this line if your runner exposes
// describe/it/expect as globals (Jest does by default).
import { describe, expect, it } from 'vitest';
import { calculateTieredPrice } from './pricing';

describe('calculateTieredPrice', () => {
  it('returns the base rate for usage within the first tier', () => {
    expect(calculateTieredPrice(500)).toBeCloseTo(49.00, 2);
  });

  it('adds per-unit cost for usage above the first tier', () => {
    // Base: $49 for up to 1000 units
    // $0.05 per unit above 1000
    // toBeCloseTo(x, 2) compares to within half a cent, so
    // floating-point noise in the implementation can't fail the test
    expect(calculateTieredPrice(1500)).toBeCloseTo(49 + (500 * 0.05), 2);
  });

  it('applies the second tier rate above 5000 units', () => {
    // $0.03 per unit above 5000 (cheaper at scale)
    expect(calculateTieredPrice(6000)).toBeCloseTo(
      49 + (4000 * 0.05) + (1000 * 0.03),
      2
    );
  });

  it('returns 0 for zero usage', () => {
    expect(calculateTieredPrice(0)).toBe(0);
  });

  it('throws for negative usage', () => {
    expect(() => calculateTieredPrice(-1)).toThrow('Usage cannot be negative');
  });
});

Step 2: Give the tests to your AI assistant

The prompt: “These tests define a tiered pricing function. Implement calculateTieredPrice so all tests pass. Don’t modify the tests.”

The key constraint is “don’t modify the tests.” Without this, AI assistants sometimes take the path of least resistance and weaken the tests to make the implementation pass.

Step 3: Run the tests

The AI will generate something. Run the tests immediately. If they fail, paste the test output back:

FAIL src/billing/pricing.test.ts
  ✗ applies the second tier rate above 5000 units
    Expected: 279
    Received: 269.5

Feed this back to the AI: “Test is failing with this output. Here’s the current implementation: [paste code]. Fix it so the test passes.”

Step 4: Refactor with confidence

Once tests pass, refactor. The tests are your safety net. You can improve the implementation without worrying about breaking behavior, because you have specific cases that define what “working” means.
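For reference, here is a minimal sketch of one implementation that satisfies the Step 1 tests. The tier limits and rates come from the comments in those tests. The constant names and the decision to round the result to whole cents are assumptions made for illustration, not part of the original contract.

```typescript
// src/billing/pricing.ts (sketch; constant names are illustrative)
const BASE_PRICE = 49;            // covers the first 1000 units
const FIRST_TIER_LIMIT = 1000;
const SECOND_TIER_LIMIT = 5000;
const FIRST_OVERAGE_RATE = 0.05;  // per unit from 1001 to 5000
const SECOND_OVERAGE_RATE = 0.03; // per unit above 5000

export function calculateTieredPrice(usage: number): number {
  if (usage < 0) {
    throw new Error('Usage cannot be negative');
  }
  if (usage === 0) {
    return 0; // the tests define zero usage as free
  }

  // Units billed at each overage rate; both are zero inside the first tier.
  const firstOverageUnits = Math.max(
    0,
    Math.min(usage, SECOND_TIER_LIMIT) - FIRST_TIER_LIMIT
  );
  const secondOverageUnits = Math.max(0, usage - SECOND_TIER_LIMIT);

  const total =
    BASE_PRICE +
    firstOverageUnits * FIRST_OVERAGE_RATE +
    secondOverageUnits * SECOND_OVERAGE_RATE;

  // Assumption: prices are reported to the cent.
  return Math.round(total * 100) / 100;
}
```

Because the tests pin the behavior, you can swap this for a table-driven version later and rerun the suite to confirm nothing changed.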

Where TDD Pays Off With AI Code

Catching “plausible but wrong” outputs

AI assistants are very good at generating code that looks right. The subtle mistakes are the dangerous ones: an off-by-one error in a date range calculation, a rounding error in a financial calculation, a missing case in a state machine.

Tests catch these before they reach production. Without tests, you’re relying on code review and manual testing to catch issues that a well-written test suite would catch in milliseconds.
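A small illustration of the rounding case. Both function names below are hypothetical, and the sketch assumes every input is already a whole number of cents. The first version looks fine in review; only an exact assertion exposes the difference.

```typescript
// Plausible-looking: sum line items as floating-point dollars.
function sumDollars(items: number[]): number {
  return items.reduce((total, item) => total + item, 0);
}

// Safer: convert each amount to integer cents, sum the integers,
// and convert back to dollars once at the end.
// Assumes each input is already a whole number of cents.
function sumViaCents(items: number[]): number {
  const cents = items.reduce(
    (total, item) => total + Math.round(item * 100),
    0
  );
  return cents / 100;
}

console.log(sumDollars([0.1, 0.2]) === 0.3);  // false
console.log(sumViaCents([0.1, 0.2]) === 0.3); // true
```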

Complex domain logic

For CRUD operations and simple transformations, the AI gets it right consistently and TDD’s value is lower. For complex business logic (billing calculations, authorization rules, data transformation pipelines), TDD becomes essential.

The rule of thumb: if the logic has more than three branching paths, write the tests first. The tests force you to enumerate the paths before you (or the AI) start implementing them.

Boundary conditions

AI assistants are reliably bad at edge cases. They implement the happy path and often skip:

  • Empty inputs
  • Maximum values
  • Null and undefined handling
  • Concurrent modification cases
  • Off-by-one in ranges

Writing tests for boundary conditions before implementation forces you to think about them explicitly. The AI then has explicit targets for those cases.
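Applied to the tiered pricing example, the boundaries are the tier limits themselves. The sketch below lists them as a table and checks any candidate implementation against it. The `PricingCase` type and `findFailures` helper are hypothetical names, and the expected value for a single unit assumes that any non-zero usage in the first tier costs the base rate.

```typescript
type PricingCase = { usage: number; expected: number; note: string };

// Expected values follow the rates from the earlier tests:
// $49 up to 1000 units, $0.05 per unit up to 5000, $0.03 per unit above that.
const boundaryCases: PricingCase[] = [
  { usage: 1, expected: 49, note: 'smallest non-zero usage (assumed base rate)' },
  { usage: 1000, expected: 49, note: 'last unit covered by the base rate' },
  { usage: 1001, expected: 49.05, note: 'first unit billed at $0.05' },
  { usage: 5000, expected: 249, note: 'last unit billed at $0.05' },
  { usage: 5001, expected: 249.03, note: 'first unit billed at $0.03' },
];

// Returns the cases a pricing function gets wrong, to within half a cent.
function findFailures(
  price: (usage: number) => number,
  cases: PricingCase[],
  tolerance = 0.005
): PricingCase[] {
  return cases.filter(
    (c) => Math.abs(price(c.usage) - c.expected) > tolerance
  );
}
```

With Vitest or Jest, the same table can be passed to `it.each` so that each boundary shows up as its own named test.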

Writing Tests for AI-Generated Code You Already Have

Not every team will start fresh with TDD. A more common situation: you have a codebase with sparse tests and you’re now using AI assistants to add features. You want to improve test coverage without slowing down.

The hybrid approach:

  1. Before adding a feature, write tests for the existing behavior the new code will interact with
  2. Have the AI generate the new feature
  3. Write tests for the new feature immediately after generation, before moving on

This is “test-after, but immediately after.” It’s not pure TDD, but it captures most of the benefit: you’re writing tests while the logic is fresh and before moving to the next task.

The AI is also useful here. Prompt: “Given this implementation, write tests that cover the main behaviors, edge cases, and error conditions. Use [Vitest/Jest/pytest] syntax.”

The AI will generate a reasonable test suite. Review it, add cases you know are missing from your domain knowledge, and commit it with the implementation.
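Step 1 of the hybrid approach amounts to pinning down what the existing code does today, quirks included, so that a later change shows up as a failure. A minimal sketch, in which `formatInvoiceLine` is a hypothetical stand-in for whatever untested code the new feature will touch and `findBehaviorChanges` is an illustrative helper:

```typescript
// Hypothetical legacy function standing in for existing, untested code.
function formatInvoiceLine(description: string, amount: number): string {
  return `${description.trim()}: $${amount.toFixed(2)}`;
}

// Outputs captured by running the current code once and reviewing the results.
const pinned: Array<{ args: [string, number]; output: string }> = [
  { args: ['Seats', 49], output: 'Seats: $49.00' },
  { args: ['  Overage ', 25.5], output: 'Overage: $25.50' },
  { args: ['', 0], output: ': $0.00' }, // odd, but it is what the code does today
];

// Lists the inputs whose output no longer matches the pinned value.
function findBehaviorChanges(): string[] {
  return pinned
    .filter(({ args, output }) => formatInvoiceLine(...args) !== output)
    .map(({ args }) => JSON.stringify(args));
}
```

An empty result means the existing behavior is intact. Anything else is either a regression or an intentional change that deserves an updated pinned value.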

Framework Specifics

JavaScript/TypeScript with Vitest

Vitest is a strong default for modern TypeScript projects: it runs TypeScript without a separate compile step, and its API is largely Jest-compatible. The setup is minimal:

// vitest.config.ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    environment: 'node',
    coverage: {
      provider: 'v8',
      reporter: ['text', 'json', 'html'],
    },
  },
});

Run in watch mode while working: vitest --watch. The feedback loop is fast enough to make TDD natural.

Python with pytest

pytest’s parametrize marker handles the “test all the cases” pattern cleanly:

# test_pricing.py
import pytest
from billing.pricing import calculate_tiered_price

@pytest.mark.parametrize("usage, expected", [
    (0, 0),
    (500, 49.00),
    (1000, 49.00),
    (1500, 49 + (500 * 0.05)),
    (6000, 49 + (4000 * 0.05) + (1000 * 0.03)),
])
def test_tiered_pricing(usage, expected):
    assert calculate_tiered_price(usage) == pytest.approx(expected)

def test_negative_usage_raises():
    with pytest.raises(ValueError, match="Usage cannot be negative"):
        calculate_tiered_price(-1)

Parameterized tests are concise enough that adding cases has almost no overhead. When you think of a new edge case, adding it is one line.

What TDD Does Not Fix

TDD does not help with:

  • Unclear requirements. If you don’t know what the function should do, you can’t write a meaningful test for it. TDD forces clarity on behavior, but it can’t create clarity that isn’t there.
  • Integration behavior. Unit tests tell you that individual pieces work. They don’t tell you that the pieces work together in production conditions. Integration and end-to-end tests cover this, but they’re a different practice.
  • Performance. A test suite can verify correctness without verifying that a query returns in under 100ms or that memory usage stays bounded. Performance testing requires different tooling.
  • AI-generated test coverage gaps. If you have the AI write your tests, it may miss cases it doesn’t know about from the problem domain. Review AI-generated test suites critically. Domain knowledge gaps in the AI show up as missing test cases more reliably than as wrong implementations.

A Realistic Assessment

Most developers will not do strict red-green-refactor TDD for every line of code, even with AI assistance. The discipline required is real and the cognitive overhead of writing tests first is higher than writing code first.

The more honest goal: write tests before the code that matters most. Business logic, data transformations, billing calculations, authorization rules. These are the places where bugs cause real problems and where TDD’s design-forcing function is most useful.

For utility functions, configuration parsing, and straightforward CRUD operations, write tests after generation, immediately, before the context switches. That’s good enough and achievable consistently.

The underlying point is that AI-generated code needs tests more than hand-written code, not less. Speed creates a false confidence. Tests are the check on that confidence.
