Why Software Estimates Are Always Wrong (And How to Make Them Less Wrong)

Every developer who has worked on a client project has experienced this: you estimated 40 hours, the project took 80, the client is unhappy, and you’re not entirely sure what happened. You run the post-mortem and find the same culprits as last time. Then you make the same mistakes on the next project.

This isn’t primarily a skill problem. It’s a process problem. Estimation is a skill, but without a process that forces you to make your assumptions explicit, skill alone doesn’t help much.

Why Estimates Go Wrong

Bad estimates fail in predictable ways. Understanding which failure mode is hitting your estimates is the first step to fixing them.

Optimism bias. People estimate how long things take when everything goes right. They don’t add time for when things go wrong. Things always go wrong. The fix isn’t to add a flat 20% buffer at the end. It’s to estimate the specific risks.

Scope creep the estimator didn’t notice. The estimate covered “user authentication” but the client assumed that included SSO, forgot to mention that the existing LDAP system needed integration, and expected a password-reset flow that wasn’t specified. None of these were in the estimate.

Unknown dependencies. The estimate assumed the third-party payment API would behave as documented. The third-party payment API’s sandbox was broken for two weeks.

Forgotten overhead. Code review takes time. Testing takes time. Deployment takes time. Team communication takes time. Many estimates are estimates of coding time, not project time.

The planning fallacy. Even when people know about the planning fallacy and consciously try to correct for it, they don’t correct enough. Kahneman and Tversky described it in the 1970s, and nobody has found a reliable way to think your way out of it. Process is the answer, not better introspection.

The Reference Class Fix

The most reliable way to estimate a project is to find similar past projects and use those as your reference. This is called reference class forecasting.

“This project is like the e-commerce site we built in Q3 2025, which took 340 hours. This one has similar scope but a more complex product catalog. I’d expect 380-420 hours.”

This is less satisfying than bottoms-up estimation (breaking the project into tasks and summing hours), but it tends to be more accurate. Bottoms-up estimates miss things; reference class estimates already include what those missed things actually cost you on past projects.

The practical requirement: you need to track actual hours on past projects. If you don’t, start now. Even a rough log (“this project was 300 hours”) gives you something to work with within six months.
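A minimal sketch of how a rough project log can feed a reference class estimate. The PastProject fields, the "kind" tag, and the scope_factor parameter are assumptions made for illustration, and every figure in the log except the 340-hour project from the example above is invented.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class PastProject:
    name: str
    kind: str            # hypothetical tag, e.g. "e-commerce"
    actual_hours: float  # tracked actuals, not the original estimate


def reference_class_range(log, kind, scope_factor=1.0):
    """Return (low, typical, high) hours from past projects of the same kind.

    scope_factor is a judgment call: 1.0 means "same scope as the
    reference projects", 1.2 means "about 20% more".
    """
    hours = [p.actual_hours for p in log if p.kind == kind]
    if not hours:
        raise ValueError(f"no tracked projects of kind {kind!r}")
    return (
        round(min(hours) * scope_factor),
        round(median(hours) * scope_factor),
        round(max(hours) * scope_factor),
    )


# Illustrative log; only the 340-hour figure comes from the example above.
log = [
    PastProject("Q3 2025 e-commerce site", "e-commerce", 340),
    PastProject("Earlier storefront", "e-commerce", 300),
    PastProject("Marketplace rebuild", "e-commerce", 410),
    PastProject("Internal dashboard", "dashboard", 120),
]

low, typical, high = reference_class_range(log, "e-commerce", scope_factor=1.2)
print(f"{low}-{high} hours, most likely around {typical}")
```

The median is used rather than the mean so that one unusually painful project does not dominate the baseline. With only a handful of logged projects either choice is rough, which is one more reason to quote a range.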

Bottoms-Up Estimation Without the Lies

Bottoms-up estimation isn’t wrong; it’s just frequently done badly. Done well, it’s a useful sanity check against your reference class estimate.

The technique: break the project into the smallest units you can clearly describe, estimate each unit, then add them up. The problem is that small units get estimated optimistically and the “glue” work between units gets omitted entirely.

A more honest bottoms-up estimate:

List every deliverable, not every task. User registration, user login, password reset, social login, admin user management, session handling. Each deliverable has implicit tasks: design, implementation, testing, review, bug fixes.
Give three estimates for each deliverable. Best case (nothing goes wrong), likely (your honest guess), worst case (external dependency breaks, requirements change, edge case discovered late). Use a weighted average of the three that leans slightly toward the worst case.
Add overhead explicitly. Project management, client communication, code review, deployment work, and documentation. These are typically 20-30% of development time. Add them as a separate line item rather than hoping they’ll fit in your per-deliverable estimates.
Identify the unknowns. Every project has things you don’t know yet. “We don’t know what the existing payment integration looks like” or “we haven’t seen the design files yet.” Add a contingency for each unknown.
Don’t round down. When you’re looking at a range of 180-220 hours and you’re tempted to quote 180 because 220 feels high, quote 220. The discomfort you feel quoting the honest number is a signal, not a reason to change it.
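The steps above can be sketched in a few lines. This is one possible arrangement, not a standard formula: the weights (2, 2, 3) are an assumed way to lean "slightly toward worst case", the 25% overhead rate is simply a value inside the 20-30% range mentioned above, and the Deliverable and project_estimate names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Deliverable:
    name: str
    best: float    # hours if nothing goes wrong
    likely: float  # your honest guess
    worst: float   # hours if dependencies break or requirements change

    def expected(self, weights=(2, 2, 3)):
        """Weighted average of the three estimates, leaning toward worst case.

        The default weights are an assumption; tune them against your
        own tracked actuals.
        """
        w_best, w_likely, w_worst = weights
        total = w_best * self.best + w_likely * self.likely + w_worst * self.worst
        return total / (w_best + w_likely + w_worst)


def project_estimate(deliverables, contingency_hours, overhead_rate=0.25):
    """Sum deliverables, then add overhead and per-unknown contingency
    as separate line items.

    contingency_hours maps each named unknown to the hours reserved for it.
    """
    development = sum(d.expected() for d in deliverables)
    overhead = development * overhead_rate
    contingency = sum(contingency_hours.values())
    return {
        "development": development,
        "overhead": overhead,
        "contingency": contingency,
        "total": development + overhead + contingency,
    }


# Illustrative numbers only.
deliverables = [
    Deliverable("User registration", best=8, likely=12, worst=24),
    Deliverable("Password reset", best=4, likely=6, worst=14),
]
unknowns = {"Existing payment integration not yet seen": 16}

for line_item, hours in project_estimate(deliverables, unknowns).items():
    print(f"{line_item:>12}: {hours:.1f} h")
```

Keeping overhead and contingency as their own entries in the result mirrors the advice above: they stay visible in the quote instead of being silently squeezed into per-deliverable numbers.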

T-Shirt Sizing for Early Estimates

Sometimes a client asks for an estimate before you have enough information to be precise. T-shirt sizes give you a way to answer without pretending to a precision you don’t have.

Size	Hour range	What it means
XS	Under 20 hours	A defined task with no unknowns
S	20-60 hours	A small feature or well-defined module
M	60-150 hours	A medium feature with some unknowns
L	150-400 hours	A significant chunk of work; needs more discovery
XL	Over 400 hours	Needs to be broken down before estimating

T-shirt sizes are useful for roadmap planning and early conversations. They’re not a substitute for a detailed estimate before signing a contract.
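If you already have a rough hour figure and want to report it as a size, the table can be encoded directly. A small sketch follows; because adjacent ranges in the table share their endpoints (60 and 150), the assumption here is that a value exactly on a boundary takes the smaller size.

```python
def tshirt_size(hours):
    """Map a rough hour figure to a T-shirt size from the table above.

    Boundary handling is an assumption: "Under 20" and "Over 400" are taken
    literally, and shared endpoints (60, 150) fall into the smaller size.
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    if hours < 20:
        return "XS"
    for upper_bound, size in ((60, "S"), (150, "M"), (400, "L")):
        if hours <= upper_bound:
            return size
    return "XL"


for estimate in (12, 45, 150, 420):
    print(estimate, "hours ->", tshirt_size(estimate))
```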

The mistake teams make: giving a T-shirt size in a sales conversation and then never revisiting it. “You said it would be a medium” becomes a commitment the client holds you to. Either size more conservatively in sales conversations or be explicit that the size will be revised once you have full requirements.

Splitting the Discovery Phase

Many estimation problems come from estimating implementation before understanding what’s being built. The fix is a paid discovery phase.

Discovery is 2-5 days of structured work: reviewing requirements, interviewing stakeholders, mapping out the technical architecture, identifying external dependencies, and writing a detailed spec or scope document. At the end, you have enough information to give a reliable estimate.

The discovery phase output is a fixed-scope deliverable (the spec), not a commitment to build the project at a specific price. After discovery, both parties know enough to sign a realistic contract.

Clients sometimes push back on paying for discovery. The reframe: discovery is what separates a reliable estimate from a guess. If you’re billing fixed-price, your estimate risk is real. You absorb the overruns. Discovery reduces that risk. If the client won’t pay for discovery, you’re being asked to absorb estimation risk without the information needed to estimate accurately.

Communicating Uncertainty to Clients

Clients often ask for a single number. “How much will this cost?” The truthful answer is a range, but ranges make clients uncomfortable.

How to present a range without losing the business:

Anchor on the likely case, disclose the range. “We estimate this at around $45,000. Depending on what we find during integration with your existing systems, it could run $40,000-$55,000. I can give you a tighter number after we’ve done the discovery work.”

Separate what’s fixed from what’s variable. “The authentication and user management work is well-defined; we can commit to a fixed price of $18,000 for that. The reporting module depends on what data is available from the API; that’s estimated at $8,000-$14,000.”

Explain what would cause the high end. Clients hear “worst case” as “this is what you’ll pay.” They hear “this is what happens if X and Y” as a contingency plan they might not need. Frame the range around specific scenarios, not abstract risk.

The Most Underused Tool: The Change Request

The biggest source of budget overruns on fixed-price projects isn’t bad estimation. It’s scope change that isn’t priced. Requirements change. Clients remember things they forgot to mention. Designs evolve. New stakeholders have opinions.

Every change to the agreed scope should trigger a change request: a short document that describes the change, the estimated additional cost, and requires client approval before work begins.

This protects both sides. The client gets visibility into what changes cost. The agency neither absorbs scope changes silently nor surprises the client with a bill for overruns at the end of the project.

Teams that consistently use change requests tend to have fewer disputes at project end than teams that absorb scope changes and hope the project comes in under budget.
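A change request can be as simple as a record with three fields. The sketch below is a hypothetical structure, not a prescribed template: the class and method names are invented for illustration, and the only rule it enforces is the one described above, that an unapproved change adds nothing to the agreed budget and should not be started.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeRequest:
    description: str
    additional_cost: float
    approved: bool = False  # flipped only after the client signs off


@dataclass
class FixedPriceProject:
    base_price: float
    change_requests: list = field(default_factory=list)

    def request_change(self, description, additional_cost):
        """Record a scope change; it stays unapproved until the client agrees."""
        change = ChangeRequest(description, additional_cost)
        self.change_requests.append(change)
        return change

    def agreed_budget(self):
        """Base price plus approved changes only."""
        approved_total = sum(
            c.additional_cost for c in self.change_requests if c.approved
        )
        return self.base_price + approved_total

    def pending_changes(self):
        """Changes nobody should be working on yet."""
        return [c for c in self.change_requests if not c.approved]


# Illustrative figures only.
project = FixedPriceProject(base_price=18_000)
sso = project.request_change("Add SSO to authentication", 2_500)
print(project.agreed_budget())  # still the base price; SSO is not approved
sso.approved = True
print(project.agreed_budget())  # now includes the SSO change
```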

The Number That Actually Matters

Accuracy of a single estimate is less important than accuracy over many estimates. If you’re right on average (sometimes coming in under, sometimes over), you’re doing it correctly. Perfect estimates are not the goal.

Track actuals against estimates on every project. Not to blame anyone, but to understand your team’s systematic biases. Do you underestimate frontend work? Overestimate integration complexity? Are your estimates for React projects more accurate than estimates for Python projects?
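One simple way to surface those biases is the ratio of actual to estimated hours, grouped by category. This is a rough sketch: the record layout and the bias_by_category name are assumptions, and the sample data is invented.

```python
from collections import defaultdict
from statistics import mean


def bias_by_category(records):
    """Average actual/estimated ratio per category.

    records: iterable of (category, estimated_hours, actual_hours).
    A ratio above 1.0 means that kind of work tends to be underestimated;
    below 1.0 means it tends to be overestimated.
    """
    ratios = defaultdict(list)
    for category, estimated, actual in records:
        if estimated <= 0:
            raise ValueError("estimated hours must be positive")
        ratios[category].append(actual / estimated)
    return {category: mean(values) for category, values in ratios.items()}


# Invented sample data for illustration.
history = [
    ("frontend", 40, 80),
    ("frontend", 20, 20),
    ("integration", 100, 50),
]

for category, ratio in bias_by_category(history).items():
    print(f"{category}: actuals run at {ratio:.2f}x the estimate")
```

Averaging ratios treats a small task and a large one equally. If large projects dominate your risk, dividing total actual hours by total estimated hours per category is a reasonable alternative.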

The teams that get better at estimation over time are the ones that close the feedback loop.
