Building Scalable Applications: Lessons from the Trenches

August 22, 2024

by Chelsea Hagon, Technical Director

Scale Isn't Just About Traffic

When clients say "we need this to scale," they usually mean handling lots of users. But we've learned that's only one dimension of scale.

Last year, we built a platform for a growing startup. They expected 10,000 users in year one, maybe 50,000 in year two. We built for that. What we didn't anticipate? Their business model changed three times. Features were added, removed, and reimagined. The product team doubled.

The system handled the user growth fine. But we spent months refactoring because we hadn't built for complexity scale—the ability to evolve as the business learns what it's actually building.

True scalability means handling growth in users, data, features, team size, and business model changes. All at once. That requires different thinking than just "make it fast."

The Database Is Usually the Bottleneck

We've diagnosed performance issues across dozens of applications. Want to know the most common culprit? Database queries that made sense at 1,000 users but fall apart at 50,000.

A fintech client came to us with a dashboard taking 40 seconds to load. The code was fine. The infrastructure was fine. But there were 18 separate database queries, some hitting tables with millions of rows and no indexes.

We added proper indexes, combined queries where possible, and introduced strategic caching. Load time dropped to 800 ms. No infrastructure changes and no rewrite, just an understanding of how databases actually behave under load.

If you're building something that needs to scale, invest time in understanding your data access patterns early. N+1 queries are cute at 100 users. At 100,000 users, they're existential.
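
To make the N+1 pattern concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts and transactions tables, the index name, and the sample rows are assumptions made up for illustration; they are not the fintech client's schema. The first function issues one query for the list plus one per row, while the second gets the same totals from a single JOIN over an indexed column.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a small in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE transactions (
            id INTEGER PRIMARY KEY,
            account_id INTEGER NOT NULL REFERENCES accounts(id),
            amount INTEGER NOT NULL
        );
        -- Index the column that lookups and joins filter on.
        CREATE INDEX idx_transactions_account ON transactions(account_id);
        """
    )
    conn.executemany(
        "INSERT INTO accounts (id, name) VALUES (?, ?)",
        [(1, "alpha"), (2, "beta"), (3, "gamma")],
    )
    conn.executemany(
        "INSERT INTO transactions (account_id, amount) VALUES (?, ?)",
        [(1, 100), (1, 250), (2, 40), (3, 5), (3, 15)],
    )
    return conn


def totals_n_plus_one(conn: sqlite3.Connection):
    """N+1 pattern: one query for the accounts, then one more per account."""
    queries = 1
    accounts = conn.execute("SELECT id, name FROM accounts").fetchall()
    totals = {}
    for account_id, name in accounts:
        row = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM transactions WHERE account_id = ?",
            (account_id,),
        ).fetchone()
        queries += 1
        totals[name] = row[0]
    return totals, queries


def totals_single_query(conn: sqlite3.Connection):
    """Same result from a single JOIN with GROUP BY."""
    rows = conn.execute(
        """
        SELECT a.name, COALESCE(SUM(t.amount), 0)
        FROM accounts a
        LEFT JOIN transactions t ON t.account_id = a.id
        GROUP BY a.id, a.name
        """
    ).fetchall()
    return dict(rows), 1
```

With three accounts the first version runs four queries and the second runs one. The gap grows linearly with the number of rows in the list, which is why the pattern tends to go unnoticed on small datasets.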

Caching Is Hard, But Worth It

Everyone knows caching makes things fast. What nobody tells you is that cache invalidation is genuinely one of the hardest problems in computer science.

We built a content platform where articles could be updated, but those updates needed to show everywhere immediately. Simple, right? Just invalidate the cache on update.

Except articles have related articles. And category pages. And author pages. And search indexes. And API responses. One update cascaded into invalidating dozens of cache keys across multiple systems.

We learned to think about caching in layers:

Short-lived cache (seconds) for rapidly changing data
Medium-lived cache (minutes) for frequently accessed, occasionally updated data
Long-lived cache (hours/days) for static or rarely changing content

And critically: we built invalidation logic from day one. Fast caching with stale data is worse than no caching at all.
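
One way to combine the tiers above with cascading invalidation is to tag every cache entry with the content it depends on. The sketch below is an in-memory illustration only: the TaggedCache class, the TTL constants, and the key and tag names are hypothetical, and a production setup would more likely sit on a shared cache than on a Python dict.

```python
import time
from typing import Any, Callable, Dict, Iterable, Optional, Set, Tuple

# Hypothetical lifetimes for the three tiers, in seconds.
TTL_SHORT = 5
TTL_MEDIUM = 5 * 60
TTL_LONG = 24 * 60 * 60


class TaggedCache:
    """In-memory cache with per-entry TTLs and tag-based invalidation."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self._keys_by_tag: Dict[str, Set[str]] = {}

    def set(self, key: str, value: Any, ttl: float, tags: Iterable[str] = ()) -> None:
        """Store a value and remember which tags it depends on."""
        self._entries[key] = (self._clock() + ttl, value)
        for tag in tags:
            self._keys_by_tag.setdefault(tag, set()).add(key)

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return value

    def invalidate_tag(self, tag: str) -> int:
        """Drop every entry that was stored with this tag; return how many."""
        removed = 0
        for key in self._keys_by_tag.pop(tag, set()):
            if self._entries.pop(key, None) is not None:
                removed += 1
        return removed
```

In this scheme an article page, its category page, and its author page would each be stored with the tag "article:42", so a single invalidate_tag("article:42") call on update clears all three without the update code needing to know every key that exists.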

Build for Observability, Not Just Functionality

Here's a mistake we've made repeatedly: building great features but terrible observability. When things break at scale, "it's slow" or "it's not working" doesn't cut it.

Now we instrument everything from the start:

Performance metrics for every endpoint
Database query timing and frequency
Error rates and types
User behavior flows
System resource utilization

A health monitoring platform we built had intermittent issues we couldn't reproduce locally. But our logs showed a pattern: every failure happened exactly 30 seconds after a specific API call. Turns out, a third-party service had a 30-second timeout we didn't know about.

Without good observability, we'd still be guessing. With it, we found and fixed the issue in an hour.
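
As a rough illustration of per-endpoint instrumentation, the sketch below wraps a handler in a decorator that records call counts, error types, and cumulative latency. The Metrics class, the "GET /report" label, and the get_report handler are hypothetical names; a real deployment would typically export these numbers to a metrics backend rather than hold them in process memory.

```python
import time
from collections import defaultdict
from functools import wraps


class Metrics:
    """Collects call counts, error counts, and total latency per endpoint."""

    def __init__(self) -> None:
        self.calls = defaultdict(int)
        self.errors = defaultdict(int)  # keyed by (endpoint, exception type name)
        self.total_seconds = defaultdict(float)

    def timed(self, name: str):
        """Decorator that records timing and errors under the given name."""

        def decorator(func):
            @wraps(func)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    self.errors[(name, type(exc).__name__)] += 1
                    raise  # never swallow the error, only count it
                finally:
                    self.calls[name] += 1
                    self.total_seconds[name] += time.perf_counter() - start

            return wrapper

        return decorator

    def average_seconds(self, name: str) -> float:
        """Mean latency for an endpoint, or 0.0 if it has not been called."""
        calls = self.calls.get(name, 0)
        return self.total_seconds.get(name, 0.0) / calls if calls else 0.0


metrics = Metrics()


@metrics.timed("GET /report")
def get_report(user_id: int) -> dict:
    """Hypothetical endpoint handler used only to demonstrate the decorator."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return {"user_id": user_id, "status": "ok"}
```

Recording the exception type alongside the endpoint name is what turns "it's not working" into a pattern you can actually search for.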

Vertical Scale Before Horizontal Scale

The industry loves talking about horizontal scaling—just add more servers! But that comes with real complexity: load balancing, session management, distributed state, deployment coordination.

We've found it's almost always better to scale vertically first. A beefier server is simpler than a cluster of smaller ones. Modern cloud instances are powerful—you can get surprisingly far on a single well-configured machine.

One of our most successful projects runs on a single beefy server with good caching and database optimization. It handles 200,000 daily users without breaking a sweat. Could we have built a distributed system? Sure. Would it have been better? Probably not.

Scale horizontally when vertical scaling becomes prohibitively expensive or you need redundancy. Not before.

Async Processing Is Your Friend

The fastest way to make something feel fast is to not do the work immediately.

A document processing platform we built initially tried to do everything synchronously: upload file, scan for viruses, extract text, generate thumbnails, update search index. Users waited 30 seconds watching a loading spinner.

We split it into async jobs. Upload returns immediately. Processing happens in the background with status updates. Perceived performance went from "painfully slow" to "instant," even though the actual work takes the same time.

Users don't mind waiting for things to process. They mind being blocked while watching a spinner. If something doesn't need to happen in the request/response cycle, don't make it.
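
The shape of that change can be sketched with nothing more than a queue and a background thread. This is an assumption-laden illustration, not the platform's actual implementation: JobRunner, its status strings, and the step callables are hypothetical, and a production system would usually use a durable job queue so work survives a restart.

```python
import queue
import threading
import uuid
from typing import Callable, Dict, List


class JobRunner:
    """Accepts work immediately and processes it on a background thread."""

    def __init__(self, steps: List[Callable[[bytes], None]]) -> None:
        self._steps = steps  # e.g. virus scan, text extraction, thumbnails
        self._jobs: queue.Queue = queue.Queue()
        self._status: Dict[str, str] = {}
        self._lock = threading.Lock()
        threading.Thread(target=self._work, daemon=True).start()

    def submit(self, payload: bytes) -> str:
        """Queue the payload and return a job id without waiting for the work."""
        job_id = uuid.uuid4().hex
        self._set_status(job_id, "queued")
        self._jobs.put((job_id, payload))
        return job_id

    def status(self, job_id: str) -> str:
        """Report queued, processing, done, failed, or unknown."""
        with self._lock:
            return self._status.get(job_id, "unknown")

    def wait_until_idle(self) -> None:
        """Block until every submitted job has finished (handy in tests)."""
        self._jobs.join()

    def _set_status(self, job_id: str, value: str) -> None:
        with self._lock:
            self._status[job_id] = value

    def _work(self) -> None:
        while True:
            job_id, payload = self._jobs.get()
            self._set_status(job_id, "processing")
            try:
                for step in self._steps:
                    step(payload)
                self._set_status(job_id, "done")
            except Exception:
                self._set_status(job_id, "failed")
            finally:
                self._jobs.task_done()
```

The upload handler would call submit and return the job id right away; the client then polls status (or receives a push update) while the steps run in the background.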

Plan for Failure

Systems fail. Servers crash, APIs time out, databases lock up, networks hiccup. Scalable applications assume failure and handle it gracefully.

We built a booking system where payment processing occasionally failed due to network issues. Initially, the whole transaction would roll back—terrible user experience.

We rebuilt it to be resilient:

Retry logic with exponential backoff
Idempotent operations so retries are safe
Graceful degradation when services are down
Clear error messages and recovery paths

Now when payment processing hiccups, the system retries automatically, queues the payment for later, or gives the user clear options to retry. The system scales better not because it's faster, but because it fails better.
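
The first two items in that list work as a pair, which a short sketch can show. Everything named here is hypothetical: TransientError, retry_with_backoff, and FakePaymentGateway are stand-ins written for illustration, not a real payment provider's API. The retry helper doubles its delay ceiling on each attempt and adds random jitter, and the gateway uses an idempotency key so a repeated request cannot charge twice.

```python
import random
import time
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are safe to retry."""


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry on TransientError, doubling the delay ceiling on each attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))  # jitter spreads out retries
    raise AssertionError("unreachable")


class FakePaymentGateway:
    """Stand-in gateway that fails a set number of times, then succeeds."""

    def __init__(self, failures_before_success: int) -> None:
        self._failures_left = failures_before_success
        self._receipts: Dict[str, str] = {}

    @property
    def charge_count(self) -> int:
        return len(self._receipts)

    def charge(self, idempotency_key: str, amount_cents: int) -> str:
        # A repeated key returns the original receipt instead of charging again.
        if idempotency_key in self._receipts:
            return self._receipts[idempotency_key]
        if self._failures_left > 0:
            self._failures_left -= 1
            raise TransientError("simulated network hiccup")
        receipt = f"receipt-{idempotency_key}"
        self._receipts[idempotency_key] = receipt
        return receipt
```

A booking flow would call retry_with_backoff(lambda: gateway.charge(booking_id, amount)) and fall back to queueing or a user-facing retry option only once the final attempt raises.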

Team Scale Matters Most

Here's what no one tells you about scalability: the hardest part isn't technical, it's organizational.

We've seen technically perfect architectures crumble because the team couldn't work effectively in the codebase. Microservices split across 30 repositories where nobody understands the full system. Clever abstractions that only one person can modify. "Flexible" architectures so complex that simple changes take weeks.

The most scalable codebases have these properties:

New developers can understand core concepts in a week
Common changes follow obvious patterns
The system can be understood in pieces, not just as a whole
Documentation exists and is maintained
Testing is straightforward and fast

Technical scalability without team scalability is just technical debt waiting to compound.

What Actually Matters

After building dozens of systems that need to scale, here's what we focus on:

Start simple. Over-engineering for scale you don't have yet is the fastest way to fail. Build for today's needs, but structure it so tomorrow's needs don't require a rewrite.

Measure everything. You can't optimize what you don't measure. Instrument early, watch the metrics, and let data guide optimization.

Understand your bottlenecks. Usually it's the database. Sometimes it's the network. Rarely is it the code itself. Find the real bottleneck before optimizing.

Plan for complexity. User growth is predictable. Business model changes aren't. Build systems that can evolve as you learn what you're actually building.

Scalability isn't a feature you add later. It's a mindset you build with from the start. Not by over-engineering, but by thinking clearly about what will actually matter as you grow.
