Database Scaling Strategies
Techniques and patterns I used to scale PostgreSQL for high-throughput applications while maintaining data integrity, achieving sub-10ms read latency at 50K queries/second.
When Your Database Becomes the Bottleneck
As our e-commerce platform grew to 2 million active users, the single PostgreSQL instance became the critical bottleneck. Write operations were blocking reads, query times climbed sharply as the data grew, and routine maintenance could no longer be performed without downtime.
The database held 500GB+ of data across 200+ tables. Some tables had 100M+ rows with complex join patterns. Simple queries that once took milliseconds now took seconds during peak hours.
Original Architecture (Single Instance)
Multi-Pronged Scaling Strategy
We implemented a comprehensive scaling strategy that addressed different access patterns with appropriate solutions: read replicas for read-heavy workloads, table partitioning for time-series data, and connection pooling for efficient resource utilization.
Scaled Architecture
Key Strategies
- Read Replica Configuration (configuration sketch below)
- Table Partitioning
- Index Optimization
- Query Pattern Optimization (pagination example after the index section)
- Connection Pooling (PgBouncer sketch after the query example)
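Read replicas came first. The block below is a minimal sketch of the SQL-level pieces of streaming replication, assuming physical replication slots and two replicas; the slot names and setting values are illustrative, not our production configuration.

```sql
-- On the primary: enable streaming replication.
-- wal_level, max_wal_senders and max_replication_slots all require a server restart.
ALTER SYSTEM SET wal_level = 'replica';
ALTER SYSTEM SET max_wal_senders = 10;
ALTER SYSTEM SET max_replication_slots = 10;

-- One physical slot per replica so WAL is retained until that replica has replayed it
SELECT pg_create_physical_replication_slot('replica_api_1');
SELECT pg_create_physical_replication_slot('replica_reports_1');

-- On a replica: confirm it is read-only and keep an eye on replication lag
SELECT pg_is_in_recovery();
SELECT now() - pg_last_xact_replay_timestamp() AS replay_lag;
```

The application routes all writes to the primary and sends reads that can tolerate a little replication lag to the replicas. Partitioning handled the time-series tables that drove most of the growth: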
```sql
-- Create partitioned table for orders
CREATE TABLE orders (
    id BIGSERIAL,
    user_id BIGINT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL,
    total_amount DECIMAL(10,2),
    status VARCHAR(20),
    CONSTRAINT orders_pkey PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

-- Create quarterly partitions
CREATE TABLE orders_2024_q1 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');

CREATE TABLE orders_2024_q2 PARTITION OF orders
    FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');

-- Automated partition creation with pg_partman
SELECT partman.create_parent(
    p_parent_table := 'public.orders',
    p_control      := 'created_at',
    p_type         := 'native',
    p_interval     := '3 months',
    p_premake      := 3
);
```
```sql
-- Partial index for active orders only (80% less storage).
-- Note: CONCURRENTLY cannot be used on a partitioned parent table; if a non-blocking
-- build is needed, create the index concurrently on each partition and attach it.
CREATE INDEX idx_orders_active
    ON orders (user_id, created_at DESC)
    WHERE status NOT IN ('completed', 'cancelled');

-- Covering index to avoid table lookups
CREATE INDEX idx_orders_user_summary
    ON orders (user_id, created_at DESC)
    INCLUDE (total_amount, status);

-- Expression index for case-insensitive search
CREATE INDEX CONCURRENTLY idx_users_email_lower
    ON users (LOWER(email));

-- BRIN index for time-series data (1/100th the size of a B-tree)
CREATE INDEX CONCURRENTLY idx_events_timestamp
    ON events USING BRIN (created_at) WITH (pages_per_range = 32);
```
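Query pattern changes leaned on those indexes. As one illustration (not our exact production query; the user id and timestamp literals are placeholders), replacing OFFSET pagination with keyset pagination lets the (user_id, created_at DESC) index seek straight to the page boundary instead of scanning and discarding every skipped row:

```sql
-- Before: OFFSET pagination reads and discards every skipped row, so deep pages get slow
SELECT id, created_at, total_amount, status
FROM orders
WHERE user_id = 42
ORDER BY created_at DESC
LIMIT 50 OFFSET 100000;

-- After: keyset pagination seeks directly to the next page via idx_orders_user_summary
SELECT id, created_at, total_amount, status
FROM orders
WHERE user_id = 42
  AND created_at < '2024-03-15 00:00:00+00'  -- last created_at from the previous page
ORDER BY created_at DESC
LIMIT 50;
```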
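Connection pooling sits in front of all of this. Below is a minimal PgBouncer configuration sketch assuming transaction-level pooling; the hostname, pool sizes, and file paths are illustrative placeholders rather than our production values.

```ini
; pgbouncer.ini (minimal sketch, illustrative values)
[databases]
ecommerce = host=primary.internal port=5432 dbname=ecommerce

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction        ; server connections are reused between transactions
max_client_conn = 2000         ; client connections PgBouncer will accept
default_pool_size = 25         ; real PostgreSQL connections per database/user pair
```

Transaction pooling lets the application open as many connections as it likes while PostgreSQL itself sees only a small, stable pool, which is where the drop from 1,200 direct connections to roughly 100 pooled ones in the results table comes from.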
Performance Improvements
The database scaling initiative transformed our data layer from a bottleneck into a strength. We can now handle 10x the original query load with better response times and significantly lower infrastructure costs per query.
| Aspect | Before | After | Impact |
|---|---|---|---|
| Query P99 Latency | 5,000 ms | 25 ms | 200x faster |
| Large Table Scans | Full table (100M rows) | Single partition (25M rows) | 75% less I/O |
| Index Storage | 45 GB | 12 GB (partial indexes) | 73% smaller |
| Active Connections | 1,200 (maxed) | 100 (pooled) | 92% reduction |
| Failover Time | 15 minutes (manual) | 30 seconds (automatic) | 30x faster |
Lessons Learned
- Partition Early — If your table will grow beyond 10M rows, partition from day one. Retrofitting partitioning on a hot table is painful and risky.
- Indexes Aren't Free — Each index slows down writes and consumes storage. Partial indexes and covering indexes maximize benefit while minimizing overhead.
- Read Replicas Are Easy Wins — Most applications are read-heavy. Offloading reads to replicas is often the simplest path to scaling.
- Connection Pooling is Essential — PostgreSQL's process-per-connection model doesn't scale. PgBouncer is mandatory for any serious deployment.
- Monitor Everything — Tools like pganalyze and pg_stat_statements reveal which queries need optimization (a sample query follows this list). Data beats intuition.
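As a starting point, the pg_stat_statements view answers "which queries are actually costing us the most?" directly. A minimal example, using the column names from PostgreSQL 13+:

```sql
-- Top 10 statements by cumulative execution time
-- (requires the pg_stat_statements extension loaded via shared_preload_libraries)
SELECT query,
       calls,
       round(total_exec_time::numeric, 2) AS total_ms,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```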
Need help scaling your database?
I help teams design and implement database architectures that scale with their business.
Let's Talk