The Slow Death
Our vehicle search was perfect in January. Users typed "Toyota Camry" and got results instantly.
By June, the same search took over a second. By September, it was taking 3 seconds and users were complaining daily.
We hadn't changed any code. So what happened?
What Is Full-Text Search?
Before we dive in, let's understand what "full-text search" actually means.
Normal database search:
"Find me the car where make = 'Toyota'"
Database: "Easy! I'll check the 'make' column for exact matches."
Full-text search:
"Find me cars about 'fuel efficient family sedan'"
Database: "Uh... I need to search descriptions, features,
match similar words, rank by relevance..."
Full-text search is like the difference between:
- Looking up a word in a dictionary (exact match - fast)
- Researching a topic in a library (finding relevant books - slow)
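To make the difference concrete, here's a toy sketch — not how PostgreSQL implements it, just the idea of scoring and ranking versus a single equality check. The data and the scoring rule are invented for illustration:

```python
# Toy illustration of exact match vs. full-text search.
cars = [
    {"make": "Toyota", "desc": "fuel efficient family sedan, great mpg"},
    {"make": "Ford",   "desc": "rugged pickup truck for heavy towing"},
    {"make": "Honda",  "desc": "compact family sedan, efficient engine"},
]

def exact_match(rows, column, value):
    # Dictionary lookup: check one column for equality. Fast.
    return [r for r in rows if r[column] == value]

def full_text_search(rows, query):
    # Library research: score every row by how many query terms
    # its description contains, then rank by that score.
    terms = query.lower().split()
    scored = []
    for r in rows:
        words = set(r["desc"].lower().replace(",", " ").split())
        score = sum(1 for t in terms if t in words)
        if score > 0:
            scored.append((score, r["make"]))
    scored.sort(reverse=True)  # best match first
    return [make for _, make in scored]

print(exact_match(cars, "make", "Toyota"))
print(full_text_search(cars, "efficient family sedan"))
```

The exact match touches one column; the full-text search has to tokenize, score, and sort every candidate row — which is exactly the work that gets expensive at scale.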
Why PostgreSQL Was Great (At First)
PostgreSQL has built-in full-text search. And it's genuinely good.
Think of it like a Swiss Army knife:
SWISS ARMY KNIFE (PostgreSQL)
┌─────────────────────────────────┐
│ │
│ ✓ Store data │
│ ✓ Query data │
│ ✓ Full-text search │
│ ✓ JSON storage │
│ ✓ Geographic queries │
│ │
│ "I can do EVERYTHING!" │
│ │
└─────────────────────────────────┘
DEDICATED SEARCH ENGINE (Elasticsearch)
┌─────────────────────────────────┐
│ │
│ ✓ Search │
│ ✓ Search really fast │
│ ✓ Search at massive scale │
│ │
│ "I do ONE thing perfectly." │
│ │
└─────────────────────────────────┘
For small to medium datasets, the Swiss Army knife is perfect. Why add another tool?
The Data That Killed Us
Here's what we didn't expect:
JANUARY (500,000 vehicles)
Search time: 50ms ✓
Users: Happy
APRIL (1,000,000 vehicles)
Search time: 200ms ✓
Users: Still happy
SEPTEMBER (2,000,000 vehicles)
Search time: 2-3 SECONDS ✗
Users: "Is your site broken??"
What happened?
It wasn't linear growth. It was a cliff.
Think of it like a highway:
TRAFFIC (Data) vs SPEED (Search Time)
Light traffic: 🚗 🚗 🚗 → 70 mph (fast!)
Medium traffic: 🚗🚗🚗🚗🚗🚗 → 50 mph (okay)
Heavy traffic: 🚗🚗🚗🚗🚗🚗🚗🚗🚗🚗 → 30 mph (slower)
Rush hour: 🚗🚗🚗🚗🚗🚗🚗🚗🚗🚗🚗🚗🚗🚗🚗 → STOPPED
You don't gradually slow down.
You hit a wall where everything stops.
What Was Actually Happening
Let me show you the search query users were running:
"Find Toyota Camry, under $30,000, year 2018 or newer, in California"
Sounds simple. Here's what PostgreSQL had to do:
STEP 1: Full-text search
"Find all vehicles matching 'Toyota Camry'"
Result: 150,000 matches
STEP 2: Apply filters
"Which ones are under $30,000?"
"Which ones are 2018 or newer?"
"Which ones are in California?"
Result: 3,000 remaining
STEP 3: Calculate relevance
"How relevant is each one?"
→ Must calculate a score for ALL 150,000 matches
→ Even though we only need 20 results
STEP 4: Sort by relevance
"Put the best ones first"
→ Must sort ALL 150,000 matches
→ Even though we only show 20
STEP 5: Return results
"Here's your first 20 results"
See the problem?
We needed 20 results. PostgreSQL processed 150,000.
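The waste in steps 3 and 4 is easy to demonstrate. This sketch (with invented scores) sorts all 150,000 matches the way the database effectively does, then gets the same 20 results by only ever tracking the best 20 — much closer to what a dedicated search engine does internally:

```python
import heapq
import random

random.seed(42)
# Pretend these are the 150,000 text matches for "Toyota Camry",
# each with a relevance score that had to be computed.
matches = [{"id": i, "score": random.random()} for i in range(150_000)]

# What the database effectively does: score and sort EVERYTHING,
# then throw away all but 20.
full_sort = sorted(matches, key=lambda m: m["score"], reverse=True)[:20]

# What we actually needed: only ever track the current best 20.
top_k = heapq.nlargest(20, matches, key=lambda m: m["score"])

# Same 20 results, without materializing a full 150,000-row sort.
assert [m["id"] for m in full_sort] == [m["id"] for m in top_k]
```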
The Memory Problem
Here's where it gets worse.
When PostgreSQL sorts results, it uses memory. The setting is called work_mem:
Default work_mem = 4MB
WHAT HAPPENS:
150,000 matches × data per match = ~50MB needed
PostgreSQL: "I only have 4MB of RAM for this..."
PostgreSQL: "I'll use the DISK instead."
Using disk is 100x slower than RAM!
I saw this in the query plan:
Sort Method: external merge Disk: 18432kB
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
             THIS IS THE PROBLEM!
Analogy:
Imagine you're organizing 150,000 index cards:
- In RAM: Cards on a huge table, shuffle instantly
- On Disk: Cards in filing cabinets, open/close drawers constantly
Disk sorting killed our performance.
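You can estimate in advance whether a sort will spill. A back-of-envelope sketch — the ~350 bytes per row is an assumed figure chosen to match the ~50MB above, not a measured one:

```python
# Back-of-envelope check: will this sort fit in work_mem?
WORK_MEM = 4 * 1024 * 1024  # PostgreSQL's default work_mem: 4MB

def sort_spills_to_disk(rows, bytes_per_row=350):
    # Rough memory needed to hold every row being sorted.
    needed = rows * bytes_per_row
    return needed > WORK_MEM, needed

spills, needed = sort_spills_to_disk(150_000)
print(f"need ~{needed / 1024**2:.0f}MB, have 4MB -> spill: {spills}")
# 150,000 matches at ~350 bytes each is ~50MB -- far past the
# 4MB budget, so PostgreSQL falls back to an on-disk merge.
```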
Fixes I Tried (And Why They Failed)
Attempt 1: More Memory
SET work_mem = '256MB'; -- Up from 4MB
Why it failed:
One search = 256MB
10 concurrent searches = 2.5GB
100 concurrent searches = 25GB 💥
We ran out of server memory.
Attempt 2: Smarter Indexes
-- Only index recent vehicles
CREATE INDEX ON vehicles USING GIN(search_vector)
WHERE created_at > NOW() - INTERVAL '1 year';
Why it failed:
Only helped when users searched recent vehicles. Most didn't.
Attempt 3: Pre-computed Views
-- Sketch: pre-bake the expensive per-row search data into a view
CREATE MATERIALIZED VIEW vehicle_search AS
SELECT id, make, model, price, year,
       to_tsvector('english', make || ' ' || model) AS search_vector
FROM vehicles;
Why it failed:
- Took 20 minutes to refresh
- Still slow at 2M records
- Data was always slightly stale
When I Knew It Was Time to Migrate
I asked myself three questions:
Question 1: "Is search a primary feature?"
For us: YES
- Users come specifically to search
- Search is the homepage
- Bad search = users leave
Question 2: "Do we have millions of records?"
For us: YES
- 2M and growing
- Adding 50K per month
Question 3: "Do we need complex queries?"
For us: YES
- Text + price + year + location
- Fuzzy matching (typo tolerance)
- Synonyms (SUV = Sport Utility Vehicle)
If you answered YES to all three, PostgreSQL full-text search probably isn't enough.
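Those three questions fold into a tiny checklist. This is a sketch of my rule of thumb, not a formal sizing formula:

```python
def search_backend(primary_feature: bool,
                   millions_of_records: bool,
                   complex_queries: bool) -> str:
    # Rule of thumb: only three YES answers together justify
    # running (and operating) a second service.
    if primary_feature and millions_of_records and complex_queries:
        return "consider Elasticsearch"
    return "PostgreSQL is probably fine"

print(search_backend(True, True, True))    # our situation in September
print(search_backend(True, False, False))  # small data: stay put
```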
The Migration to Elasticsearch
I moved search to Elasticsearch. Think of it like this:
BEFORE: One building does everything
┌────────────────────────────────────┐
│ PostgreSQL │
│ │
│ Storage ✓ Search ✗ (slow) │
│ │
└────────────────────────────────────┘
AFTER: Specialized buildings
┌──────────────────┐ ┌──────────────────┐
│ PostgreSQL │ │ Elasticsearch │
│ │ │ │
│ Storage ✓ │───►│ Search ✓ │
│ (truth) │ │ (speed) │
└──────────────────┘ └──────────────────┘
PostgreSQL stores the data.
Elasticsearch searches it.
Best of both worlds.
The Sync Process
When data changes in PostgreSQL, we update Elasticsearch:
from django.db.models.signals import post_save
from django.dispatch import receiver

# When a vehicle is saved
@receiver(post_save, sender=Vehicle)
def sync_to_search(sender, instance, **kwargs):
    # Tell Elasticsearch about the change
    update_search_index(instance)
The Results
Same 2M records, completely different experience:
                     PostgreSQL    Elasticsearch
Simple search:       800ms-2s      15-30ms
With filters:        1-3s          20-50ms
With location:       3-5s          30-60ms
10 users at once:    SLOW          FAST
100 users at once:   BROKEN        STILL FAST
60x faster. And it scales.
Should You Migrate?
Stick with PostgreSQL if:
- You have less than 500K records
- Search is a secondary feature
- Simple queries (no complex filters)
- You can't manage another service
Consider Elasticsearch if:
- Search is a primary feature
- Millions of records
- Complex queries (text + filters + location)
- Need fuzzy matching and synonyms
- Many concurrent users
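Fuzzy matching deserves a concrete example: a query like "camrey" should still find the Camry. Python's standard-library difflib can stand in for the edit-distance matching a search engine does (the model list is invented for illustration):

```python
from difflib import get_close_matches

models = ["camry", "corolla", "civic", "accord", "f-150"]

def fuzzy_lookup(query):
    # Typo tolerance: return the closest model name, if any is
    # within difflib's similarity cutoff.
    return get_close_matches(query.lower(), models, n=1, cutoff=0.6)

print(fuzzy_lookup("camrey"))  # ['camry']
print(fuzzy_lookup("corola"))  # ['corolla']
```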
The Decision Tree
"How important is search to your app?"
│
▼
┌──────────────┐
│ Primary │────► Consider Elasticsearch
│ feature? │
└──────────────┘
│ No
▼
┌──────────────┐
│ < 500K rows? │────► PostgreSQL is fine
└──────────────┘
│ No
▼
┌──────────────┐
│ Complex │────► Consider Elasticsearch
│ queries? │
└──────────────┘
│ No
▼
PostgreSQL is probably fine
Key Lessons
Lesson 1: PostgreSQL Isn't a Search Engine
It's a database that has search features. Big difference.
Like how a smartphone has a camera. Good enough for most. Not good enough for a photographer.
Lesson 2: Watch for the Cliff
Performance doesn't degrade linearly. You'll be fine until suddenly you're not.
Early warning signs:
- work_mem spilling to disk
- Queries returning 100K+ rows before filtering
- Search times doubling every few months
Lesson 3: Plan the Migration Before You're on Fire
It's much easier to add Elasticsearch when things are calm. Don't wait until users are complaining.
Quick Reference
If your PostgreSQL search is slow:
-- Check if you're spilling to disk
EXPLAIN ANALYZE your_slow_query;
-- Look for:
-- "Sort Method: external merge Disk" (BAD)
-- "Rows Removed by Filter: 100000+" (BAD)
Quick fix (temporary):
-- Give more memory to sorting
SET work_mem = '128MB';
-- But watch total memory usage!
When to migrate:
✓ Search is primary feature
✓ Millions of records
✓ Complex filters
✓ Many concurrent users
= Time for Elasticsearch
Summary
THE PROBLEM:
Search got slower as data grew
(50ms → 200ms → 3 seconds)
WHY IT HAPPENED:
PostgreSQL processes ALL matches before filtering
At 2M records, "all matches" = 150,000 rows
THE FIX:
Migrated search to Elasticsearch
PostgreSQL still stores data (truth)
Elasticsearch handles search (speed)
THE RESULT:
60x faster searches
Scales to many concurrent users
PostgreSQL's full-text search isn't broken. It's just not designed for search-heavy applications at scale. Know when you've outgrown it.
Related Reading
- Batch to Event-Driven - How we rebuilt our data pipeline to support real-time search
- Linear Algebra for Web Developers - Understanding vectors and embeddings for semantic search
- Queue Sizing - Managing the queue when search indexing falls behind
