More bad query is worse than less bad query
Stop hiding performance problems and start solving them
The Wrong Way to Fix Performance Problems
Moving It to the Background Doesn't Fix It
When faced with slow operations, there's a natural tendency to reach for the same solution: "Let's move it to the background."
This doesn't solve the problem. It just moves it.
The Background Job Fallacy
Here's what typically happens:
- Feature works great in development (10 records)
- Gets slow in staging (1,000 records)
- Becomes unusable in production (100,000 records)
- Solution: "Let's add a background job!"
Now you have:
- The same slow query running somewhere else
- Added complexity of job queues
- Delayed user feedback
- More infrastructure to maintain
- The same performance problem, just hidden
- Users wondering why their data isn't ready yet
- Support tickets about "missing" data that's just processing
The Cascade of Complications
Once you move something to the background, the problems multiply:
Now You Need Progress Indicators
Users can't see results immediately, so you add progress bars, spinners, "check back later" messages. More code, more complexity.
Now You Need Notifications
When the job finishes, you need to tell users. Email? Push notifications? In-app alerts? More infrastructure.
Now You Need Retry Logic
Background jobs fail. Networks hiccup. Databases go down. So you add retries, exponential backoff, dead letter queues. More complexity.
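To make "more complexity" concrete, here is a minimal sketch of the retry machinery a background job tends to grow. The names `backoff_delays`, `run_with_retries`, and `dead_letters` are hypothetical, and the defaults (1 second base, 60 second cap) are assumptions for illustration, not recommendations.

```python
import time

def backoff_delays(max_retries, base=1.0, cap=60.0):
    """Seconds to wait before retry 1, 2, ...: base * 2**n, capped at `cap`."""
    return [min(cap, base * 2 ** n) for n in range(max_retries)]

def run_with_retries(job, max_retries=3, sleep=time.sleep, dead_letters=None):
    """Call job(); retry on any exception, then park the final failure."""
    delays = backoff_delays(max_retries)
    for attempt in range(max_retries + 1):
        try:
            return job()
        except Exception as exc:
            if attempt == max_retries:
                # Out of retries: hand the failure to a dead-letter list
                # (a stand-in for a real dead letter queue).
                if dead_letters is not None:
                    dead_letters.append(exc)
                return None
            sleep(delays[attempt])

print(backoff_delays(4))  # [1.0, 2.0, 4.0, 8.0]
```

None of this code makes the underlying query any faster. It only decides how many times the slow path gets to run.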
Now You Need Monitoring
Is the queue backed up? Are jobs failing? How long is the average wait? You need dashboards, alerts, on-call rotations.
The Real Problem
The queries you're running are one of the few things that really matter for performance. Everything else is just rearranging deck chairs on the Titanic.
When something is slow, the answer isn't to hide it. The answer is to understand why it's slow and fix it.
Common Cop-Outs
"We'll paginate it"
Great, now the same slow query runs once per page, so it's slow 50 times instead of once. And users have to click through pages to find what they need.
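One way to see this is to ask the database how it plans to fetch a page. The sketch below uses Python's built-in `sqlite3` module as a stand-in for a production database; the `orders` schema, the `status` filter, and the `plan_for_page` helper are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, created_at TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (status, created_at, total) VALUES (?, ?, ?)",
    [("open" if i % 3 else "closed", f"2024-01-{i % 28 + 1:02d}", i * 1.5)
     for i in range(1000)],
)

def plan_for_page(page, page_size=20):
    """Return SQLite's plan steps for one page of the unindexed listing."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id, total FROM orders WHERE status = ? "
        "ORDER BY created_at LIMIT ? OFFSET ?",
        ("open", page_size, page * page_size),
    ).fetchall()
    # The human-readable step description is the last column of each row.
    return [row[-1] for row in rows]

first_page = plan_for_page(0)
later_page = plan_for_page(10)
print(first_page)
print(later_page)
```

Both pages get the same plan: with no usable index, the table is scanned and sorted for every page request. LIMIT and OFFSET only trim what comes back.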
"We'll cache it"
Caching seems like a silver bullet until:
- The cache expires during peak traffic
- You need to invalidate it (cache invalidation is one of the two hard problems in computer science)
- Different users need different data (per-user caching gets expensive fast)
- The first user after expiry hits the slow path and times out
- You realize you're now maintaining two systems: the database and the cache
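The expiry problem is easy to reproduce. This is a minimal sketch of a time-based cache, assuming a hypothetical `TTLCache` class and a `slow_report` function standing in for the slow query; a fake clock replaces real time so the behavior is deterministic.

```python
import time

class TTLCache:
    """Tiny in-process cache: one value per key, each with an expiry time."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = compute()  # the slow query runs here, inside the request
        self._store[key] = (now + self.ttl, value)
        return value

slow_calls = []

def slow_report():
    slow_calls.append(1)  # count how often the slow path is taken
    return {"orders": 100_000}

fake_now = [0.0]
cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])

cache.get_or_compute("report", slow_report)  # miss: slow path
cache.get_or_compute("report", slow_report)  # hit: served from cache
fake_now[0] = 61.0                           # the entry has expired
cache.get_or_compute("report", slow_report)  # miss again: slow path
print(len(slow_calls))  # 2
```

The cache shields the second caller, but whoever arrives after expiry pays the full cost of the original query.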
"We'll pre-compute it"
Now you're running the slow query all the time instead of on-demand. And dealing with stale data. And managing another background job.
"We'll use a faster language"
Your N+1 query problem is still an N+1 query problem in Rust. Bad algorithms are bad in any language.
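For readers who haven't met it, here is a small sketch of the N+1 pattern next to the single-query version. It uses Python's built-in `sqlite3` with a made-up `users`/`orders` schema; the `run` helper and the `stats` counter are hypothetical and exist only to count round trips.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
""")
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 51)])
conn.executemany("INSERT INTO orders (id, user_id, total) VALUES (?, ?, ?)",
                 [(i, (i % 50) + 1, 10.0 * i) for i in range(1, 201)])

stats = {"queries": 0}

def run(sql, params=()):
    """Execute one statement and count it as one round trip."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()

def orders_n_plus_one():
    """One query for the orders, then one more query per order."""
    result = []
    for order_id, user_id, total in run(
            "SELECT id, user_id, total FROM orders ORDER BY id"):
        (name,) = run("SELECT name FROM users WHERE id = ?", (user_id,))[0]
        result.append((order_id, name, total))
    return result

def orders_single_join():
    """The same result from a single JOIN."""
    return run("SELECT o.id, u.name, o.total FROM orders o "
               "JOIN users u ON u.id = o.user_id ORDER BY o.id")

stats["queries"] = 0
slow = orders_n_plus_one()
n_plus_one_queries = stats["queries"]

stats["queries"] = 0
fast = orders_single_join()
join_queries = stats["queries"]

print(n_plus_one_queries, join_queries)  # 201 1
```

Rewriting the loop in a faster language still issues 201 queries. Only changing the query brings it down to one.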
"We'll throw hardware at it"
Scaling vertically has limits. That query taking 30 seconds on 8 cores will take 15 seconds on 16 cores at best, and only if it parallelizes perfectly. Still too slow.
"We'll shard/partition the data"
Now you have distributed systems problems on top of your query problems. Plus the complexity of routing requests to the right shard.
The Right Approach
- Profile First: Use EXPLAIN ANALYZE. Understand what's actually happening.
- Fix the Query: Add the right indexes. Remove unnecessary joins. Fetch only what you need.
- Measure Again: Confirm it's actually faster.
Only after exhausting query optimization should you consider architectural changes.
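The profile, fix, measure loop can be sketched end to end. This example uses SQLite's `EXPLAIN QUERY PLAN` through Python's built-in `sqlite3` as a stand-in for `EXPLAIN ANALYZE` on a production database; the table contents and the `plan` helper are assumptions for illustration, and the exact plan wording varies by SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, created_at TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (user_id, created_at, total) VALUES (?, ?, ?)",
    [(i % 100, f"{2023 + i % 2}-{i % 12 + 1:02d}-15", float(i)) for i in range(5000)],
)

QUERY = "SELECT id, user_id, total FROM orders WHERE created_at > '2024-06-01'"

def plan(sql):
    """Return the planner's step descriptions for a query."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# 1. Profile first: with no index, the planner has to scan the table.
before = plan(QUERY)
rows_before = conn.execute(QUERY).fetchall()

# 2. Fix the query: index the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_orders_created_at ON orders(created_at)")

# 3. Measure again: confirm the plan changed and the results did not.
after = plan(QUERY)
rows_after = conn.execute(QUERY).fetchall()

print(before)
print(after)
```

The plan should move from a full table scan to a search on `idx_orders_created_at`, while the query returns the same rows. On a real system you would also compare timings under production-sized data.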
A Quick Example
Instead of moving this to a background job:
-- Takes 30 seconds
SELECT * FROM orders o
JOIN users u ON o.user_id = u.id
JOIN products p ON o.product_id = p.id
WHERE o.created_at > '2024-01-01';
Fix the actual problem:
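-- Takes 0.3 seconds with proper indexes
-- Lets the planner find recent orders without scanning the whole table
CREATE INDEX idx_orders_created_at ON orders(created_at);
-- Indexes the join keys on orders (helps when the planner starts from users or products);
-- users.id and products.id are assumed to be primary keys, so they are already indexed
CREATE INDEX idx_orders_user_product ON orders(user_id, product_id);
-- Fetch only the columns you need instead of SELECT *
SELECT o.id, u.name AS user_name, p.name AS product_name, o.total
FROM orders o
JOIN users u ON o.user_id = u.id
JOIN products p ON o.product_id = p.id
WHERE o.created_at > '2024-01-01';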
The Bottom Line
Performance problems don't disappear when you move them to the background. They just become someone else's problem - usually yours at 3 AM when the job queue backs up.
Fix the query. It's almost always the query.