Optimization Quick Reference

First Published: Feb 9, 2026 | Last Updated: Feb 28, 2026 | By: Atif Alam

This page helps you choose which optimization to add when you hit a specific pain. Find your pain, get the answer.

How to use this page: Match your pain in the quick chooser below; click a fix to jump to that category’s details (Signals, Goal / Benefit, Tools, Risks).

Quick Chooser (One-Line Rule)

Match your pain to the fix in the table below.

Pain	Fix
Repeated reads are slow	Caching
Can’t explain failures quickly	Observability
DB joins too slow	Replication
Service goes down = everything fails	Asynchronous Processing
Spiky writes / slow jobs on request path	Asynchronous Processing
Many consumers + replay needed	Streaming / Event Log
Can’t trust downstream service	Rate Control
Queries too slow or need full-text / fuzzy search	Search & Indexing
Single DB/shard too hot or write ceiling	Data Partitioning
Ops toil / manual recovery	Coordination & Orchestration
Noisy neighbors or mixed priority traffic	Traffic Shaping
Analytics queries hurting prod DB	Storage Specialization

Details By Category

Caching

Signals: DB/API latency high; read QPS heavy; repeated same queries; spike traffic
Goal / Benefit: Reduce latency + backend load
Tools: Redis/Memcached, CDN, local caches, TTL
Risks: Stale data, invalidation complexity, hot keys, cache stampede
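
As an illustration of the TTL approach listed under Tools, here is a minimal cache-aside sketch. The class name TTLCache, the loader callback, and the injectable clock are assumptions made for this example and do not come from Redis, Memcached, or any other library. It does not address hot keys or cache stampede.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process cache-aside helper with per-entry expiry (hypothetical)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit: the backend is not touched
        value = loader(key)  # miss or expired: fall through to the backend
        self._store[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        # Call this on writes so readers do not wait out the TTL for fresh data.
        self._store.pop(key, None)
```

A shorter TTL bounds staleness at the cost of more backend reads. Explicit invalidation on writes gives fresher data but is where most of the complexity listed under Risks comes from.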

Observability

Signals: Hard to debug; unknown bottlenecks; frequent incidents; blind deploys
Goal / Benefit: Reduce MTTR + make changes safer
Tools: Metrics/logs/traces, dashboards, alerts, SLOs
Risks: Noise/alert fatigue, cost, missing instrumentation

Replication

Signals: Need higher availability; read scaling; DR requirements
Goal / Benefit: Resilience + read scale
Tools: Read replicas, denormalization, multi-AZ DB, quorum systems
Risks: Replication lag, failover complexity, split-brain risk

Asynchronous Processing

Signals: Slow tasks in request path; timeouts; retries explode; dependencies flaky
Goal / Benefit: Decouple + move work off request path
Tools: Queues (SQS/RabbitMQ), decoupling, workers, retries, DLQ
Risks: Eventual consistency, duplication, ordering, failure handling
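
The worker, retry, and DLQ pieces listed under Tools can be sketched with an in-memory queue. This is a simplified model, not the SQS or RabbitMQ API: the function name drain, the (message, attempts) tuple layout, and the max_attempts default are all assumptions for illustration.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def drain(queue: Deque[Tuple[str, int]], handler: Callable[[str], None],
          max_attempts: int = 3) -> List[str]:
    """Process every message; return those that exhausted their retries (the DLQ)."""
    dead_letters: List[str] = []
    while queue:
        message, attempts = queue.popleft()
        try:
            handler(message)
        except Exception:
            attempts += 1
            if attempts >= max_attempts:
                dead_letters.append(message)  # park it for inspection, stop retrying
            else:
                queue.append((message, attempts))  # retry later, behind other work
    return dead_letters
```

Because a failed message is re-enqueued behind other work, it can be processed out of order, and a handler that fails partway through may run more than once. That is why duplication and ordering appear under Risks, and why handlers are usually written to be idempotent.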

Streaming / Event Log

Signals: Many consumers; replay or backfill needed; audit trail; fan-out
Goal / Benefit: Fan-out, replay, durability
Tools: Stream / event log (Kafka, Pulsar, Kinesis, Redpanda)
Risks: Ordering guarantees, retention vs cost, operational complexity
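
To make fan-out and replay concrete, here is a toy append-only log in which each consumer keeps its own offset. The EventLog class and its append, poll, and seek methods are hypothetical names for this sketch. It models the idea only and is not the client API of Kafka, Pulsar, Kinesis, or Redpanda.

```python
from typing import Dict, List


class EventLog:
    """Append-only log; each consumer tracks its own read offset (hypothetical)."""

    def __init__(self) -> None:
        self._events: List[str] = []
        self._offsets: Dict[str, int] = {}

    def append(self, event: str) -> int:
        self._events.append(event)
        return len(self._events) - 1  # offset of the newly appended event

    def poll(self, consumer: str, max_events: int = 100) -> List[str]:
        # Reading does not remove events, so other consumers still see them.
        start = self._offsets.get(consumer, 0)
        batch = self._events[start:start + max_events]
        self._offsets[consumer] = start + len(batch)
        return batch

    def seek(self, consumer: str, offset: int) -> None:
        # Replay or backfill: move this consumer's offset back (clamped to the log).
        self._offsets[consumer] = max(0, min(offset, len(self._events)))
```

The sketch never deletes events. A real log has to choose a retention window, which is the retention vs cost trade-off listed under Risks.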

When to use stream vs queue: see Redis vs Kafka: when to use which.

Rate Control

Signals: Traffic spikes; abuse; downstream overload; SLOs degrade under load
Goal / Benefit: Protect system + fairness
Tools: Rate limits, quotas, circuit breakers, retries with backoff, bulkheads
Risks: False positives, client friction, tuning thresholds
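
One common way to implement a rate limit is a token bucket. The sketch below is a single-process version under assumed names (TokenBucket, allow, rate_per_sec, capacity). A production limiter shared across instances would typically keep this state in a shared store.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, then a steady `rate_per_sec` (hypothetical)."""

    def __init__(self, rate_per_sec: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock  # injectable so tests do not need to sleep
        self.tokens = capacity  # start full so an initial burst is allowed
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # caller should reject or delay (for example, HTTP 429)
```

The two parameters are the thresholds that need tuning: capacity sets how large a burst is tolerated, and rate_per_sec sets the sustained limit. Setting either too low produces the false positives listed under Risks.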

Search & Indexing

Signals: DB queries too slow/complex; full-text needed; filtering+ranking
Goal / Benefit: Fast query experience
Tools: Search index (Elasticsearch/OpenSearch), secondary indexes
Risks: Dual writes, index lag, reindexing cost, relevance tuning
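
The core structure behind a search index is an inverted index, which maps each term to the documents that contain it. The following is a minimal sketch with assumed names (InvertedIndex, add, search) and a deliberately naive tokenizer. It is not the Elasticsearch or OpenSearch API and does no ranking, stemming, or fuzzy matching.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


class InvertedIndex:
    """Maps each lowercase token to the set of document ids containing it."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[int]] = defaultdict(set)

    @staticmethod
    def _tokens(text: str) -> List[str]:
        # Naive tokenizer: lowercase runs of ASCII letters and digits.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, doc_id: int, text: str) -> None:
        for token in self._tokens(text):
            self._postings[token].add(doc_id)

    def search(self, query: str) -> List[int]:
        # AND semantics: a document must contain every query term.
        terms = self._tokens(query)
        if not terms:
            return []
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return sorted(result)
```

The index is a second copy of the data that must be updated whenever the source of truth changes. That is the origin of the dual-write and index-lag items under Risks.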

Data Partitioning

Signals: Single DB/table shard too hot; write throughput ceiling; large datasets
Goal / Benefit: Horizontal scaling (throughput)
Tools: Sharding by key, partitions, consistent hashing
Risks: Cross-shard queries, rebalancing, skew/hot partitions
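
Consistent hashing, listed under Tools, can be sketched as a sorted ring of hashed points. The HashRing class, the vnodes parameter, and the node names in the usage note are hypothetical. SHA-256 is used here only as a convenient stable hash, not because any particular system requires it.

```python
import bisect
import hashlib
from typing import List, Tuple


def _hash(value: str) -> int:
    # Stable across processes, unlike Python's built-in hash() for strings.
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes to smooth out skew (hypothetical)."""

    def __init__(self, nodes: List[str], vnodes: int = 64) -> None:
        self._ring: List[Tuple[int, str]] = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

For example, HashRing(["db-a", "db-b", "db-c"]).node_for("user:42") returns the same shard on every call. Adding a fourth node moves only the keys that now land on the new node, which keeps rebalancing smaller than with hash(key) % N.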

Coordination & Orchestration

Signals: Many services; manual deployments; failovers/scale require humans
Goal / Benefit: Automate lifecycle + coordination
Tools: Orchestration (Kubernetes/Nomad), service discovery, leader election
Risks: Operational complexity, misconfig outages, learning curve

Traffic Shaping

Signals: Multiple request classes; noisy neighbors; need graceful degradation
Goal / Benefit: Prioritize critical traffic
Tools: Traffic shaping (priority queues, load shedding, admission control). Common tech: Envoy, NGINX, Kong, Redis (rate limits / priority queues), Kubernetes ResourceQuota, AWS API Gateway / Cloudflare (edge throttling).
Risks: Starving lower tiers, policy complexity
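
Load shedding and admission control can be as simple as a per-class utilization threshold. The sketch below is one possible policy, not a feature of Envoy, NGINX, or Kong: the class names, the threshold values, and the admit function are all assumptions chosen for illustration.

```python
from typing import Dict

# Hypothetical policy: utilization at or above which each request class is shed.
SHED_THRESHOLDS: Dict[str, float] = {
    "critical": 1.00,  # only rejected when the service is completely full
    "standard": 0.85,
    "batch": 0.60,     # first to be dropped as load rises
}


def admit(request_class: str, in_flight: int, capacity: int) -> bool:
    """Return True if a request of this class should be accepted right now."""
    utilization = in_flight / capacity
    # Unknown classes get the strictest threshold rather than a free pass.
    threshold = SHED_THRESHOLDS.get(request_class, min(SHED_THRESHOLDS.values()))
    return utilization < threshold
```

With these numbers, batch traffic is rejected once the service is 60% busy while critical traffic is still admitted up to full capacity. If load stays above 60% for long periods, batch work never runs, which is the starvation risk noted above.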

Storage Specialization

Signals: One DB can’t meet mixed needs (latency vs analytics vs blobs)
Goal / Benefit: Use the right store per workload
Tools: OLTP DB, separate OLAP store/warehouse, object store, time-series DB
Risks: Data duplication, consistency, ETL complexity

For a stage-by-stage view (MVP → Growth → Advanced) and links to staged examples, see Staged Design Examples.

Next: Redis vs Kafka: when to use which — a worked example of choosing between two common building blocks.