PostgreSQL Query Optimization Guide | EXPLAIN, Indexes, and Slow Query Fixes
The Core of This Article
PostgreSQL performance problems follow predictable patterns — missing indexes, N+1 queries, bloated tables, and misconfigured connection pools. This guide teaches you to find and fix each one systematically.
Systematic Performance Approach
1. Find slow queries (pg_stat_statements, logs)
2. Understand execution plan (EXPLAIN ANALYZE)
3. Fix: missing index, bad query, table bloat, config
4. Verify (re-run EXPLAIN ANALYZE, monitor)
5. Repeat
1. Finding Slow Queries
-- Enable the extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Top 10 slowest queries by total time
SELECT
round(total_exec_time::numeric, 2) AS total_ms,
calls,
round(mean_exec_time::numeric, 2) AS avg_ms,
left(query, 200) AS query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
-- High average time (optimization candidates)
SELECT
round(mean_exec_time::numeric, 2) AS avg_ms,
calls,
left(query, 200) AS query
FROM pg_stat_statements
WHERE calls > 100
ORDER BY mean_exec_time DESC
LIMIT 20;
# postgresql.conf: log queries over 1 second
log_min_duration_statement = 1000
log_checkpoints = on
track_io_timing = on
2. EXPLAIN ANALYZE
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.id, u.email, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2026-01-01'
GROUP BY u.id, u.email
ORDER BY order_count DESC
LIMIT 20;
Red flags in output:
- Seq Scan on large tables → needs an index
- actual rows >> estimated rows → stale statistics; run ANALYZE
- Sort Method: external merge → sort spilled to disk; increase work_mem
- Buffers: hit=0, read=N → I/O bound; consider increasing shared_buffers
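These red flags can be checked mechanically. A minimal Python sketch that scans EXPLAIN (ANALYZE, BUFFERS) text output for them (the plan fragment below is invented for illustration; the actual-vs-estimated-rows check would need extra parsing and is omitted):

```python
def find_red_flags(plan_text: str) -> list[str]:
    """Scan EXPLAIN (ANALYZE, BUFFERS) text output for common warning signs."""
    flags = []
    for line in plan_text.splitlines():
        if "Seq Scan" in line:
            flags.append("sequential scan: consider an index")
        if "Sort Method: external merge" in line:
            flags.append("sort spilled to disk: consider raising work_mem")
        if "Buffers:" in line and "hit=0" in line:
            flags.append("no cache hits: I/O bound, check shared_buffers")
    return flags

# Hypothetical plan fragment, not real server output
plan = """\
Seq Scan on orders  (cost=0.00..4580.00 rows=100000 width=16) (actual rows=98231 loops=1)
  Sort Method: external merge  Disk: 10240kB
  Buffers: hit=0 read=1250"""

print(find_red_flags(plan))  # all three flags fire on this fragment
```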
3. Index Strategies
B-tree indexes
-- Single column
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
-- Composite (most selective column first)
CREATE INDEX CONCURRENTLY idx_users_status_created
ON users(status, created_at);
-- Covering index (index-only scan — no table access)
CREATE INDEX CONCURRENTLY idx_users_status_cover
ON users(status) INCLUDE (id, email);
CONCURRENTLY builds the index without blocking writes to the table, so prefer it in production. Caveats: it cannot run inside a transaction block, it takes longer than a plain build, and if it fails it leaves an INVALID index behind (drop it and retry).
Partial indexes
-- Only index active users
CREATE INDEX CONCURRENTLY idx_users_email_active
ON users(email) WHERE is_active = true;
-- Only index pending jobs
CREATE INDEX CONCURRENTLY idx_jobs_pending
ON jobs(created_at) WHERE status = 'pending';
JSONB and full-text
-- GIN index for JSONB containment queries
CREATE INDEX CONCURRENTLY idx_products_meta ON products USING GIN(metadata);
SELECT * FROM products WHERE metadata @> '{"category": "electronics"}';
-- Full-text search
ALTER TABLE articles ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))) STORED;
CREATE INDEX idx_articles_fts ON articles USING GIN(search_vector);
SELECT * FROM articles WHERE search_vector @@ plainto_tsquery('english', 'postgres index');
4. Find Unused and Missing Indexes
-- Unused indexes (candidates for removal)
SELECT
indexrelname AS index_name,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
idx_scan AS scans
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelname NOT LIKE '%_pkey'
ORDER BY pg_relation_size(indexrelid) DESC;
-- Tables with too many sequential scans (missing index)
SELECT
relname AS table_name,
seq_scan,
idx_scan,
pg_size_pretty(pg_total_relation_size(relid)) AS size
FROM pg_stat_user_tables
WHERE seq_scan > 100
AND pg_total_relation_size(relid) > 10 * 1024 * 1024
ORDER BY seq_scan DESC
LIMIT 20;
5. Table Bloat and VACUUM
Under MVCC, UPDATE and DELETE do not overwrite rows in place; they leave dead row versions behind. VACUUM reclaims that space for reuse.
-- Check bloat
SELECT
relname AS table_name,
n_dead_tup AS dead_rows,
n_live_tup AS live_rows,
round(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 1) AS dead_pct,
last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;
-- Manual vacuum (safe, non-locking)
VACUUM ANALYZE orders;
-- VACUUM FULL (rewrites entire table — locks it)
-- Only during maintenance windows
VACUUM FULL orders;
# postgresql.conf: tune autovacuum for high-write tables
autovacuum_vacuum_scale_factor = 0.01
autovacuum_analyze_scale_factor = 0.005
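Autovacuum fires on a table once dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * n_live_tup (the formula from the PostgreSQL docs; defaults are 50 and 0.2). A quick Python sketch of that arithmetic shows why the default scale factor is too lazy for large, high-write tables:

```python
def autovacuum_trigger(n_live_tup: int, scale_factor: float = 0.2,
                       threshold: int = 50) -> int:
    """Dead-tuple count at which autovacuum runs, per PostgreSQL's formula."""
    return threshold + int(scale_factor * n_live_tup)

# Defaults: a 10M-row table accumulates ~2M dead rows before autovacuum runs
print(autovacuum_trigger(10_000_000))        # 2000050
# With the 0.01 scale factor tuned above, it runs ~20x sooner
print(autovacuum_trigger(10_000_000, 0.01))  # 100050
```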
6. Connection Pooling (PgBouncer)
PostgreSQL creates one OS process per connection. At 500+ connections, this overloads the server.
# pgbouncer.ini
[databases]
myapp = host=db.example.com dbname=myapp
[pgbouncer]
pool_mode = transaction # recommended
max_client_conn = 1000 # app-facing connections
default_pool_size = 20 # actual PostgreSQL connections
server_idle_timeout = 600
App (1000 connections) → PgBouncer → PostgreSQL (20 connections)
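How big should the server-side pool be? One widely cited rule of thumb (from the HikariCP/PostgreSQL-wiki pool-sizing guidance) is (cores * 2) + effective spindles; treat it as a starting point to benchmark against, not a law:

```python
def suggested_pool_size(cpu_cores: int, effective_spindles: int = 1) -> int:
    """Rule-of-thumb server-side pool size: (cores * 2) + effective spindles.
    effective_spindles is ~1 for a single SSD; tune from there by measuring."""
    return cpu_cores * 2 + effective_spindles

print(suggested_pool_size(8))  # 17, in the same ballpark as default_pool_size = 20
```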
7. postgresql.conf Key Settings
# Memory
shared_buffers = 4GB # 25% of RAM
effective_cache_size = 12GB # 75% of RAM
work_mem = 64MB # per sort/hash node; one query can use several
maintenance_work_mem = 512MB # for VACUUM, CREATE INDEX
# WAL
checkpoint_completion_target = 0.9
max_wal_size = 4GB
# Parallelism
max_parallel_workers_per_gather = 4
# Logging
log_min_duration_statement = 1000
log_lock_waits = on
log_temp_files = 0 # log all temp files (work_mem tuning signal)
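work_mem applies per sort/hash node rather than per connection, so the worst case multiplies up quickly. A rough back-of-envelope estimate (nodes_per_query = 2 is an illustrative assumption, not a PostgreSQL constant):

```python
def worst_case_work_mem_gb(max_connections: int, work_mem_mb: int,
                           nodes_per_query: int = 2) -> float:
    """Rough upper bound on memory concurrent sorts/hashes can claim,
    on top of shared_buffers. Each connection may run several sort/hash
    nodes at once, each entitled to up to work_mem."""
    return max_connections * work_mem_mb * nodes_per_query / 1024

# 200 connections x 64MB x 2 nodes ≈ 25 GB beyond shared_buffers
print(worst_case_work_mem_gb(200, 64))  # 25.0
```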
8. Common Patterns and Fixes
-- ❌ OFFSET pagination: scans 100,000 rows then discards them
SELECT * FROM posts ORDER BY id DESC LIMIT 20 OFFSET 100000;
-- ✅ Keyset pagination: O(log n) via index
SELECT * FROM posts
WHERE id < :last_seen_id
ORDER BY id DESC LIMIT 20;
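The keyset pattern is easy to mirror in application code. A minimal Python simulation over in-memory rows (hypothetical post records, standing in for the SQL above):

```python
def keyset_page(rows, last_seen_id=None, page_size=20):
    """Keyset pagination over rows by id DESC, mirroring:
    WHERE id < :last_seen_id ORDER BY id DESC LIMIT :page_size"""
    ordered = sorted(rows, key=lambda r: r["id"], reverse=True)
    if last_seen_id is not None:
        ordered = [r for r in ordered if r["id"] < last_seen_id]
    return ordered[:page_size]

posts = [{"id": i} for i in range(1, 101)]  # ids 1..100
page1 = keyset_page(posts, page_size=3)
page2 = keyset_page(posts, last_seen_id=page1[-1]["id"], page_size=3)
print([p["id"] for p in page1])  # [100, 99, 98]
print([p["id"] for p in page2])  # [97, 96, 95]
```

The client keeps only the last id it saw; no page is ever scanned twice, which is what makes deep pagination cheap.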
-- ❌ Exact COUNT(*) on large table: full scan
SELECT COUNT(*) FROM events;
-- ✅ Approximate count (instant; accuracy depends on last VACUUM/ANALYZE)
SELECT reltuples::bigint FROM pg_class WHERE relname = 'events';
-- ❌ Function on indexed column (can't use index)
SELECT * FROM users WHERE lower(email) = '[email protected]';
-- ✅ Expression index
CREATE INDEX idx_users_email_lower ON users(lower(email));
SELECT * FROM users WHERE lower(email) = '[email protected]';
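The intro names N+1 queries as another predictable pattern. A toy Python sketch (CountingDB is a made-up stand-in, not a real driver) contrasting per-row lookups with one batched query, counting round trips:

```python
class CountingDB:
    """Toy stand-in for a connection; counts round trips to the server."""
    def __init__(self):
        self.queries = 0
        self.orders = {1: ["a"], 2: ["b", "c"], 3: []}

    def fetch_orders(self, user_id):
        self.queries += 1  # one SELECT per call
        return self.orders.get(user_id, [])

    def fetch_orders_bulk(self, user_ids):
        self.queries += 1  # one SELECT ... WHERE user_id = ANY(:ids)
        return {uid: self.orders.get(uid, []) for uid in user_ids}

naive = CountingDB()
for uid in [1, 2, 3]:            # ❌ N+1: one query per user
    naive.fetch_orders(uid)
print(naive.queries)             # 3

batched = CountingDB()
batched.fetch_orders_bulk([1, 2, 3])  # ✅ one batched query
print(batched.queries)           # 1
```

With 1,000 users the naive loop costs 1,000 round trips; the batched form still costs one.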
9. Monitoring Queries
-- Long-running queries
SELECT pid, now() - query_start AS duration, state, left(query, 100) AS query
FROM pg_stat_activity
WHERE state != 'idle' AND query_start < now() - interval '5 seconds'
ORDER BY duration DESC;
-- Kill a query (try pg_cancel_backend(pid) first to cancel only the statement)
SELECT pg_terminate_backend(pid);
-- Lock waits
SELECT
blocked.pid AS blocked_pid,
blocking.pid AS blocking_pid,
left(blocked.query, 80) AS blocked_query,
left(blocking.query, 80) AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_stat_activity blocking ON blocking.pid = ANY(pg_blocking_pids(blocked.pid));
-- Buffer cache hit ratio (should be > 99%)
SELECT
round(sum(heap_blks_hit)::numeric /
NULLIF(sum(heap_blks_hit) + sum(heap_blks_read), 0) * 100, 2) AS cache_hit_pct
FROM pg_statio_user_tables;
Key Takeaways
| Problem | Fix |
|---|---|
| Slow SELECT | Index on WHERE/JOIN/ORDER BY columns |
| High write latency | Remove unused indexes |
| Too many connections | PgBouncer connection pooling |
| Table bloat | Tune autovacuum or VACUUM ANALYZE |
| Slow pagination | Keyset instead of OFFSET |
| Low cache hit ratio | Increase shared_buffers |
| Sort spills to disk | Increase work_mem |
| Planner makes bad plans | Run ANALYZE, update statistics |
Master EXPLAIN ANALYZE and pg_stat_statements — 90% of PostgreSQL performance problems are solved with the right indexes and query rewrites, before you ever touch configuration.