PostgreSQL Query Optimization Guide | EXPLAIN · Indexes

TL;DR

PostgreSQL performance problems follow predictable patterns — missing indexes, N+1 queries, bloated tables, and misconfigured connection pools. This guide teaches you to find and fix each one systematically.
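The N+1 pattern mentioned above is worth seeing concretely. A minimal sketch, assuming `users` and `orders` tables like the ones used throughout this guide (the `is_active` column is illustrative):

```sql
-- ❌ N+1: one query for the list, then one query per row (100 round trips)
SELECT id FROM users WHERE is_active = true LIMIT 100;
-- ...then, for each returned id, the application runs:
-- SELECT count(*) FROM orders WHERE user_id = $1;

-- ✅ One round trip: join and aggregate on the server
SELECT u.id, count(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.is_active = true
GROUP BY u.id
LIMIT 100;
```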

Systematic Performance Approach

1. Find slow queries (pg_stat_statements, logs)
2. Understand execution plan (EXPLAIN ANALYZE)
3. Fix: missing index, bad query, table bloat, config
4. Verify (re-run EXPLAIN ANALYZE, monitor)
5. Repeat

1. Finding Slow Queries

-- Enable the extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 10 slowest queries by total time
SELECT
  round(total_exec_time::numeric, 2) AS total_ms,
  calls,
  round(mean_exec_time::numeric, 2) AS avg_ms,
  left(query, 200) AS query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- High average time (optimization candidates)
SELECT
  round(mean_exec_time::numeric, 2) AS avg_ms,
  calls,
  left(query, 200) AS query
FROM pg_stat_statements
WHERE calls > 100
ORDER BY mean_exec_time DESC
LIMIT 20;
# postgresql.conf: log queries over 1 second
log_min_duration_statement = 1000
log_checkpoints = on
track_io_timing = on

2. EXPLAIN ANALYZE

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.id, u.email, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2026-01-01'
GROUP BY u.id, u.email
ORDER BY order_count DESC
LIMIT 20;

Red flags in output:

  • Seq Scan on large tables → needs an index
  • actual rows >> estimated rows → stale statistics, run ANALYZE
  • Sort Method: external merge → increase work_mem
  • Buffers: hit=0, read=N → I/O bound, increase shared_buffers
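A sketch of how each red flag above maps to a fix — table and column names here are illustrative, and session-level `SET` changes should be verified before being promoted to postgresql.conf:

```sql
-- Stale statistics: refresh the planner's row estimates
ANALYZE users;

-- Sort spilled to disk: raise work_mem for this session, then re-run
-- EXPLAIN (ANALYZE, BUFFERS) on the slow query to confirm an in-memory sort
SET work_mem = '256MB';

-- Seq Scan on a large table: index the filtered column
CREATE INDEX CONCURRENTLY idx_users_created_at ON users(created_at);
```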

3. Index Strategies

B-tree indexes

-- Single column
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);

-- Composite (most selective column first)
CREATE INDEX CONCURRENTLY idx_users_status_created
  ON users(status, created_at);

-- Covering index (index-only scan — no table access)
CREATE INDEX CONCURRENTLY idx_users_status_cover
  ON users(status) INCLUDE (id, email);

CONCURRENTLY builds the index without taking a lock that blocks reads or writes — use it on production tables. Note that it cannot run inside a transaction block, and a failed build leaves behind an INVALID index that must be dropped manually.
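If a concurrent build does fail (deadlock, unique violation, cancellation), the half-built index is left marked invalid and still adds write overhead. A cleanup sketch, using the index name from the example above:

```sql
-- Find invalid indexes left behind by failed concurrent builds
SELECT indexrelid::regclass AS index_name
FROM pg_index
WHERE NOT indisvalid;

-- Drop and retry (DROP INDEX CONCURRENTLY also avoids blocking writes)
DROP INDEX CONCURRENTLY idx_orders_user_id;
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);
```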

Partial indexes

-- Only index active users
CREATE INDEX CONCURRENTLY idx_users_email_active
  ON users(email) WHERE is_active = true;

-- Only index pending jobs
CREATE INDEX CONCURRENTLY idx_jobs_pending
  ON jobs(created_at) WHERE status = 'pending';
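A partial index is only usable when the query's WHERE clause implies the index predicate. A quick sketch against the `idx_jobs_pending` index above:

```sql
-- ✅ Can use idx_jobs_pending: the predicate matches
SELECT * FROM jobs
WHERE status = 'pending'
ORDER BY created_at
LIMIT 10;

-- ❌ Cannot use idx_jobs_pending: 'failed' rows were never indexed
SELECT * FROM jobs WHERE status = 'failed';
```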

JSONB and full-text

-- GIN index for JSONB containment queries
CREATE INDEX CONCURRENTLY idx_products_meta ON products USING GIN(metadata);
SELECT * FROM products WHERE metadata @> '{"category": "electronics"}';

-- Full-text search
ALTER TABLE articles ADD COLUMN search_vector tsvector
  GENERATED ALWAYS AS (to_tsvector('english', title || ' ' || body)) STORED;
CREATE INDEX idx_articles_fts ON articles USING GIN(search_vector);
SELECT * FROM articles WHERE search_vector @@ plainto_tsquery('english', 'postgres index');

4. Find Unused and Missing Indexes

-- Unused indexes (candidates for removal)
SELECT
  indexrelname AS index_name,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
  idx_scan AS scans
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE '%_pkey'
ORDER BY pg_relation_size(indexrelid) DESC;

-- Tables with too many sequential scans (missing index)
SELECT
  relname AS table_name,
  seq_scan,
  idx_scan,
  pg_size_pretty(pg_total_relation_size(relid)) AS size
FROM pg_stat_user_tables
WHERE seq_scan > 100
  AND pg_total_relation_size(relid) > 10 * 1024 * 1024
ORDER BY seq_scan DESC
LIMIT 20;

5. Table Bloat and VACUUM

PostgreSQL leaves dead rows after UPDATE/DELETE. VACUUM reclaims them.

-- Check bloat
SELECT
  relname AS table_name,
  n_dead_tup AS dead_rows,
  n_live_tup AS live_rows,
  round(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 1) AS dead_pct,
  last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;

-- Manual vacuum (safe, non-locking)
VACUUM ANALYZE orders;

-- VACUUM FULL (rewrites entire table — locks it)
-- Only during maintenance windows
VACUUM FULL orders;
# postgresql.conf: tune autovacuum for high-write tables
autovacuum_vacuum_scale_factor = 0.01
autovacuum_analyze_scale_factor = 0.005
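The global scale factors above apply to every table. For a single huge, write-heavy table, a per-table override is often the better tool — a sketch, assuming a hypothetical `events` table:

```sql
-- Vacuum when ~1% of events is dead, instead of the 20% global default
ALTER TABLE events SET (
  autovacuum_vacuum_scale_factor = 0.01,
  autovacuum_analyze_scale_factor = 0.005
);
```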

6. Connection Pooling (PgBouncer)

PostgreSQL forks one OS process per connection. Beyond a few hundred connections, the per-process memory and context-switching overhead starts to degrade the whole server.
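To check how close you are to that point, compare live backends against the configured limit — a quick sketch:

```sql
-- Current connections by state
SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state;

-- Configured ceiling
SHOW max_connections;
```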

# pgbouncer.ini
[databases]
myapp = host=db.example.com dbname=myapp

[pgbouncer]
pool_mode = transaction     # recommended
max_client_conn = 1000      # app-facing connections
default_pool_size = 20      # actual PostgreSQL connections
server_idle_timeout = 600
App (1000 connections) → PgBouncer → PostgreSQL (20 connections)

7. postgresql.conf Key Settings

# Memory
shared_buffers = 4GB           # 25% of RAM
effective_cache_size = 12GB    # 75% of RAM
work_mem = 64MB                # per sort/hash (multiply by max connections)
maintenance_work_mem = 512MB   # for VACUUM, CREATE INDEX

# WAL
checkpoint_completion_target = 0.9
max_wal_size = 4GB

# Parallelism
max_parallel_workers_per_gather = 4

# Logging
log_min_duration_statement = 1000
log_lock_waits = on
log_temp_files = 0             # log all temp files (work_mem tuning signal)

8. Common Patterns and Fixes

-- ❌ OFFSET pagination: scans 100,000 rows then discards them
SELECT * FROM posts ORDER BY id DESC LIMIT 20 OFFSET 100000;

-- ✅ Keyset pagination: O(log n) via index
SELECT * FROM posts
WHERE id < :last_seen_id
ORDER BY id DESC LIMIT 20;

-- ❌ Exact COUNT(*) on large table: full scan
SELECT COUNT(*) FROM events;

-- ✅ Approximate count (instant; only as accurate as the last ANALYZE/VACUUM)
SELECT reltuples::bigint FROM pg_class WHERE relname = 'events';

-- ❌ Function on indexed column (can't use index)
SELECT * FROM users WHERE lower(email) = '[email protected]';

-- ✅ Expression index
CREATE INDEX idx_users_email_lower ON users(lower(email));
SELECT * FROM users WHERE lower(email) = '[email protected]';

9. Monitoring Queries

-- Long-running queries
SELECT pid, now() - query_start AS duration, state, left(query, 100) AS query
FROM pg_stat_activity
WHERE state != 'idle' AND query_start < now() - interval '5 seconds'
ORDER BY duration DESC;

-- Kill a query by pid (pg_cancel_backend cancels the query without dropping the connection)
SELECT pg_terminate_backend(12345);  -- replace 12345 with the target pid

-- Lock waits
SELECT
  blocked.pid AS blocked_pid,
  blocking.pid AS blocking_pid,
  left(blocked.query, 80) AS blocked_query,
  left(blocking.query, 80) AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_stat_activity blocking ON blocking.pid = ANY(pg_blocking_pids(blocked.pid));

-- Buffer cache hit ratio (should be > 99%)
SELECT
  round(sum(heap_blks_hit)::numeric /
    NULLIF(sum(heap_blks_hit) + sum(heap_blks_read), 0) * 100, 2) AS cache_hit_pct
FROM pg_statio_user_tables;

Key Takeaways

| Problem | Fix |
| --- | --- |
| Slow SELECT | Index on WHERE/JOIN/ORDER BY columns |
| High write latency | Remove unused indexes |
| Too many connections | PgBouncer connection pooling |
| Table bloat | Tune autovacuum or VACUUM ANALYZE |
| Slow pagination | Keyset instead of OFFSET |
| Low cache hit ratio | Increase shared_buffers |
| Sort spills to disk | Increase work_mem |
| Planner makes bad plans | Run ANALYZE, update statistics |

Master EXPLAIN ANALYZE and pg_stat_statements — 90% of PostgreSQL performance problems are solved with the right indexes and query rewrites, before you ever touch configuration.




Related Posts (Internal Links)

A related post on this topic:

  • [PostgreSQL vs MySQL | Deep Comparison | When to Use Each](/en/blog/database-comparison-postgresql-mysql/)
