Debugging Memory Leaks: Profilers and Heap Snapshots in Node.js, Python, and Java
The gist of this article
Memory leaks cause servers to grow until they crash. This guide gives you a systematic approach — using profilers and heap snapshots — to find and fix leaks in Node.js, Python, and Java.
Memory Leak vs. High Memory Usage
A memory leak is memory that is allocated but never released because references are retained unintentionally. Symptoms:
- Memory usage increases monotonically over time
- GC pauses become longer and more frequent
- Service eventually crashes (OOM / exit code 137 in containers)
High memory usage that plateaus is not a leak — it may be a sizing issue but doesn’t require leak-hunting techniques.
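To make the distinction concrete, here is a minimal Python sketch (the leaky function and sizes are illustrative) that samples traced heap size while a deliberately leaky operation runs repeatedly. A true leak produces strictly increasing samples; legitimate high usage plateaus instead:

```python
import tracemalloc

_leak = []  # module-level list that retains every allocation (deliberate leak)

def leaky_operation():
    _leak.append(bytearray(100_000))  # 100 KB retained per call, never released

tracemalloc.start()
samples = []
for _ in range(5):
    for _ in range(10):
        leaky_operation()
    current, _peak = tracemalloc.get_traced_memory()
    samples.append(current)  # sample traced heap size after each round
tracemalloc.stop()

# A leak shows up as strictly increasing samples
print(all(b > a for a, b in zip(samples, samples[1:])))  # True
```

The same monotonic-growth check is what you would apply to `rss`/`heapUsed` graphs from production monitoring, just over hours instead of iterations.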
General Approach
1. Confirm the leak — monitor memory over time
2. Isolate the scenario — which request/operation triggers growth?
3. Take heap snapshots before and after
4. Find objects growing between snapshots
5. Trace references back to root cause
6. Fix, deploy, confirm memory stabilizes
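Steps 3–5 can be sketched in Python with `tracemalloc`, which diffs two snapshots and ranks the call sites that grew between them. `trigger_suspected_operation` is a hypothetical stand-in for your real workload:

```python
import tracemalloc

def trigger_suspected_operation(sink):
    # Stand-in for the real request/operation under suspicion:
    # allocates objects and retains them in `sink`
    sink.extend(object() for _ in range(10_000))

retained = []
tracemalloc.start()
before = tracemalloc.take_snapshot()      # step 3: snapshot before
trigger_suspected_operation(retained)     # step 2: isolated scenario
after = tracemalloc.take_snapshot()       # step 3: snapshot after

# Step 4: rank allocation sites by growth between the two snapshots
for stat in after.compare_to(before, 'lineno')[:5]:
    print(stat)
tracemalloc.stop()
```

The top entries of the diff point at the lines whose allocations survived the scenario, which is exactly where step 5's reference tracing starts.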
Node.js
Monitoring memory
```javascript
// Log memory usage every 30 seconds
setInterval(() => {
  const m = process.memoryUsage();
  console.log({
    rss: `${Math.round(m.rss / 1024 / 1024)}MB`, // total process memory
    heapUsed: `${Math.round(m.heapUsed / 1024 / 1024)}MB`,
    heapTotal: `${Math.round(m.heapTotal / 1024 / 1024)}MB`,
    external: `${Math.round(m.external / 1024 / 1024)}MB`,
  });
}, 30000);
```
Watch heapUsed — if it climbs continuously it confirms a heap leak. If rss climbs but heapUsed doesn’t, the leak may be in native code or external (Buffers).
Heap snapshots with Chrome DevTools
```bash
node --inspect app.js
```
Open Chrome → chrome://inspect → click “inspect” → Memory tab → Take heap snapshot.
Take a snapshot, trigger the suspected leak (run 100 requests, etc.), take another snapshot. In the second snapshot, select “Comparison” view to see what grew.
Heap snapshots programmatically
```javascript
import v8 from 'v8';

// Take a snapshot and write it to disk; returns the path of the written file
const snapshotPath = v8.writeHeapSnapshot();
console.log('Snapshot written to:', snapshotPath);

// Or use the heapdump package on older Node versions
// npm install heapdump
// process.kill(process.pid, 'SIGUSR2') → writes a .heapsnapshot file
```
Common Node.js leak patterns
1. Event listeners not removed
```javascript
// LEAK: adds a listener on every request, never removes it
app.get('/data', (req, res) => {
  emitter.on('update', (data) => { // grows unboundedly
    res.json(data);
  });
});

// FIX: use .once() or remove the listener
app.get('/data', (req, res) => {
  emitter.once('update', (data) => {
    res.json(data);
  });
});
```
2. Growing cache with no eviction
```javascript
// LEAK: cache grows forever
const cache = new Map();
app.get('/user/:id', async (req, res) => {
  if (!cache.has(req.params.id)) {
    cache.set(req.params.id, await fetchUser(req.params.id));
  }
  res.json(cache.get(req.params.id));
});

// FIX: use an LRU cache with a size limit
import { LRUCache } from 'lru-cache';
const cache = new LRUCache({ max: 1000, ttl: 1000 * 60 * 5 });
```
3. Closures holding large objects
```javascript
// LEAK: data (potentially large) is captured in the closure
function processLargeFile(filePath) {
  const data = fs.readFileSync(filePath); // large buffer
  return setInterval(() => {
    console.log('File size:', data.length); // data never released
  }, 1000);
}

// FIX: extract only what you need
function processLargeFile(filePath) {
  const size = fs.statSync(filePath).size; // just the number
  return setInterval(() => {
    console.log('File size:', size);
  }, 1000);
}
```
4. Timers keeping references alive
```javascript
// LEAK: interval keeps the object alive forever
class DataProcessor {
  constructor() {
    this.data = new Array(100000).fill('x');
    setInterval(() => this.process(), 1000); // keeps `this` alive
  }
  process() { /* ... */ }
  destroy() {
    // No way to clean up — interval captures `this`
  }
}

// FIX: store the interval reference and clear it
class DataProcessor {
  constructor() {
    this.data = new Array(100000).fill('x');
    this.interval = setInterval(() => this.process(), 1000);
  }
  process() { /* ... */ }
  destroy() {
    clearInterval(this.interval);
    this.data = null;
  }
}
```
Clinic.js (comprehensive profiling)
```bash
npm install -g clinic
clinic heapprofiler -- node app.js
# Run your load test, then Ctrl+C
# Opens a flamegraph showing allocations
```
Python
Monitor memory with tracemalloc
```python
import tracemalloc

tracemalloc.start()

# ... run suspected leaky code ...

snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')

print("Top 10 memory allocations:")
for stat in top_stats[:10]:
    print(stat)
```
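Going a step further, statistics keyed by `'traceback'` attribute each allocation to a full call stack rather than a single line, which helps when the leaking line sits inside a shared helper. A small sketch with illustrative sizes:

```python
import tracemalloc

tracemalloc.start(25)  # keep up to 25 frames per allocation

# Deliberately retained allocations standing in for a leak (~1 MB total)
leaked = [bytearray(10_000) for _ in range(100)]

snapshot = tracemalloc.take_snapshot()
top = snapshot.statistics('traceback')[0]  # biggest allocation site
print(f"{top.count} blocks, {top.size / 1024:.1f} KiB")
for line in top.traceback.format():        # full call stack of that site
    print(line)
tracemalloc.stop()
```

The printed traceback tells you not just where the bytes were allocated but which call path got there, which is usually what identifies the unintended retainer.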
memory_profiler — line-by-line
```bash
pip install memory_profiler
```

```python
from memory_profiler import profile

@profile
def my_function():
    data = [i for i in range(1000000)]
    result = sum(data)
    return result
```

```bash
python -m memory_profiler script.py
```
Output shows memory usage per line — spots exactly where allocations happen.
objgraph — find growing objects
```bash
pip install objgraph
```

```python
import objgraph

# Show most common object types
objgraph.show_most_common_types(limit=20)

# Show what's growing between two points
objgraph.show_growth()

# Trace references to an object
objgraph.show_backrefs(objgraph.by_type('MyClass')[0], max_depth=5)
```
Common Python leak patterns
1. Mutable default argument
```python
# LEAK: the default list is created once and reused across all calls
def add_item(item, items=[]):
    items.append(item)
    return items

# FIX
def add_item(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items
```
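A quick check makes the pitfall visible. With the buggy signature the default list is shared across calls, so earlier items "leak" into later results:

```python
def add_item_buggy(item, items=[]):
    items.append(item)        # mutates the one shared default list
    return items

def add_item_fixed(item, items=None):
    if items is None:
        items = []            # fresh list per call
    items.append(item)
    return items

print(add_item_buggy('a'))    # ['a']
print(add_item_buggy('b'))    # ['a', 'b'] — the previous call's item persists
print(add_item_fixed('a'))    # ['a']
print(add_item_fixed('b'))    # ['b']
```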
2. Circular references with `__del__`

```python
# Cycles are collected by the cyclic GC, but before Python 3.4 a __del__
# method made a cycle uncollectable (fixed by PEP 442); even today,
# cleanup of cycles is delayed until the GC happens to run
class Node:
    def __init__(self):
        self.other = None

    def __del__(self):
        pass

# FIX: use weakref for back-references so no cycle is created at all
import weakref

class Node:
    def __init__(self):
        self._other = None

    @property
    def other(self):
        return self._other() if self._other else None

    @other.setter
    def other(self, value):
        self._other = weakref.ref(value)
```
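To see why the weakref pattern cannot leak: dropping the last strong reference to the partner frees it immediately under CPython's reference counting, and the back-reference then reads as `None` instead of keeping a cycle alive. A self-contained sketch repeating the `Node` class from above:

```python
import weakref

class Node:
    def __init__(self):
        self._other = None

    @property
    def other(self):
        # A dead weakref calls back as None, so the property degrades safely
        return self._other() if self._other else None

    @other.setter
    def other(self, value):
        self._other = weakref.ref(value)  # weak back-reference, no cycle

a, b = Node(), Node()
a.other = b           # does not keep b alive
assert a.other is b
del b                 # last strong reference gone; b is freed immediately (CPython)
print(a.other)        # None — the weakref is dead, nothing is retained
```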
3. Django queryset caching
```python
# LEAK in a background task: the queryset evaluates and caches all rows
def process_all_users():
    users = User.objects.all()  # 1M rows loaded into memory
    for user in users:
        send_email(user)

# FIX: use iterator() to avoid the queryset cache
def process_all_users():
    for user in User.objects.all().iterator(chunk_size=1000):
        send_email(user)
```
Java / JVM
Heap dump
```bash
# Trigger a heap dump automatically on OOM
java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heapdump.hprof -jar app.jar

# Trigger manually (find the PID first)
jmap -dump:live,format=b,file=/tmp/heapdump.hprof <PID>
```
Analyze with Eclipse Memory Analyzer (MAT) or VisualVM:
- “Leak Suspects Report” in MAT identifies the most likely leak
- Look for objects with many retained instances
jstat — GC monitoring
```bash
# Monitor GC every 1 second
jstat -gcutil <PID> 1000
# Output columns: S0% S1% E% O% M% YGC YGCT FGC FGCT GCT
# O% = Old Gen usage — if this grows continuously, there's a leak
```
Common Java leak patterns
1. Static collections
```java
// LEAK: static map grows forever
public class Cache {
    private static final Map<String, Object> store = new HashMap<>();

    public static void put(String key, Object value) {
        store.put(key, value); // never evicted
    }
}

// FIX: use Caffeine or Guava cache with eviction
import java.time.Duration;
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

Cache<String, Object> cache = Caffeine.newBuilder()
    .maximumSize(10_000)
    .expireAfterWrite(Duration.ofMinutes(5))
    .build();
```
2. Unclosed resources
```java
// LEAK: connection never closed if an exception is thrown
public void query() throws SQLException {
    Connection conn = dataSource.getConnection();
    Statement stmt = conn.createStatement();
    ResultSet rs = stmt.executeQuery("SELECT ...");
    // Exception here → conn never closed → connection leak
}

// FIX: try-with-resources
public void query() throws SQLException {
    try (Connection conn = dataSource.getConnection();
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT ...")) {
        // auto-closed even on exception
    }
}
```
3. ThreadLocal not cleaned up
```java
// LEAK in thread pools: a ThreadLocal value survives thread reuse
private static final ThreadLocal<LargeObject> holder = new ThreadLocal<>();

// FIX: always remove in finally
LargeObject obj = new LargeObject();
holder.set(obj);
try {
    processRequest();
} finally {
    holder.remove(); // critical in thread pool environments
}
```
Container / Kubernetes Context
```bash
# Watch memory usage of pods
kubectl top pods --sort-by=memory

# Check OOMKilled history
kubectl get events --field-selector reason=OOMKilling

# If a container is OOMKilled, increase the limit and check for a leak
kubectl describe pod <pod-name>
# Look for: OOMKilled, Last State exit code 137
```
Set a memory limit in Kubernetes — this forces a crash (which you’ll notice) instead of unbounded growth:
```yaml
resources:
  requests:
    memory: "256Mi"
  limits:
    memory: "512Mi"
```
Profiling Tools Summary
| Language | Tool | Use for |
|---|---|---|
| Node.js | Chrome DevTools / --inspect | Heap snapshots, allocation timeline |
| Node.js | Clinic.js | Production-safe heap profiling |
| Node.js | process.memoryUsage() | Continuous monitoring |
| Python | tracemalloc | Allocation by file/line |
| Python | memory_profiler | Line-by-line memory usage |
| Python | objgraph | Object count growth |
| Java | jmap + MAT | Heap dump analysis |
| Java | jstat -gcutil | GC health monitoring |
| All | Prometheus + Grafana | Long-term memory trend |
Key Takeaways
- Confirm first — monitor memory over time before hunting
- Snapshot comparison — take heap snapshots before/after suspected scenario
- Common culprits: event listeners, unbounded caches, closures, static collections, unclosed resources
- Use LRU/TTL caches — every in-memory cache needs an eviction policy
- try-with-resources / `finally` cleanup — don't trust GC for external resources
- Container limits — set memory limits in Kubernetes to catch leaks fast
Memory leaks are almost always about forgotten references. Once you can see which objects are accumulating in a heap snapshot, tracing back to where they’re being held is usually straightforward.