Matplotlib Basics | Complete Guide to Data Visualization in Python
Key points of this article
A hands-on Matplotlib guide: line, bar, histogram, and scatter plots, multi-panel figures, export settings, and workflow tips for clean charts.
Introduction
“Turn data into pictures”
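Matplotlib is the de facto standard plotting library for Python. It is a third-party package, not part of the Python standard library, so it has to be installed separately.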
1. Matplotlib basics
Installation
pip install matplotlib
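To confirm that the installation worked, a quick check along these lines can help. It is only a sketch: it prints whatever version and backend happen to be active in your environment, so no particular output is implied.

```python
import matplotlib

# Version string of the installed package, e.g. to check that a style
# name or keyword argument used later is supported by this release.
installed_version = matplotlib.__version__
print("Matplotlib version:", installed_version)

# The active backend decides whether plt.show() opens a window
# (interactive backends) or figures are only written to files (e.g. Agg).
active_backend = matplotlib.get_backend()
print("Backend:", active_backend)
```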
First plot
import matplotlib.pyplot as plt
# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
# Plot
plt.plot(x, y)
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Line plot')
plt.show()
2. Line plots
Basic line plot
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = np.cos(x)
plt.plot(x, y1, label='sin(x)', color='blue', linestyle='-')
plt.plot(x, y2, label='cos(x)', color='red', linestyle='--')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Trigonometric functions')
plt.legend()
plt.grid(True)
plt.show()
3. Bar charts
Vertical bars
import matplotlib.pyplot as plt
categories = ['A', 'B', 'C', 'D']
values = [25, 40, 30, 55]
plt.bar(categories, values, color='skyblue')
plt.xlabel('Category')
plt.ylabel('Value')
plt.title('Bar chart')
plt.show()
Horizontal bars
plt.barh(categories, values, color='lightgreen')
plt.xlabel('Value')
plt.ylabel('Category')
plt.title('Horizontal bar chart')
plt.show()
4. Histogram
import numpy as np
import matplotlib.pyplot as plt
# 1000 draws from a standard normal distribution
data = np.random.randn(1000)
plt.hist(data, bins=30, color='purple', alpha=0.7, edgecolor='black')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram')
plt.show()
5. Scatter plot
import numpy as np
import matplotlib.pyplot as plt
x = np.random.rand(50)
y = np.random.rand(50)
colors = np.random.rand(50)
sizes = 1000 * np.random.rand(50)
plt.scatter(x, y, c=colors, s=sizes, alpha=0.5, cmap='viridis')
plt.colorbar()
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Scatter plot')
plt.show()
6. Multiple subplots
import numpy as np
import matplotlib.pyplot as plt
fig, axes = plt.subplots(2, 2, figsize=(10, 8))
# Top-left
axes[0, 0].plot([1, 2, 3], [1, 4, 9])
axes[0, 0].set_title('Line')
# Top-right
axes[0, 1].bar(['A', 'B', 'C'], [3, 7, 5])
axes[0, 1].set_title('Bar')
# Bottom-left
axes[1, 0].hist(np.random.randn(100), bins=20)
axes[1, 0].set_title('Histogram')
# Bottom-right
axes[1, 1].scatter(np.random.rand(50), np.random.rand(50))
axes[1, 1].set_title('Scatter')
plt.tight_layout()
plt.show()
7. Practical example
Sales visualization
import matplotlib.pyplot as plt
import pandas as pd
# Data
sales_data = pd.DataFrame({
    'month': ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'],
    'sales': [150, 180, 165, 220, 250, 240],
    'profit': [30, 45, 35, 60, 75, 70]
})
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Revenue trend
ax1.plot(sales_data['month'], sales_data['sales'],
         marker='o', linewidth=2, markersize=8)
ax1.set_title('Monthly sales trend', fontsize=14, fontweight='bold')
ax1.set_xlabel('Month')
ax1.set_ylabel('Sales (10,000 KRW)')
ax1.grid(True, alpha=0.3)
# Profit bars
ax2.bar(sales_data['month'], sales_data['profit'],
        color='green', alpha=0.7)
ax2.set_title('Monthly profit', fontsize=14, fontweight='bold')
ax2.set_xlabel('Month')
ax2.set_ylabel('Profit (10,000 KRW)')
plt.tight_layout()
plt.savefig('sales_report.png', dpi=300)
plt.show()
Practical tips
Styling
# Built-in style (the 'seaborn-v0_8' name requires Matplotlib 3.6 or newer)
plt.style.use('seaborn-v0_8')
# Font for CJK labels (OS-specific; example: Windows)
plt.rcParams['font.family'] = 'Malgun Gothic'
plt.rcParams['axes.unicode_minus'] = False
# Figure size
plt.figure(figsize=(10, 6))
# Color palette
colors = ['#FF6B6B', '#4ECDC4', '#45B7D1']
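Because the font setting above is OS-specific, one option is to pick the first installed font from a list of candidates instead of hard-coding a single name. The sketch below assumes the candidate names shown (they are illustrative guesses for Windows, macOS, and Linux, not a definitive list) and leaves the defaults untouched when none of them is installed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib import font_manager

# Hypothetical candidate list; adjust to the fonts you expect on your systems.
candidates = ["Malgun Gothic", "AppleGothic", "NanumGothic", "Noto Sans CJK KR"]

# Family names of all fonts Matplotlib has discovered on this machine.
installed = {entry.name for entry in font_manager.fontManager.ttflist}

# First candidate that is actually installed, or None if there is no match.
chosen = next((name for name in candidates if name in installed), None)

if chosen is not None:
    plt.rcParams["font.family"] = chosen
    # Render the minus sign as an ASCII hyphen, which CJK fonts usually include.
    plt.rcParams["axes.unicode_minus"] = False

print("Selected font:", chosen)
```

If `chosen` comes back as `None`, CJK labels will likely render as empty boxes until a suitable font is installed.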
Going deeper
Scatter with regression line and residual histogram (runnable)
Synthetic data generated with numpy, a linear fit via polyfit, and a residual histogram. The figure is saved with savefig for use in reports.
import numpy as np
import matplotlib.pyplot as plt
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 80)
y = 2.5 * x + 1.0 + rng.normal(0, 1.8, size=x.shape)
coef = np.polyfit(x, y, 1)
y_hat = np.poly1d(coef)(x)
residuals = y - y_hat
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(11, 4), constrained_layout=True)
ax1.scatter(x, y, alpha=0.7, label="Observed")
ax1.plot(x, y_hat, color="crimson", linewidth=2, label="Linear fit")
ax1.set_title("Scatter with regression line")
ax1.legend()
ax1.grid(True, alpha=0.3)
ax2.hist(residuals, bins=18, color="steelblue", edgecolor="black", alpha=0.85)
ax2.set_title("Residual distribution")
ax2.grid(True, alpha=0.3)
fig.savefig("regression_residuals.png", dpi=200)
plt.show()
Common mistakes
- Calling savefig after show() yields empty files in some backends, because the figure may already be closed when show() returns. Save first, then show.
- Missing fonts for non-Latin labels. Configure the font per OS.
- Mixing OO API and pyplot state so artists land on the wrong axes.
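The first and last points can be illustrated with a small sketch. It draws through explicit Axes objects rather than pyplot's implicit "current axes", and it saves before showing. The file name and the temporary directory are assumptions made for the example.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))

# Call methods on each Axes directly, so every artist lands on the
# intended panel regardless of which axes pyplot considers "current".
ax_left.plot([1, 2, 3], [1, 4, 9])
ax_left.set_title("Line")
ax_right.bar(["A", "B", "C"], [3, 7, 5])
ax_right.set_title("Bar")

# Save first: the figure is guaranteed to still exist at this point.
out_path = os.path.join(tempfile.gettempdir(), "order_demo.png")  # hypothetical name
fig.savefig(out_path, dpi=150)

# With an interactive backend, plt.show() would go here, after savefig.
plt.close(fig)

saved_bytes = os.path.getsize(out_path)
print("Bytes written:", saved_bytes)
```

Using `fig.savefig` rather than `plt.savefig` also removes any doubt about which figure is being written.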
Caveats
- Journals often prefer vector formats (PDF/SVG); for raster, set dpi explicitly.
- Consider colorblind-friendly palettes (cividis, etc.).
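Both caveats can be sketched as follows: the same figure is written as SVG and PDF (vector) and as PNG with an explicit dpi (raster), using the cividis colormap. The random data, file names, and temporary output directory are assumptions for illustration only.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(50)
y = rng.random(50)
values = rng.random(50)

fig, ax = plt.subplots(figsize=(5, 4))
# cividis is a perceptually uniform colormap designed with
# color vision deficiency in mind.
points = ax.scatter(x, y, c=values, cmap="cividis")
fig.colorbar(points, ax=ax, label="Value")

out_dir = tempfile.mkdtemp()
svg_path = os.path.join(out_dir, "scatter.svg")
pdf_path = os.path.join(out_dir, "scatter.pdf")
png_path = os.path.join(out_dir, "scatter.png")

# The output format is inferred from the file extension.
fig.savefig(svg_path)           # vector
fig.savefig(pdf_path)           # vector
fig.savefig(png_path, dpi=300)  # raster, resolution set explicitly
plt.close(fig)
```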
In production
- Share styles via matplotlibrc or plt.style.context.
- For batch reports, call plt.close(fig) to free memory.
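A minimal sketch of a batch job that combines the two points: the style is applied only inside a `plt.style.context` block, and each figure is closed as soon as it is saved. The `reports` dictionary, the "ggplot" style choice, and the output location are hypothetical.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, typical for batch jobs
import matplotlib.pyplot as plt

# Hypothetical per-region series standing in for real report data.
reports = {"north": [3, 5, 4], "south": [2, 6, 7]}
out_dir = tempfile.mkdtemp()
saved = []

# The style applies only inside the with-block; global rcParams are
# restored on exit.
with plt.style.context("ggplot"):
    for name, series in reports.items():
        fig, ax = plt.subplots(figsize=(4, 3))
        ax.plot(series, marker="o")
        ax.set_title(name)
        path = os.path.join(out_dir, f"{name}.png")
        fig.savefig(path, dpi=100)
        # Release the figure right away, so memory does not grow
        # with the number of reports.
        plt.close(fig)
        saved.append(path)

open_figures = plt.get_fignums()
print("Figures still open:", open_figures)
```

Without the `plt.close(fig)` call, pyplot keeps a reference to every figure it creates, which adds up quickly in long-running loops.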
Alternatives
| Tool | Best for |
|---|---|
| Matplotlib | Fine control, papers, non-interactive backends |
| Seaborn | Quick statistical plots |
| Plotly | Interactive web charts |
Summary
Key takeaways
- Matplotlib: core Python plotting library
- pyplot: convenient stateful API
- Chart types: line, bar, histogram, scatter
- Subplots: grid of axes
- Style: colors, fonts, layout