NumPy Basics | Complete Guide to Numerical Computing in Python
Key points of this article
A practical guide to NumPy: Python’s foundation for fast numerical computing—arrays, broadcasting, stats, and linear algebra with runnable examples.
Introduction
“The foundation of numerical Python”
NumPy is Python’s core library for high-performance numerical computing.
1. NumPy basics
Installation
pip install numpy
Creating arrays
import numpy as np
# From a list
arr = np.array([1, 2, 3, 4, 5])
print(arr) # [1 2 3 4 5]
# 2D array
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2d)
# [[1 2 3]
# [4 5 6]]
# Special arrays
zeros = np.zeros((3, 4)) # filled with 0
ones = np.ones((2, 3)) # filled with 1
empty = np.empty((2, 2)) # uninitialized
arange = np.arange(0, 10, 2) # [0, 2, 4, 6, 8]
linspace = np.linspace(0, 1, 5) # [0, 0.25, 0.5, 0.75, 1]
2. Array operations
Vectorized operations
arr = np.array([1, 2, 3, 4, 5])
# Scalar ops
print(arr + 10) # [11 12 13 14 15]
print(arr * 2) # [2 4 6 8 10]
print(arr ** 2) # [1 4 9 16 25]
# Array vs array
arr2 = np.array([10, 20, 30, 40, 50])
print(arr + arr2) # [11 22 33 44 55]
print(arr * arr2) # [10 40 90 160 250]
Broadcasting
# 2D + scalar
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr + 10)
# [[11 12 13]
# [14 15 16]]
# Matrix + row vector
matrix = np.array([[1, 2, 3], [4, 5, 6]])
vector = np.array([10, 20, 30])
print(matrix + vector)
# [[11 22 33]
# [14 25 36]]
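The row-vector case works because NumPy compares shapes from the trailing dimension backwards and stretches any axis of length 1. A minimal sketch of the column-vector case and of a shape mismatch follows; the names `col`, `col_sum`, and `broadcast_failed` are illustrative and not part of the original example.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
col = np.array([[100], [200]])             # shape (2, 1)

# (2, 3) + (2, 1): the length-1 axis of col is stretched across the columns
col_sum = matrix + col
print(col_sum)
# [[101 102 103]
#  [204 205 206]]

# A plain length-2 vector does not broadcast against (2, 3):
# trailing dimensions are compared first, and 3 != 2.
try:
    matrix + np.array([100, 200])
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)  # True
```

Reshaping a 1D vector with `vector[:, np.newaxis]` is one common way to turn it into a column before broadcasting.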
3. Indexing
Basic indexing
arr = np.array([1, 2, 3, 4, 5])
print(arr[0]) # 1
print(arr[-1]) # 5
print(arr[1:4]) # [2 3 4]
# 2D
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[0, 0]) # 1
print(arr2d[1, :]) # [4 5 6] (full row 1)
print(arr2d[:, 1]) # [2 5 8] (full column 1)
Boolean indexing
arr = np.array([1, 2, 3, 4, 5])
# Conditional filter
mask = arr > 3
print(mask) # [False False False True True]
print(arr[mask]) # [4 5]
# One line
print(arr[arr > 3]) # [4 5]
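Masks can also be combined and used for assignment. The sketch below assumes the same `arr` as above; `middle`, `capped`, and `arr_zeroed` are names introduced here for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Combine conditions with & (and) or | (or); each comparison needs parentheses
middle = arr[(arr > 1) & (arr < 5)]
print(middle)  # [2 3 4]

# np.where picks element-wise between two alternatives
capped = np.where(arr > 3, 3, arr)
print(capped)  # [1 2 3 3 3]

# A boolean mask also works on the left side of an assignment
arr_zeroed = arr.copy()
arr_zeroed[arr_zeroed % 2 == 0] = 0
print(arr_zeroed)  # [1 0 3 0 5]
```

Python's `and` / `or` keywords do not work element-wise on arrays, which is why the bitwise operators are used here.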
4. Reshaping arrays
reshape
arr = np.arange(12)
print(arr) # [0 1 2 3 4 5 6 7 8 9 10 11]
# 3×4 matrix
matrix = arr.reshape(3, 4)
print(matrix)
# [[ 0 1 2 3]
# [ 4 5 6 7]
# [ 8 9 10 11]]
# Flatten
flat = matrix.flatten()
print(flat) # [0 1 2 3 4 5 6 7 8 9 10 11]
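Two details are easy to miss when reshaping: `-1` lets NumPy infer one dimension, and `flatten()` always copies while `ravel()` returns a view when the memory layout allows it. A small sketch, with `flat_copy` and `flat_view` as illustrative names:

```python
import numpy as np

arr = np.arange(12)

# -1 asks NumPy to infer that dimension from the total size
matrix = arr.reshape(-1, 4)
print(matrix.shape)  # (3, 4)

flat_copy = matrix.flatten()  # always an independent copy
flat_view = matrix.ravel()    # a view here, because matrix is contiguous

print(np.shares_memory(matrix, flat_copy))  # False
print(np.shares_memory(matrix, flat_view))  # True

# Writing through the view changes the original data
flat_view[0] = 99
print(arr[0])  # 99
```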
5. Statistical functions
Basic statistics
arr = np.array([1, 2, 3, 4, 5])
print(np.sum(arr)) # 15
print(np.mean(arr)) # 3.0
print(np.std(arr)) # ~1.414 (standard deviation)
print(np.min(arr)) # 1
print(np.max(arr)) # 5
# Axis reductions
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(np.sum(arr2d, axis=0)) # [5 7 9] (column sums)
print(np.sum(arr2d, axis=1)) # [6 15] (row sums)
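Note that `np.std` defaults to the population formula (`ddof=0`), and that `keepdims=True` keeps the reduced axis so the result broadcasts back against the input. The following is a sketch with illustrative names (`row_means`, `centered`):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
pop_std = np.std(arr)             # population std, about 1.414
sample_std = np.std(arr, ddof=1)  # sample std, about 1.581
print(round(pop_std, 3), round(sample_std, 3))

arr2d = np.array([[1, 2, 3], [4, 5, 6]])

# keepdims=True leaves a length-1 axis, giving shape (2, 1) instead of (2,)
row_means = np.mean(arr2d, axis=1, keepdims=True)
print(row_means.shape)  # (2, 1)

# Subtract each row's mean from that row via broadcasting
centered = arr2d - row_means
print(centered)
# [[-1.  0.  1.]
#  [-1.  0.  1.]]
```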
6. Linear algebra
Matrix operations
# Matrix multiply
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(np.dot(A, B))
# [[19 22]
# [43 50]]
print(A @ B) # same (Python 3.5+)
# Transpose
print(A.T)
# [[1 3]
# [2 4]]
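# Inverse
inv_A = np.linalg.inv(A)
print(inv_A)
# approximately:
# [[-2. 1. ]
# [ 1.5 -0.5]]
# Eigenvalues (the eigenvectors are the columns of the second result)
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues) # approximately -0.372 and 5.372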
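For solving a linear system, `np.linalg.solve` is usually preferable to forming the inverse explicitly. The sketch below uses the same `A` with a made-up right-hand side `b`, and also checks one eigenpair against its definition.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
b = np.array([5, 11])  # illustrative right-hand side

# Solve A @ x = b without computing the inverse
x = np.linalg.solve(A, b)
print(np.allclose(x, [1, 2]))  # True
print(np.allclose(A @ x, b))   # True

# Each column of eigenvectors pairs with the eigenvalue at the same index
eigenvalues, eigenvectors = np.linalg.eig(A)
v = eigenvectors[:, 0]
print(np.allclose(A @ v, eigenvalues[0] * v))  # True
```

Comparing with `np.allclose` rather than `==` avoids false negatives from floating-point rounding.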
7. Practical example
Image-style array processing
import numpy as np
# Synthetic image tensor (H, W, C)
image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
# Grayscale (mean across channels)
gray = np.mean(image, axis=2).astype(np.uint8)
# Brightness (widen to int16 first; adding in uint8 would wrap around before the clip)
bright = np.clip(image.astype(np.int16) + 50, 0, 255).astype(np.uint8)
# Shape
print(f"Shape: {image.shape}") # Shape: (100, 100, 3)
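When brightening `uint8` pixels, it is worth checking how the addition behaves near 255. The sketch below uses four toy pixel values (not from the example above) to compare adding directly in `uint8` with widening to `int16` first; `wrapped` and `safe` are illustrative names.

```python
import numpy as np

pixels = np.array([0, 100, 220, 250], dtype=np.uint8)

# Adding in uint8 wraps modulo 256, so bright pixels turn dark
wrapped = pixels + 50
print(wrapped)  # [ 50 150  14  44]

# Widen, add, clip to the valid range, then narrow again
safe = np.clip(pixels.astype(np.int16) + 50, 0, 255).astype(np.uint8)
print(safe)  # [ 50 150 255 255]
```

Clipping after a `uint8` addition has no effect, because the wrapped values are already inside 0 to 255.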
Practical tips
NumPy performance
# ✅ Prefer vectorization
arr = np.arange(1000000)
result = arr ** 2 # fast
# ❌ Python loops
result = [x ** 2 for x in arr] # slow
# ✅ dtype matters
arr = np.array([1, 2, 3], dtype=np.int32)
# ✅ Copy vs view
arr_copy = arr.copy() # independent copy
arr_view = arr[:] # view (shares memory)
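The copy-versus-view distinction can be checked directly with `np.shares_memory`. A minimal sketch, using illustrative names (`view`, `copy`, `fancy`):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
view = arr[1:4]         # basic slicing returns a view
copy = arr[1:4].copy()  # explicit, independent copy
fancy = arr[[1, 2, 3]]  # integer-array indexing returns a copy

# Writing through the view changes arr, but not the copies
view[0] = 99
print(arr)    # [ 1 99  3  4  5]
print(copy)   # [2 3 4]
print(fancy)  # [2 3 4]

print(np.shares_memory(arr, view))   # True
print(np.shares_memory(arr, fancy))  # False
```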
Going deeper
Batch L2 normalization (runnable)
Stack vectors as rows, then L2-normalize each row in a vectorized way—a common deep-learning preprocessing pattern.
import numpy as np
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
norms = np.linalg.norm(X, axis=1, keepdims=True)
Xn = X / np.clip(norms, 1e-12, None)
print(np.linalg.norm(Xn, axis=1))
# All norms ≈ 1
Common mistakes
- Misusing axis and summing/averaging along the wrong dimension.
- Confusing views and copies, mutating data unintentionally.
- Integer overflow when accumulating with a narrow integer dtype.
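Two of these mistakes are easy to reproduce on toy data. The sketch below uses a made-up `scores` table (rows as students, columns as exams) and a tiny `int8` array; all names and values are illustrative.

```python
import numpy as np

scores = np.array([[90, 80, 70],
                   [60, 50, 40]])  # 2 students (rows) x 3 exams (columns)

# axis names the dimension that is collapsed, not the one that is kept
per_student = scores.mean(axis=1)  # collapses columns: one value per row
per_exam = scores.mean(axis=0)     # collapses rows: one value per column
print(per_student)  # [80. 50.]
print(per_exam)     # [75. 65. 55.]

small = np.array([100, 100, 100], dtype=np.int8)
wide_total = small.sum()                 # default accumulator is a wider integer
narrow_total = small.sum(dtype=np.int8)  # forced int8 accumulator wraps around
print(wide_total)    # 300
print(narrow_total)  # 44
```

Checking the shape of a reduction result against what you expect is a quick way to catch a wrong `axis`.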
Caveats
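- Floating-point addition is not associative, so the rounding error of a sum depends on the order of operations and tends to grow with the number of terms; for large sums, accumulate in float64 or use a math.fsum-style compensated sum.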
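The effect is visible even with a handful of values. A minimal sketch using plain Python floats and a small `float32` array (`tenths` and `data32` are illustrative names):

```python
import math
import numpy as np

# Grouping changes the rounded result
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

tenths = [0.1] * 10
print(sum(tenths) == 1.0)        # False: rounding error accumulates
print(math.fsum(tenths) == 1.0)  # True: fsum compensates for lost low-order bits

# For float32 data, requesting a float64 accumulator is a cheap safeguard
data32 = np.full(1000, 0.1, dtype=np.float32)
total = data32.sum(dtype=np.float64)
print(total.dtype)  # float64
```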
In production
- Reuse buffers with out= when memory is tight.
- When mixing with Pandas, remember that .values returns a bare array: arithmetic on it is positional, whereas Series arithmetic aligns on the index.
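As a sketch of the buffer-reuse point, NumPy ufuncs such as `np.add` and `np.multiply` accept an `out=` argument that writes the result into an existing array. The array names below are illustrative.

```python
import numpy as np

a = np.arange(5, dtype=np.float64)  # [0. 1. 2. 3. 4.]
b = np.ones(5)
buf = np.empty_like(a)  # preallocated once, reused for every step

np.add(a, b, out=buf)         # writes a + b into buf, no new array
np.multiply(buf, 2, out=buf)  # doubles buf in place
print(buf)  # [ 2.  4.  6.  8. 10.]
```

The output buffer needs a shape and dtype compatible with the result, so this pattern fits best in loops where array sizes stay fixed.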
Alternatives
| Library | Role |
|---|---|
| NumPy | Array math, BLAS/LAPACK |
| Numba | JIT for hot loops |
| JAX / PyTorch | Autodiff, GPU |
Summary
Key takeaways
- NumPy: fast numerical computing
- ndarray: N-dimensional array
- Vectorization: avoid Python loops
- Broadcasting: align shapes automatically
- Linear algebra: matrix multiply, eigenvalues