NumPy Basics | Complete Guide to Numerical Computing in Python

Key points of this article

A practical guide to NumPy, Python’s foundation for fast numerical computing: arrays, broadcasting, statistics, and linear algebra, with runnable examples.

Introduction

“The foundation of numerical Python”

NumPy is Python’s core library for high-performance numerical computing.


1. NumPy basics

Installation

pip install numpy

Creating arrays

import numpy as np

# From a list
arr = np.array([1, 2, 3, 4, 5])
print(arr)  # [1 2 3 4 5]

# 2D array
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2d)
# [[1 2 3]
#  [4 5 6]]

# Special arrays
zeros = np.zeros((3, 4))  # filled with 0
ones = np.ones((2, 3))    # filled with 1
empty = np.empty((2, 2))  # uninitialized
arange = np.arange(0, 10, 2)  # [0, 2, 4, 6, 8]
linspace = np.linspace(0, 1, 5)  # [0, 0.25, 0.5, 0.75, 1]

2. Array operations

Vectorized operations

arr = np.array([1, 2, 3, 4, 5])

# Scalar ops
print(arr + 10)  # [11 12 13 14 15]
print(arr * 2)   # [2 4 6 8 10]
print(arr ** 2)  # [1 4 9 16 25]

# Array vs array
arr2 = np.array([10, 20, 30, 40, 50])
print(arr + arr2)  # [11 22 33 44 55]
print(arr * arr2)  # [10 40 90 160 250]

Broadcasting

# 2D + scalar
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr + 10)
# [[11 12 13]
#  [14 15 16]]

# Matrix + row vector
matrix = np.array([[1, 2, 3], [4, 5, 6]])
vector = np.array([10, 20, 30])
print(matrix + vector)
# [[11 22 33]
#  [14 25 36]]
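
Broadcasting compares shapes from the last dimension backwards; two dimensions are compatible when they are equal or when one of them is 1. The sketch below shows the column-vector case and one incompatible pair. The names `col`, `bad`, and `result` are illustrative and not part of the examples above.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
col = np.array([[100], [200]])               # shape (2, 1)

# (2, 3) + (2, 1): the size-1 axis of col is stretched across the columns
result = matrix + col
print(result)
# [[101 102 103]
#  [204 205 206]]

# (2, 3) and (2,) do not broadcast: the trailing dimensions are 3 and 2
bad = np.array([100, 200])
try:
    matrix + bad
except ValueError as exc:
    print("broadcast failed:", exc)
```

If a 1D array is meant to act as a column, reshaping it with `bad[:, np.newaxis]` gives it shape (2, 1) so that it broadcasts like `col`.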

3. Indexing

Basic indexing

arr = np.array([1, 2, 3, 4, 5])

print(arr[0])   # 1
print(arr[-1])  # 5
print(arr[1:4]) # [2 3 4]

# 2D
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[0, 0])  # 1
print(arr2d[1, :])  # [4 5 6] (full row 1)
print(arr2d[:, 1])  # [2 5 8] (full column 1)

Boolean indexing

arr = np.array([1, 2, 3, 4, 5])

# Conditional filter
mask = arr > 3
print(mask)  # [False False False  True  True]
print(arr[mask])  # [4 5]

# One line
print(arr[arr > 3])  # [4 5]
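
Masks can also be combined and used on the left-hand side of an assignment. A short sketch, with `middle` and `capped` as illustrative names:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Combine conditions with & (and) or | (or); each condition needs its own parentheses
middle = arr[(arr > 1) & (arr < 5)]
print(middle)  # [2 3 4]

# Assign through a mask: cap every value above 3 at 3
capped = arr.copy()
capped[capped > 3] = 3
print(capped)  # [1 2 3 3 3]
```

Python's `and` / `or` keywords do not work element-wise on arrays, which is why the bitwise operators are used here.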

4. Reshaping arrays

reshape

arr = np.arange(12)
print(arr)  # [0 1 2 3 4 5 6 7 8 9 10 11]

# 3×4 matrix
matrix = arr.reshape(3, 4)
print(matrix)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# Flatten
flat = matrix.flatten()
print(flat)  # [0 1 2 3 4 5 6 7 8 9 10 11]
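
Two details of reshaping are worth a quick sketch: `-1` asks NumPy to infer one dimension, and `flatten()` always copies while `ravel()` returns a view when the memory layout allows it. The names `m`, `flat_copy`, and `flat_view` are illustrative.

```python
import numpy as np

arr = np.arange(12)

# -1 lets NumPy infer the missing dimension (12 / 3 = 4)
m = arr.reshape(3, -1)
print(m.shape)  # (3, 4)

flat_copy = m.flatten()  # always a new array
flat_view = m.ravel()    # a view here, because m is contiguous

print(np.shares_memory(m, flat_copy))  # False
print(np.shares_memory(m, flat_view))  # True
```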

5. Statistical functions

Basic statistics

arr = np.array([1, 2, 3, 4, 5])

print(np.sum(arr))   # 15
print(np.mean(arr))  # 3.0
print(np.std(arr))   # ~1.414 (standard deviation)
print(np.min(arr))   # 1
print(np.max(arr))   # 5

# Axis reductions
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(np.sum(arr2d, axis=0))  # [5 7 9] (column sums)
print(np.sum(arr2d, axis=1))  # [6 15] (row sums)
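
The `axis` argument names the dimension that is collapsed by the reduction. When the result will be broadcast back against the original array, `keepdims=True` keeps that dimension with size 1. A minimal sketch, with illustrative variable names:

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6]])

col_means = arr2d.mean(axis=0)                 # collapses rows, shape (3,)
row_means = arr2d.mean(axis=1, keepdims=True)  # collapses columns, shape (2, 1)

print(col_means)        # [2.5 3.5 4.5]
print(row_means.shape)  # (2, 1)

# (2, 3) - (2, 1) broadcasts, so each row is centered on its own mean
centered_rows = arr2d - row_means
print(centered_rows)
# [[-1.  0.  1.]
#  [-1.  0.  1.]]
```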

6. Linear algebra

Matrix operations

# Matrix multiply
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print(np.dot(A, B))
# [[19 22]
#  [43 50]]

print(A @ B)  # same (Python 3.5+)

# Transpose
print(A.T)
# [[1 3]
#  [2 4]]

# Inverse
inv_A = np.linalg.inv(A)
print(inv_A)

# Eigenvalues
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)
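
The inverse and the eigendecomposition can be sanity-checked numerically. The sketch below uses `np.allclose` because floating-point results are rarely exact; the variable `checks` is an illustrative name.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

# A times its inverse should be (approximately) the identity matrix
inv_A = np.linalg.inv(A)
print(np.allclose(A @ inv_A, np.eye(2)))  # True

# Column i of eigenvectors pairs with eigenvalues[i]: A v = lambda v
eigenvalues, eigenvectors = np.linalg.eig(A)
checks = [
    np.allclose(A @ eigenvectors[:, i], eigenvalues[i] * eigenvectors[:, i])
    for i in range(len(eigenvalues))
]
print(checks)  # [True, True]
```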

7. Practical example

Image-style array processing

import numpy as np

# Synthetic image tensor (H, W, C)
image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

# Grayscale (mean across channels)
gray = np.mean(image, axis=2).astype(np.uint8)

# Brightness: widen to int16 first, because uint8 + 50 wraps around before clip can act
bright = np.clip(image.astype(np.int16) + 50, 0, 255).astype(np.uint8)

# Shape
print(f"Shape: {image.shape}")  # (100, 100, 3)

Practical tips

NumPy performance

# ✅ Prefer vectorization
arr = np.arange(1000000)
result = arr ** 2  # fast

# ❌ Python loops
result = [x ** 2 for x in arr]  # slow

# ✅ dtype matters
arr = np.array([1, 2, 3], dtype=np.int32)

# ✅ Copy vs view
arr_copy = arr.copy()  # independent copy
arr_view = arr[:]      # view (shares memory)
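
The difference between a copy and a view shows up as soon as one of them is modified. A minimal sketch using `np.shares_memory` to confirm which arrays share a buffer:

```python
import numpy as np

arr = np.array([1, 2, 3], dtype=np.int32)
arr_copy = arr.copy()  # independent copy
arr_view = arr[:]      # view on the same buffer

arr_view[0] = 99       # writes through to arr, not to arr_copy

print(arr)       # [99  2  3]
print(arr_copy)  # [1 2 3]
print(np.shares_memory(arr, arr_view))  # True
print(np.shares_memory(arr, arr_copy))  # False
```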

Going deeper

Batch L2 normalization (runnable)

Stack vectors as rows, then L2-normalize each row in a vectorized way. This is a common deep-learning preprocessing pattern.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
norms = np.linalg.norm(X, axis=1, keepdims=True)
Xn = X / np.clip(norms, 1e-12, None)
print(np.linalg.norm(Xn, axis=1))
# All norms ≈ 1

Common mistakes

  • Misusing axis and summing/averaging along the wrong dimension.
  • Confusing views and copies, mutating data unintentionally.
  • Integer overflow when accumulating with a narrow integer dtype.
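
The integer overflow case is easy to miss because NumPy array arithmetic wraps silently. A small sketch with an intentionally narrow dtype (`int8`, range -128 to 127); the array `counts` is made up for illustration:

```python
import numpy as np

counts = np.array([100, 100, 100], dtype=np.int8)

# Element-wise arithmetic keeps the int8 dtype, so 200 wraps to -56
doubled = counts + counts
print(doubled)  # [-56 -56 -56]

# The same happens when the accumulator is forced to stay narrow
total_narrow = counts.sum(dtype=np.int8)
total_wide = counts.sum(dtype=np.int64)
print(total_narrow, total_wide)  # 44 300
```

Casting to a wider dtype before the arithmetic, or passing a wider `dtype` to the reduction, avoids the wraparound.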

Caveats

  • Floating-point addition is not associative, so the result of a large sum depends on the order of accumulation; consider a float64 accumulator or math.fsum-style patterns for large sums.
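
A sketch of this caveat, assuming a float32 array of one million copies of 0.1 as test data. It compares a float32 accumulator, a float64 accumulator, and `math.fsum`, which returns a correctly rounded sum of its inputs:

```python
import math
import numpy as np

# Grouping changes the result even for three numbers
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False

values = np.full(1_000_000, 0.1, dtype=np.float32)

sum32 = values.sum()                  # accumulates in float32
sum64 = values.sum(dtype=np.float64)  # accumulates in float64
exact = math.fsum(values.tolist())    # correctly rounded reference

print(abs(float(sum32) - exact))  # error of the float32 accumulator
print(abs(float(sum64) - exact))  # error of the float64 accumulator
```

The exact error values depend on the NumPy version and summation order, so they are printed rather than stated here.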

In production

  • Reuse buffers with out= when memory is tight.
  • When mixing with Pandas, remember that Pandas aligns on index labels, while NumPy arrays taken from .values (or .to_numpy()) align purely by position.
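
Buffer reuse with `out=` can be sketched as follows. The buffer name `buf` and the two-step computation are illustrative; the point is that both ufunc calls write into memory that was allocated once.

```python
import numpy as np

a = np.arange(5, dtype=np.float64)
b = np.ones(5)

# Allocate the output buffer once, with the same shape and dtype as a
buf = np.empty_like(a)

np.add(a, b, out=buf)           # buf = a + b, no new array allocated
np.multiply(buf, 2.0, out=buf)  # buf = buf * 2, updated in place
print(buf)  # [ 2.  4.  6.  8. 10.]
```

The output buffer must have a shape and dtype compatible with the result, otherwise NumPy raises an error instead of casting silently.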

Alternatives

Library         Role
NumPy           Array math, BLAS/LAPACK
Numba           JIT for hot loops
JAX / PyTorch   Autodiff, GPU

Summary

Key takeaways

  1. NumPy: fast numerical computing
  2. ndarray: N-dimensional array
  3. Vectorization: avoid Python loops
  4. Broadcasting: align shapes automatically
  5. Linear algebra: matrix multiply, eigenvalues