NumPy Basics

Key points of this article

NumPy tutorial: arrays, vectorization, broadcasting, indexing, statistics, and linear algebra. Learn fast numerical Python with ndarray, BLAS-style ops, and practical pitfalls.

Introduction

“The foundation of numerical Python”

NumPy is Python’s core library for high-performance numerical computing.

1. NumPy basics

Installation

pip install numpy

Creating arrays

import numpy as np
# From a list
arr = np.array([1, 2, 3, 4, 5])
print(arr)  # [1 2 3 4 5]
# 2D array
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2d)
# [[1 2 3]
#  [4 5 6]]
# Special arrays
zeros = np.zeros((3, 4))  # filled with 0
ones = np.ones((2, 3))    # filled with 1
empty = np.empty((2, 2))  # uninitialized
arange = np.arange(0, 10, 2)  # [0, 2, 4, 6, 8]
linspace = np.linspace(0, 1, 5)  # [0, 0.25, 0.5, 0.75, 1]
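
The constructors above differ in the dtype and endpoints they produce, which is easy to overlook. A minimal sketch that inspects the results; the variable name `mixed` is illustrative and not part of the examples above:

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2d.shape)  # (2, 3)
print(arr2d.ndim)   # 2
print(arr2d.size)   # 6

# np.zeros and np.ones default to float64
zeros = np.zeros((3, 4))
print(zeros.dtype)  # float64

# Mixed int/float input is upcast to float64
mixed = np.array([1, 2.5])
print(mixed.dtype)  # float64

# arange excludes the stop value; linspace includes it by default
print(np.arange(0, 10, 2))   # [0 2 4 6 8]
print(np.linspace(0, 1, 5))  # [0.   0.25 0.5  0.75 1.  ]
```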

2. Array operations

Vectorized operations

arr = np.array([1, 2, 3, 4, 5])
# Scalar ops
print(arr + 10)  # [11 12 13 14 15]
print(arr * 2)   # [2 4 6 8 10]
print(arr ** 2)  # [1 4 9 16 25]
# Array vs array
arr2 = np.array([10, 20, 30, 40, 50])
print(arr + arr2)  # [11 22 33 44 55]
print(arr * arr2)  # [10 40 90 160 250]

Broadcasting

# 2D + scalar
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr + 10)
# [[11 12 13]
#  [14 15 16]]
# Matrix + row vector
matrix = np.array([[1, 2, 3], [4, 5, 6]])
vector = np.array([10, 20, 30])
print(matrix + vector)
# [[11 22 33]
#  [14 25 36]]
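
Broadcasting compares shapes starting from the trailing dimension; two dimensions are compatible when they are equal or when one of them is 1. The sketch below illustrates that rule with a column vector and a row vector, plus a shape pair that fails. The names `col`, `row`, `grid`, and `broadcast_failed` are illustrative only:

```python
import numpy as np

col = np.array([[0], [10], [20]])  # shape (3, 1)
row = np.array([1, 2, 3])          # shape (3,), treated as (1, 3)

# (3, 1) and (1, 3) broadcast to (3, 3)
grid = col + row
print(grid)
# [[ 1  2  3]
#  [11 12 13]
#  [21 22 23]]

# Trailing dimensions 3 and 2 are incompatible, so this raises ValueError
try:
    np.ones((2, 3)) + np.ones(2)
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)  # True
```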

3. Indexing

Basic indexing

arr = np.array([1, 2, 3, 4, 5])
print(arr[0])   # 1
print(arr[-1])  # 5
print(arr[1:4]) # [2 3 4]
# 2D
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[0, 0])  # 1
print(arr2d[1, :])  # [4 5 6] (full row 1)
print(arr2d[:, 1])  # [2 5 8] (full column 1)

Boolean indexing

arr = np.array([1, 2, 3, 4, 5])
# Conditional filter
mask = arr > 3
print(mask)  # [False False False True True]
print(arr[mask])  # [4 5]
# One line
print(arr[arr > 3])  # [4 5]
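
Masks can also be combined and used for assignment. A short sketch, assuming the same `arr` as above; `middle` and `capped` are illustrative names:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Combine conditions with & (and), | (or), ~ (not); the parentheses are required
middle = arr[(arr > 1) & (arr < 5)]
print(middle)  # [2 3 4]

# np.where chooses elementwise between two alternatives
capped = np.where(arr > 3, 3, arr)
print(capped)  # [1 2 3 3 3]

# Assigning through a mask modifies the array in place
arr[arr % 2 == 0] = 0
print(arr)  # [1 0 3 0 5]
```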

4. Reshaping arrays

reshape

arr = np.arange(12)
print(arr)  # [0 1 2 3 4 5 6 7 8 9 10 11]
# 3×4 matrix
matrix = arr.reshape(3, 4)
print(matrix)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
# Flatten
flat = matrix.flatten()
print(flat)  # [0 1 2 3 4 5 6 7 8 9 10 11]
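
Two details of reshaping are worth seeing in code: `-1` lets NumPy infer a dimension, and `flatten()` copies while `ravel()` returns a view when it can. A minimal sketch; the name `rav` is illustrative:

```python
import numpy as np

arr = np.arange(12)

# -1 lets NumPy infer one dimension from the total size
matrix = arr.reshape(3, -1)
print(matrix.shape)  # (3, 4)

# flatten() always returns a copy
flat = matrix.flatten()
print(np.shares_memory(flat, matrix))  # False

# ravel() returns a view when the memory layout allows it
rav = matrix.ravel()
print(np.shares_memory(rav, matrix))  # True

# Writing through the view changes the original array
rav[0] = 99
print(arr[0])  # 99
```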

5. Statistical functions

Basic statistics

arr = np.array([1, 2, 3, 4, 5])
print(np.sum(arr))   # 15
print(np.mean(arr))  # 3.0
print(np.std(arr))   # ~1.414 (population standard deviation, ddof=0)
print(np.min(arr))   # 1
print(np.max(arr))   # 5
# Axis reductions
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(np.sum(arr2d, axis=0))  # [5 7 9] (column sums)
print(np.sum(arr2d, axis=1))  # [6 15] (row sums)
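
`np.std` divides by N by default; pass `ddof=1` for the sample standard deviation. Reductions also accept `keepdims=True`, which keeps the reduced axis so the result broadcasts back against the input. A short sketch with illustrative names (`pop_std`, `sample_std`, `row_means`, `centered`):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

pop_std = np.std(arr)             # population std, divides by N
sample_std = np.std(arr, ddof=1)  # sample std, divides by N - 1
print(round(float(pop_std), 3))     # 1.414
print(round(float(sample_std), 3))  # 1.581

arr2d = np.array([[1, 2, 3], [4, 5, 6]])

# keepdims=True keeps the reduced axis as length 1
row_means = np.mean(arr2d, axis=1, keepdims=True)
print(row_means.shape)  # (2, 1)

# (2, 3) minus (2, 1) broadcasts row-wise
centered = arr2d - row_means
print(centered)
# [[-1.  0.  1.]
#  [-1.  0.  1.]]
```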

6. Linear algebra

Matrix operations

# Matrix multiply
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(np.dot(A, B))
# [[19 22]
#  [43 50]]
print(A @ B)  # same (Python 3.5+)
# Transpose
print(A.T)
# [[1 3]
#  [2 4]]
# Inverse
inv_A = np.linalg.inv(A)
print(inv_A)
# Eigenvalues
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)
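
Results from `np.linalg` are floating point, so it helps to check them with `np.allclose` instead of exact equality. The sketch below verifies the inverse and the eigenpairs of `A`, and uses `np.linalg.solve` for a linear system; the right-hand side `b` is an illustrative value, not from the example above:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
b = np.array([5, 6])  # illustrative right-hand side

# A @ inv(A) should be the identity, up to floating-point rounding
inv_A = np.linalg.inv(A)
print(np.allclose(A @ inv_A, np.eye(2)))  # True

# For A x = b, np.linalg.solve avoids forming the inverse explicitly
x = np.linalg.solve(A, b)
print(np.allclose(A @ x, b))  # True

# Each eigenpair satisfies A v = lambda v (eigenvectors are the columns)
eigenvalues, eigenvectors = np.linalg.eig(A)
print(np.allclose(A @ eigenvectors, eigenvectors * eigenvalues))  # True
```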

7. Practical example

Image-style array processing

import numpy as np
# Synthetic image tensor (H, W, C)
image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
# Grayscale (mean across channels)
gray = np.mean(image, axis=2).astype(np.uint8)
# Brightness (widen to int16 first; uint8 + 50 would wrap around past 255)
bright = np.clip(image.astype(np.int16) + 50, 0, 255).astype(np.uint8)
# Shape
print(f"Shape: {image.shape}")  # Shape: (100, 100, 3)

Practical tips

NumPy performance

# ✅ Prefer vectorization
arr = np.arange(1000000)
result = arr ** 2  # fast
# ❌ Python loops
result = [x ** 2 for x in arr]  # slow
# ✅ dtype matters
arr = np.array([1, 2, 3], dtype=np.int32)
# ✅ Copy vs view
arr_copy = arr.copy()  # independent copy
arr_view = arr[:]      # view (shares memory)
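
Whether two arrays share memory can be checked directly with `np.shares_memory`. A minimal sketch of the copy-versus-view distinction; `arr_fancy` is an illustrative name added here to show that integer-array indexing copies:

```python
import numpy as np

arr = np.array([1, 2, 3], dtype=np.int32)

arr_view = arr[:]           # basic slicing returns a view
arr_copy = arr.copy()       # explicit, independent copy
arr_fancy = arr[[0, 1, 2]]  # integer-array (fancy) indexing returns a copy

# Writing through the view changes the original, but not the copy
arr_view[0] = 100
print(arr)       # [100   2   3]
print(arr_copy)  # [1 2 3]

print(np.shares_memory(arr, arr_view))   # True
print(np.shares_memory(arr, arr_fancy))  # False
```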

Going deeper

Batch L2 normalization (runnable)

Stack vectors as rows, then L2-normalize each row in a vectorized way. This is a common deep-learning preprocessing pattern.

import numpy as np
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
norms = np.linalg.norm(X, axis=1, keepdims=True)
Xn = X / np.clip(norms, 1e-12, None)
print(np.linalg.norm(Xn, axis=1))
# All norms ≈ 1

Common mistakes

  • Misusing axis and summing/averaging along the wrong dimension.
  • Confusing views and copies, mutating data unintentionally.
  • Integer overflow when accumulating with a narrow integer dtype.
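
Two of the mistakes above are easy to reproduce. The sketch below shows what each `axis` value reduces over and how a narrow integer dtype wraps around silently; the arrays `scores` and `small` are made-up illustrations:

```python
import numpy as np

scores = np.array([[1, 2, 3], [4, 5, 6]])  # 2 rows, 3 columns

# axis=0 collapses the rows: one value per column
print(scores.mean(axis=0))  # [2.5 3.5 4.5]
# axis=1 collapses the columns: one value per row
print(scores.mean(axis=1))  # [2. 5.]

# uint8 holds 0..255, so 200 + 200 wraps to 400 - 256 = 144
small = np.array([200, 100], dtype=np.uint8)
wrapped = small + small
print(wrapped)  # [144 200]

# Widening one operand first gives the intended result
widened = small.astype(np.int64) + small
print(widened)  # [400 200]
```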

Caveats

  • Floating-point addition is not associative, so the result of a large sum can depend on accumulation order and dtype; accumulate in float64, or use a math.fsum-style exact summation when the error matters.
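
A small sketch of the summation caveat, using only the standard library and NumPy's `dtype=` argument on `sum`. The array `x32` and its size are arbitrary illustrations, and the exact float32 total is not shown because it depends on how NumPy groups the additions:

```python
import math
import numpy as np

# Regrouping the same three numbers changes the result
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False

# math.fsum tracks the rounding error that plain sum() accumulates
values = [0.1] * 10
print(sum(values) == 1.0)        # False
print(math.fsum(values) == 1.0)  # True

# For float32 data, a float64 accumulator reduces rounding error
x32 = np.full(1_000_000, 0.1, dtype=np.float32)
total32 = x32.sum()                  # accumulates in float32
total64 = x32.sum(dtype=np.float64)  # accumulates in float64
print(total32, total64)
```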

In production

  • Reuse buffers with out= when memory is tight.
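  • When mixing with Pandas, remember that Pandas aligns operands by index label while NumPy works by position; convert with .to_numpy() (or .values) deliberately.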
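
Buffer reuse with `out=` can be sketched as follows. Most ufuncs such as `np.add` and `np.multiply` accept it; the array names and sizes here are illustrative:

```python
import numpy as np

a = np.arange(5, dtype=np.float64)
b = np.ones(5)
buf = np.empty_like(a)  # preallocated result buffer

# Write the result into buf instead of allocating a new array
np.add(a, b, out=buf)
print(buf)  # [1. 2. 3. 4. 5.]

# Further steps can reuse the same buffer in place
np.multiply(buf, 2.0, out=buf)
print(buf)  # [ 2.  4.  6.  8. 10.]
```

In a loop that runs many times over same-shaped data, allocating `buf` once outside the loop avoids a fresh temporary on every iteration.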

Alternatives

| Library | Role |
| --- | --- |
| NumPy | Array math, BLAS/LAPACK |
| Numba | JIT for hot loops |
| JAX / PyTorch | Autodiff, GPU |

Summary

Key takeaways

  1. NumPy: fast numerical computing
  2. ndarray: N-dimensional array
  3. Vectorization: avoid Python loops
  4. Broadcasting: align shapes automatically
  5. Linear algebra: matrix multiply, eigenvalues

Next steps

  • [Matplotlib visualization](/en/blog/python-series-18-matplotlib/)
  • [Data preprocessing](/en/blog/python-series-19-data-preprocessing/)
  • [Pandas basics | Python data analysis](/en/blog/python-series-16-pandas/)

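Frequently asked questions (FAQ)

Q. When is this used in practice?

A. NumPy tutorial: arrays, vectorization, broadcasting, indexing, statistics, and linear algebra. Learn fast numerical Pyth… In practice, apply the examples and selection guidance from the article above.

Q. What should I read first?

A. Follow the previous-article or related-article links at the bottom of each post to learn in order. The Python series table of contents shows the overall flow.

Q. How can I study this in more depth?

A. See the official NumPy documentation. The links at the end of this article are also useful.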


Related articles (internal links)

Other articles connected to this topic.

  • [Pandas Basics | Complete Guide to Python Data Analysis](/en/blog/python-series-16-pandas/)
  • [Matplotlib Basics](/en/blog/python-series-18-matplotlib/)
  • [Arrays and Lists](/en/blog/algorithm-series-01-array-list/)

Keywords covered in this article (related search terms)

Searching for Python, NumPy, Data Science, Arrays, Numerical Computing, or Linear Algebra will lead you to this article.