Why use comprehensions?

They are concise, often faster than manual append loops, and idiomatic in Python.

List comprehension vs for loop?

Use comprehensions for simple transforms; use explicit loops for complex logic.

When should I use a generator expression?

When you stream large data and need only one pass; memory use is much lower than with a list.

Python Comprehensions | List, Dict, Set, and Generator Expressions

March 28, 2026 · 14 min read · Updated March 28, 2026 · Intermediate Tutorial

Key points of this article

Practical guide to Python comprehensions: concise loops for lists, dicts, and sets, plus generator expressions for memory-efficient iteration.

Introduction

“Build a list in one line”

Comprehensions are a concise and usually fast way to build collections, and a staple of idiomatic Python.

1. List comprehensions

What they are

A list comprehension builds a list in a single expression. For many cases it is clearer and faster than a manual for loop with append.

Syntax: [expr for var in iterable if condition]

Basics

# Traditional loop
squares = []
for i in range(10):
    squares.append(i ** 2)

print(squares)

# List comprehension
squares = [i ** 2 for i in range(10)]
print(squares)

Rough timing comparison (order-of-magnitude; depends on Python version and hardware):

import time

# perf_counter() is a high-resolution clock intended for measuring short durations
start = time.perf_counter()
result1 = []
for i in range(1000000):
    result1.append(i ** 2)
print(f"for loop: {time.perf_counter() - start:.4f}s")

start = time.perf_counter()
result2 = [i ** 2 for i in range(1000000)]
print(f"comprehension: {time.perf_counter() - start:.4f}s")
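Single-shot timings like these are noisy. A steadier comparison, sketched here with the standard-library timeit module, repeats each variant and keeps the best run. The function names (with_loop, with_comprehension) and the sizes are illustrative choices, not part of the original benchmark, and the numbers will still vary by machine and Python version.

```python
import timeit

N = 100_000  # illustrative size, kept small so the repeats finish quickly


def with_loop():
    result = []
    for i in range(N):
        result.append(i ** 2)
    return result


def with_comprehension():
    return [i ** 2 for i in range(N)]


# Both variants must build the same list before a timing comparison means anything.
assert with_loop() == with_comprehension()

# repeat() returns one total per repetition; the minimum is the least noisy figure.
loop_time = min(timeit.repeat(with_loop, number=5, repeat=3))
comp_time = min(timeit.repeat(with_comprehension, number=5, repeat=3))

print(f"for loop:      {loop_time:.4f}s for 5 runs")
print(f"comprehension: {comp_time:.4f}s for 5 runs")
```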

Filtering with `if`

evens = [i for i in range(10) if i % 2 == 0]
print(evens)

multiples = [i for i in range(30) if i % 3 == 0 and i > 10]
print(multiples)

words = ['apple', 'banana', 'cherry', 'date', 'elderberry']
long_words = [word for word in words if len(word) > 5]
print(long_words)

Conditional expressions (`if` / `else`)

labels = ['even' if i % 2 == 0 else 'odd' for i in range(5)]
print(labels)

# The conditional expression goes before the `for`, not after it:
# [expr_if_true if cond else expr_if_false for x in iterable]

numbers = [-2, -1, 0, 1, 2]
signs = [
    'positive' if n > 0 else ('negative' if n < 0 else 'zero')
    for n in numbers
]
print(signs)

scores = [95, 85, 75, 65, 55]
grades = [
    'A' if s >= 90 else 'B' if s >= 80 else 'C' if s >= 70 else 'D' if s >= 60 else 'F'
    for s in scores
]
print(grades)

Filter (if only) vs map (if/else)

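# Filter: a trailing `if` drops elements
evens = [i for i in range(10) if i % 2 == 0]

# Map: `if`/`else` before the `for` keeps every element and chooses its value
labels = ['even' if i % 2 == 0 else 'odd' for i in range(10)]

# Both: the trailing `if` filters first, then the conditional expression maps
positive_squares = [i ** 2 if i > 0 else 0 for i in range(-5, 6) if i != 0]
print(positive_squares)  # [0, 0, 0, 0, 0, 1, 4, 9, 16, 25]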

Nested loops

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

flat = []
for row in matrix:
    for num in row:
        flat.append(num)
print(flat)

flat = [num for row in matrix for num in row]
print(flat)

# Left `for` is outer, right `for` is inner — same order as nested fors

multiplication_table = [
    f"{i} x {j} = {i*j}"
    for i in range(2, 10)
    for j in range(1, 10)
]
print(multiplication_table[:5])

coordinates = [(x, y) for x in range(3) for y in range(3)]
print(coordinates)

diagonal = [(x, y) for x in range(5) for y in range(5) if x == y]
print(diagonal)

Nested comprehensions (building nested lists)

matrix = [[i * j for j in range(5)] for i in range(5)]
print(matrix)

matrix = []
for i in range(5):
    row = []
    for j in range(5):
        row.append(i * j)
    matrix.append(row)

2. Dictionary comprehensions

Syntax

{key_expr: value_expr for var in iterable if condition}

squares_dict = {}
for i in range(5):
    squares_dict[i] = i ** 2

squares_dict = {i: i ** 2 for i in range(5)}
print(squares_dict)

names = ['Alice', 'Bob', 'Charlie']
name_dict = {i: name for i, name in enumerate(names)}
print(name_dict)

Filtering and transforms

even_squares = {i: i ** 2 for i in range(10) if i % 2 == 0}
print(even_squares)

scores = {'Alice': 85, 'Bob': 92, 'Carol': 78, 'Dana': 95}
high_scores = {name: score for name, score in scores.items() if score >= 90}
print(high_scores)

data = {'apple': 5, 'banana': 3, 'cherry': 8, 'date': 2}
filtered = {
    k.upper(): v * 2
    for k, v in data.items()
    if len(k) > 4 and v > 3
}
print(filtered)

Swapping keys and values

original = {'a': 1, 'b': 2, 'c': 3}
swapped = {v: k for k, v in original.items()}
print(swapped)

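# Duplicate values collide: the later key overwrites the earlier one
original = {'a': 1, 'b': 2, 'c': 1}
swapped = {v: k for k, v in original.items()}
print(swapped)  # {1: 'c', 2: 'b'}

# To keep every key, group them in lists instead
from collections import defaultdict
swapped_multi = defaultdict(list)
for k, v in original.items():
    swapped_multi[v].append(k)
print(dict(swapped_multi))  # {1: ['a', 'c'], 2: ['b']}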

Practical snippets

words = ['apple', 'banana', 'cherry', 'date']
word_lengths = {word: len(word) for word in words}
print(word_lengths)

env_str = "DEBUG=True,PORT=8000,HOST=localhost"
# Split each pair once (maxsplit=1) so a value that contains '=' stays intact
env_dict = {
    key: value
    for key, value in (pair.split('=', 1) for pair in env_str.split(','))
}
print(env_dict)

keys = ['name', 'age', 'city']
values = ['Alice', 25, 'Seoul']
person = {k: v for k, v in zip(keys, values)}
print(person)

products = {'apple': 1000, 'banana': 500, 'cherry': 2000, 'date': 800}
discounted = {
    name: price * 0.9
    for name, price in products.items()
    if price >= 1000
}
print(discounted)

3. Set comprehensions

Syntax

{expr for var in iterable if condition} — unordered, unique elements.

numbers = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]

unique = set(numbers)
unique = {n for n in numbers}
print(unique)

numbers = [1, -2, 2, -3, 3, -4, 4]
abs_unique = {abs(n) for n in numbers}
print(abs_unique)

Conditional sets

even_set = {i for i in range(10) if i % 2 == 0}
print(even_set)

text = "Hello World"
vowels = {char.lower() for char in text if char.lower() in 'aeiou'}
print(vowels)

words = ['hi', 'hello', 'hey', 'hello', 'world', 'hi']
long_words = {word for word in words if len(word) >= 3}
print(long_words)

Examples

emails = [
    '[email protected]',
    '[email protected]',
    '[email protected]',
    '[email protected]',
    '[email protected]'
]
domains = {email.split('@')[1] for email in emails}
print(domains)

files = ['image.jpg', 'doc.pdf', 'photo.jpg', 'video.mp4', 'report.pdf']
extensions = {file.split('.')[-1] for file in files}
print(extensions)

numbers = [123, 456, 789, 111, 222, 333]
last_digits = {n % 10 for n in numbers}
print(last_digits)

Set operations

list1 = [1, 2, 3, 4, 5]
list2 = [4, 5, 6, 7, 8]

common = {x for x in list1} & {x for x in list2}
print(common)
print(set(list1) & set(list2))

diff = {x for x in list1} - {x for x in list2}
print(diff)

union = {x for x in list1} | {x for x in list2}
print(union)

4. Generator expressions

Syntax

(expr for var in iterable if condition) — lazy, one value at a time.

squares_list = [i ** 2 for i in range(1000000)]
print(type(squares_list))

squares_gen = (i ** 2 for i in range(1000000))
print(type(squares_gen))

print(next(squares_gen))
print(next(squares_gen))

for square in (i ** 2 for i in range(5)):
    print(square, end=' ')
print()

gen = (i for i in range(3))
print(list(gen))  # [0, 1, 2]
print(list(gen))  # []: a generator is exhausted after one pass
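To see the laziness directly, the sketch below routes each value through a small tracing helper that records when it is called. The helper name traced is made up for this illustration.

```python
calls = []  # records the order in which values are actually computed


def traced(i):
    """Hypothetical helper: remember that i was processed, then square it."""
    calls.append(i)
    return i ** 2


squares_list = [traced(i) for i in range(3)]
print(calls)  # the list comprehension computed everything at once: [0, 1, 2]

calls.clear()
squares_gen = (traced(i) for i in range(3))
print(calls)  # creating the generator computed nothing: []

first = next(squares_gen)
print(first, calls)  # only the first value so far: 0 [0]

rest = list(squares_gen)
print(rest, calls)  # the remaining values are computed on demand: [1, 4] [0, 1, 2]
```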

Memory footprint (illustrative)

import sys

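# getsizeof() counts only the list object (its array of references), not the ints it holds
list_comp = [i for i in range(100000)]
print(sys.getsizeof(list_comp))

# The generator object has a small, fixed size regardless of the range length
gen_expr = (i for i in range(100000))
print(sys.getsizeof(gen_expr))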
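To compare what each approach allocates while it runs, one option is the standard-library tracemalloc module. This is a rough sketch: peak_bytes is a helper name invented here, and the exact byte counts depend on the Python build.

```python
import tracemalloc


def peak_bytes(build):
    """Hypothetical helper: peak bytes allocated while build() runs."""
    tracemalloc.start()
    build()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


N = 100_000

# Same computation twice: once through a temporary list, once streamed.
list_peak = peak_bytes(lambda: sum([i * 2 for i in range(N)]))
gen_peak = peak_bytes(lambda: sum(i * 2 for i in range(N)))

print(f"list comprehension peak:   {list_peak:,} bytes")
print(f"generator expression peak: {gen_peak:,} bytes")
```

The list version has to hold all 100,000 results at the same time, while the generator version keeps only the current value and the running total alive.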

When generators shine

total = sum(i ** 2 for i in range(1000000))
maximum = max(i ** 2 for i in range(1000))

has_large = any(i ** 2 > 10000 for i in range(1000000))

with open('large_file.txt') as f:
    non_empty_lines = sum(1 for line in f if line.strip())

numbers = range(1000000)
evens = (x for x in numbers if x % 2 == 0)
squares = (x ** 2 for x in evens)
large = (x for x in squares if x > 100)
result = sum(large)
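Functions such as any() stop at the first truthy value, so with a generator expression the remaining items are never produced. The sketch below counts how many items are pulled from the source; counted is a helper invented for this illustration.

```python
def counted(iterable, counter):
    """Hypothetical helper: yield items unchanged while counting them."""
    for item in iterable:
        counter[0] += 1
        yield item


seen_gen = [0]
has_large = any(i ** 2 > 10000 for i in counted(range(1000000), seen_gen))
print(has_large, seen_gen[0])  # True 102: stopped as soon as 101 ** 2 > 10000

seen_list = [0]
has_large_list = any([i ** 2 > 10000 for i in counted(range(1000000), seen_list)])
print(has_large_list, seen_list[0])  # True 1000000: the whole list was built first
```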

Generator vs list

# Generator: one pass, low memory
total = sum(i ** 2 for i in range(1000000))

# List: multiple passes, indexing, len()
squares = [i ** 2 for i in range(10)]
print(squares[5])
print(len(squares))
print(sum(squares))
print(max(squares))

5. Practical examples

Example 1: CSV-like string to dict rows

csv_data = "name,age,city\nAlice,25,Seoul\nBob,30,Busan\nCarol,28,Daejeon"

lines = csv_data.strip().split('\n')
header = lines[0].split(',')

data = [
    dict(zip(header, line.split(',')))
    for line in lines[1:]
]
print(data)

Typed conversion

data_typed = [
    {
        'name': parts[0],
        'age': int(parts[1]),
        'city': parts[2]
    }
    for line in lines[1:]
    for parts in [line.split(',')]
]
print(data_typed)
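Splitting on commas by hand breaks on quoted fields. For real CSV input, one alternative is the standard-library csv.DictReader, with a comprehension doing only the type conversion. This sketch assumes the same three-column data as the csv_data string in this example.

```python
import csv
import io

csv_data = "name,age,city\nAlice,25,Seoul\nBob,30,Busan\nCarol,28,Daejeon"

# DictReader treats the first row as the header and yields one dict per data row.
reader = csv.DictReader(io.StringIO(csv_data))

# Copy each row and convert only the 'age' column to int.
rows = [{**row, 'age': int(row['age'])} for row in reader]
print(rows)
```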

Example 2: student records

students = [
    {'name': 'Alice', 'score': 85},
    {'name': 'Bob', 'score': 92},
    {'name': 'Carol', 'score': 78},
    {'name': 'Dana', 'score': 95},
    {'name': 'Eve', 'score': 88}
]

high_scores = [s['name'] for s in students if s['score'] >= 90]
print(high_scores)

graded = [
    {**s, 'grade': 'A' if s['score'] >= 90 else 'B' if s['score'] >= 80 else 'C'}
    for s in students
]
print(graded)

passed = [
    {**s, 'grade': 'A' if s['score'] >= 90 else 'B'}
    for s in students
    if s['score'] >= 80
]
print(passed)

Example 3: cleaning strings

names = ['  alice  ', 'BOB', '  Charlie', 'david  ']

cleaned = [name.strip().lower() for name in names]
print(cleaned)

capitalized = [name.strip().capitalize() for name in names]
print(capitalized)

filtered = [name.strip() for name in names if len(name.strip()) >= 3]
print(filtered)

Example 4: file paths

import os

files = ['data.txt', 'image.png', 'report.txt', 'video.mp4', 'notes.txt']

txt_files = [f for f in files if f.endswith('.txt')]
print(txt_files)

names_only = [os.path.splitext(f)[0] for f in txt_files]
print(names_only)

base_path = '/home/user/documents'
full_paths = [os.path.join(base_path, f) for f in txt_files]
print(full_paths)

Example 5: JSON API payload

api_response = {
    'users': [
        {'id': 1, 'name': 'Alice', 'active': True, 'age': 25},
        {'id': 2, 'name': 'Bob', 'active': False, 'age': 30},
        {'id': 3, 'name': 'Charlie', 'active': True, 'age': 35},
        {'id': 4, 'name': 'David', 'active': True, 'age': 28}
    ]
}

active_ids = [
    user['id']
    for user in api_response['users']
    if user['active']
]
print(active_ids)

active_users = [
    {'name': user['name'], 'age': user['age']}
    for user in api_response['users']
    if user['active']
]
print(active_users)

senior_active = [
    user['name']
    for user in api_response['users']
    if user['active'] and user['age'] >= 30
]
print(senior_active)

6. Performance notes

Memory: list vs generator

import sys

squares_list = [i ** 2 for i in range(1000000)]
squares_gen = (i ** 2 for i in range(1000000))

print(sys.getsizeof(squares_list))
print(sys.getsizeof(squares_gen))

Micro-benchmark (illustrative)

import time

data = list(range(1000000))

# perf_counter() is a high-resolution clock intended for measuring short durations
start = time.perf_counter()
result1 = [x * 2 for x in data if x % 2 == 0]
print(f"comprehension: {time.perf_counter() - start:.4f}s")

start = time.perf_counter()
result2 = []
for x in data:
    if x % 2 == 0:
        result2.append(x * 2)
print(f"for loop: {time.perf_counter() - start:.4f}s")

start = time.perf_counter()
result3 = list(map(lambda x: x * 2, filter(lambda x: x % 2 == 0, data)))
print(f"map+filter: {time.perf_counter() - start:.4f}s")

Deeply nested comprehensions

def is_ascending(x, y, z):
    return x < y < z

result = [
    z
    for x in range(10)
    for y in range(10)
    for z in range(10)
    if is_ascending(x, y, z)
]

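# Same result without testing all 1,000 triples:
# combinations() yields only the triples with x < y < z
from itertools import combinations
result = [c[2] for c in combinations(range(10), 3)]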
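A quick check, written as a sketch, that the two approaches agree and how much work each one does. The variable names are illustrative.

```python
from itertools import combinations

# Triple loop: visits every (x, y, z) and keeps the ascending ones.
nested = [
    z
    for x in range(10)
    for y in range(10)
    for z in range(10)
    if x < y < z
]

# combinations() generates only the ascending triples, in the same order.
combo = [c[2] for c in combinations(range(10), 3)]

print(nested == combo)       # True
print(10 ** 3, len(combo))   # 1000 candidates tested vs 120 triples generated
```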

7. Style and best practices

Readability first

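# Good fit for a comprehension: one simple transform
squares = [x ** 2 for x in range(10)]

# Better left as a loop: nested branching with an intermediate value
result = []
for x in range(10):
    if x % 2 == 0:
        temp = x ** 2
        if temp > 20:
            result.append(temp)
        else:
            result.append(temp * 2)

# Avoid overly dense one-liners that hide intent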
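For comparison, the branching loop can be squeezed into a single comprehension, but the rule is easier to follow once it moves into a named function. In this sketch, scale_small is a name invented for illustration.

```python
# The explicit loop, kept as the reference result.
expected = []
for x in range(10):
    if x % 2 == 0:
        temp = x ** 2
        if temp > 20:
            expected.append(temp)
        else:
            expected.append(temp * 2)

# Dense: correct, but x ** 2 is repeated and the intent is hard to see.
dense = [x ** 2 if x ** 2 > 20 else x ** 2 * 2 for x in range(10) if x % 2 == 0]


def scale_small(square):
    """Hypothetical helper: double any square that is 20 or less."""
    return square if square > 20 else square * 2


# Clearer: the comprehension only filters and maps; the rule has a name.
clear = [scale_small(x ** 2) for x in range(10) if x % 2 == 0]

print(clear)  # [0, 8, 32, 36, 64]
```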

Common pitfalls

1) Accidental shared rows in a matrix

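# Wrong: the outer * 3 repeats a reference to the same inner list
matrix = [[0] * 3] * 3
matrix[0][0] = 1
print(matrix)  # [[1, 0, 0], [1, 0, 0], [1, 0, 0]]

# Right: the comprehension creates a new inner list on every iteration
matrix = [[0] * 3 for _ in range(3)]
matrix[0][0] = 1
print(matrix)  # [[1, 0, 0], [0, 0, 0], [0, 0, 0]]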
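An identity check makes the difference between the two constructions visible without mutating anything. This is a minimal sketch; shared and independent are illustrative names.

```python
shared = [[0] * 3] * 3
independent = [[0] * 3 for _ in range(3)]

# `is` compares object identity, not contents.
print(shared[0] is shared[1])            # True: one row object, referenced three times
print(independent[0] is independent[1])  # False: each row is its own list

# Counting distinct row objects tells the same story.
print(len({id(row) for row in shared}))       # 1
print(len({id(row) for row in independent}))  # 3
```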

2) Building a list just to sum

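# Wasteful: builds a million-element list only to throw it away
total = sum([i ** 2 for i in range(1000000)])
# Better: sum() consumes the generator expression directly
total = sum(i ** 2 for i in range(1000000))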

3) Side effects inside comprehensions

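# Avoid: the comprehension runs only for its side effect
# and builds a throwaway list of None values
results = []
[results.append(x * 2) for x in range(10)]

# Fine: an explicit loop makes the side effect obvious
results = []
for x in range(10):
    results.append(x * 2)

# Best here: let the comprehension build the list itself
results = [x * 2 for x in range(10)]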

Debug strategy

data = [1, 2, 3, 4, 5]

filtered = [x for x in data if x % 2 == 0]
print(filtered)

result = [x ** 2 for x in filtered]
print(result)

Patterns

raw_names = ['  ALICE  ', 'bob', '  Charlie  ', 'DAVID']
normalized = [name.strip().title() for name in raw_names]
print(normalized)

numbers = range(1, 11)
even_sum = sum(x for x in numbers if x % 2 == 0)
odd_sum = sum(x for x in numbers if x % 2 == 1)
print(even_sum, odd_sum)

8. Troubleshooting

“list index out of range”

data = [[1, 2], [3, 4, 5], [6]]

result = [row[2] for row in data if len(row) > 2]
print(result)

result = [row[2] if len(row) > 2 else None for row in data]
print(result)

Duplicate keys in dict comprehensions

items = [('a', 1), ('b', 2), ('a', 3)]
d = {k: v for k, v in items}
print(d)

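# To keep every value, group them with a plain loop
# (a comprehension used only for its side effect is a pitfall)
from collections import defaultdict
d = defaultdict(list)
for k, v in items:
    d[k].append(v)
print(dict(d))  # {'a': [1, 3], 'b': [2]}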

Exceptions inside comprehensions

data = ['1', '2', 'three', '4', 'five']

result = [int(x) for x in data if x.isdigit()]

def safe_int(x):
    try:
        return int(x)
    except ValueError:
        return None

result = [safe_int(x) for x in data]
result_filtered = [x for x in result if x is not None]
print(result_filtered)
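On Python 3.8 or later, an assignment expression can fold the convert and filter steps into a single pass. This sketch repeats the safe_int helper so that it runs on its own.

```python
def safe_int(x):
    try:
        return int(x)
    except ValueError:
        return None


data = ['1', '2', 'three', '4', 'five']

# (n := safe_int(x)) stores the converted value, so each item is converted once.
result = [n for x in data if (n := safe_int(x)) is not None]
print(result)  # [1, 2, 4]
```

Whether this reads better than the two-step version is a judgment call; the two-step form is easier to debug because the intermediate list can be inspected.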

9. Quick reference table

Situation	Prefer	Why
Simple map/filter	Comprehension	Short and fast
Complex branching	`for` loop	Clarity
Side effects (I/O, DB)	`for` loop	Obvious intent
One-pass huge data	Generator expression	Memory
Need index, len, many passes	List comprehension	Reusable list

10. Exercises

Exercise 1

Squares of multiples of 3 from 1 through 20:

# Expected: [9, 36, 81, 144, 225, 324]

Exercise 2

Turn people = [('Alice', 25), ('Bob', 30), ('Charlie', 35)] into {name: age}.

Exercise 3

From a 2-D list, flatten even numbers only.

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# Expected: [2, 4, 6, 8]

Exercise 4

Classify temperatures as "cold" (<10), "mild" (10–25), or "hot" (>25).

temps = [5, 15, 30, 8, 22, 28]

Answers

multiples_of_3 = [x ** 2 for x in range(1, 21) if x % 3 == 0]

people = [('Alice', 25), ('Bob', 30), ('Charlie', 35)]
people_dict = {name: age for name, age in people}

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
evens_flat = [num for row in matrix for num in row if num % 2 == 0]

temps = [5, 15, 30, 8, 22, 28]
labels = [
    'cold' if t < 10 else 'mild' if t <= 25 else 'hot'
    for t in temps
]
print(labels)

Summary

Key takeaways

List comprehension: [expr for x in iterable if cond] — concise list construction.
Dict comprehension: {k: v for ...} — build mappings in one expression.
Set comprehension: {expr for ...} — unique values with optional transforms.
Generator expression: (expr for ...) — lazy iteration, tiny memory footprint.
Readability wins: reach for a plain loop when the comprehension becomes cryptic.

After you master comprehensions

Code tends to be shorter and idiomatic
Data prep tasks feel lighter
You can choose list vs generator deliberately

Next steps

Decorators
Generator functions with yield
Functions | lambdas and higher-order functions

Python environment setup | Install Python on Windows and Mac
