Binary Search | O(log n) Search Algorithm Complete Guide

Binary Search | O(log n) Search Algorithm Complete Guide

이 글의 핵심

Practical guide to binary search. Complete coverage of O(log n) search algorithm with detailed examples and problem-solving patterns.

Introduction

Binary search finds values in sorted arrays in O(log n) time. Much faster than linear search O(n).


1. Binary Search Basics

Algorithm

Find 7 in [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

1. Check middle: arr[4] = 9 > 7 → left half
2. Check middle: arr[1] = 3 < 7 → right half
3. Check middle: arr[2] = 5 < 7 → right half
4. Check middle: arr[3] = 7 = 7 → found!

Python Implementation

Binary search cuts the array in half to reduce search space. Like finding a word in a dictionary - open to the middle, check if target is before or after, discard half.

def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    
    while left <= right:
        mid = (left + right) // 2
        
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    
    return -1

# Test
arr = [1, 3, 5, 7, 9, 11, 13, 15]
print(binary_search(arr, 7))   # 3
print(binary_search(arr, 10))  # -1

# Time complexity: O(log n)
# Search space halves each iteration
# n=1000 → max 10 comparisons
# n=1000000 → max 20 comparisons

Linear Search vs Binary Search:

# Linear search: O(n)
def linear_search(arr, target):
    for i in range(len(arr)):
        if arr[i] == target:
            return i
    return -1
# Worst case: check all elements (n times)

# Binary search: O(log n)
# Worst case: only log₂(n) checks
# 
# Performance comparison (n=1000000):
# Linear: 1000000 comparisons (worst)
# Binary: 20 comparisons (worst)
# → 50000x faster!

Recursive Implementation

def binary_search_recursive(arr, target, left, right):
    if left > right:
        return -1
    
    mid = (left + right) // 2
    
    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        return binary_search_recursive(arr, target, mid + 1, right)
    else:
        return binary_search_recursive(arr, target, left, mid - 1)

# Usage
arr = [1, 3, 5, 7, 9]
print(binary_search_recursive(arr, 7, 0, len(arr) - 1))  # 3

2. Lower Bound & Upper Bound

Lower Bound

Find first position ≥ x:

def lower_bound(arr, x):
    left, right = 0, len(arr)
    
    while left < right:
        mid = (left + right) // 2
        
        if arr[mid] < x:
            left = mid + 1
        else:
            right = mid
    
    return left

# Test
arr = [1, 2, 2, 2, 3, 4, 5]

print(lower_bound(arr, 2))  # 1 (first 2)
print(lower_bound(arr, 3))  # 4 (position of 3)
print(lower_bound(arr, 0))  # 0 (array start)
print(lower_bound(arr, 10))  # 7 (array end)

Lower Bound Usage:

# 1. Find first occurrence of duplicate
arr = [1, 2, 2, 2, 3, 4, 5]
first_2 = lower_bound(arr, 2)  # 1

# 2. Find insertion position (maintain sort)
arr = [1, 3, 5, 7, 9]
pos = lower_bound(arr, 6)  # 3
arr.insert(pos, 6)  # [1, 3, 5, 6, 7, 9]

# 3. Range query (count ≥ x)
arr = [1, 2, 3, 4, 5, 6, 7, 8, 9]
count_gte_5 = len(arr) - lower_bound(arr, 5)  # 5

Upper Bound

Find first position > x:

def upper_bound(arr, x):
    left, right = 0, len(arr)
    
    while left < right:
        mid = (left + right) // 2
        
        if arr[mid] <= x:
            left = mid + 1
        else:
            right = mid
    
    return left

# Test
arr = [1, 2, 2, 2, 3, 4, 5]

print(upper_bound(arr, 2))  # 4 (after 2)
print(upper_bound(arr, 3))  # 5 (after 3)
print(upper_bound(arr, 5))  # 7 (array end)

Upper Bound Usage:

# 1. Count occurrences
def count_occurrences(arr, x):
    return upper_bound(arr, x) - lower_bound(arr, x)

arr = [1, 2, 2, 2, 3, 4, 5]
print(count_occurrences(arr, 2))  # 3

# 2. Range query (count ≤ x)
arr = [1, 2, 3, 4, 5, 6, 7, 8, 9]
count_lte_5 = upper_bound(arr, 5)  # 5

# 3. Insert after duplicates
arr = [1, 2, 2, 2, 3]
pos = upper_bound(arr, 2)  # 4
arr.insert(pos, 2)  # [1, 2, 2, 2, 2, 3]

C++ STL Functions:

#include <algorithm>
#include <vector>

std::vector<int> arr = {1, 2, 2, 2, 3, 4, 5};

// Lower Bound: first position ≥ x
auto it1 = std::lower_bound(arr.begin(), arr.end(), 2);

// Upper Bound: first position > x
auto it2 = std::upper_bound(arr.begin(), arr.end(), 2);

// Count occurrences
int count = it2 - it1;  // 3

Concept

Convert optimization problem → decision problem:

"What's the minimum?" → "Is k possible?" (binary search)

Example: Cutting Trees

def cut_trees(trees, h):
    """
    Total wood obtained when cutting at height h
    """
    return sum(max(0, tree - h) for tree in trees)

def find_max_height(trees, target):
    """
    Find maximum height that yields at least target wood
    """
    left, right = 0, max(trees)
    result = 0
    
    while left <= right:
        mid = (left + right) // 2
        
        if cut_trees(trees, mid) >= target:
            result = mid
            left = mid + 1  # Try higher
        else:
            right = mid - 1  # Try lower
    
    return result

# Test
trees = [20, 15, 10, 17]
target = 7
print(find_max_height(trees, target))  # 15
# Cut at 15: 5 + 0 + 0 + 2 = 7

Example: Minimum Speed to Arrive on Time

def min_speed_on_time(dist, hour):
    """
    Find minimum speed to complete all distances within hour
    """
    def can_arrive(speed):
        time = 0
        for i, d in enumerate(dist):
            if i < len(dist) - 1:
                time += (d + speed - 1) // speed  # Ceiling
            else:
                time += d / speed
        return time <= hour
    
    left, right = 1, 10**7
    result = -1
    
    while left <= right:
        mid = (left + right) // 2
        
        if can_arrive(mid):
            result = mid
            right = mid - 1  # Try slower
        else:
            left = mid + 1  # Need faster
    
    return result

4. Practical Tips

Using Python Built-ins

import bisect

arr = [1, 2, 3, 5, 6, 7]

# Lower bound
idx = bisect.bisect_left(arr, 4)
print(idx)  # 3

# Upper bound
idx = bisect.bisect_right(arr, 4)
print(idx)  # 3 (same for non-existent)

# Insert maintaining sort
bisect.insort(arr, 4)
print(arr)  # [1, 2, 3, 4, 5, 6, 7]

Common Mistakes

# ❌ Wrong: Overflow in C++
# mid = (left + right) / 2;  // Overflow!

# ✅ Correct: Safe method
# mid = left + (right - left) / 2;

# Python has no overflow concern
mid = (left + right) // 2

# ❌ Wrong: Off-by-one
# while left < right:  # vs left <= right
# return left vs return -1

# ✅ Correct: Consistent bounds
# [left, right] inclusive: left <= right
# [left, right) half-open: left < right

Summary

Key Points

  1. Binary Search: O(log n) search in sorted arrays
  2. Lower Bound: First position ≥ x
  3. Upper Bound: First position > x
  4. Parametric Search: Optimization → decision problem
  5. Time Complexity: O(log n) - halves search space each step

When to Use

Problem TypeApproachReason
Find in sorted arrayBinary SearchO(log n) vs O(n)
Count occurrencesUpper - LowerO(log n)
Insert positionLower/Upper BoundMaintain sort
Optimization problemParametric SearchConvert to decision

Beginner

  • LeetCode 704: Binary Search
  • LeetCode 35: Search Insert Position
  • LeetCode 278: First Bad Version

Intermediate

  • LeetCode 34: Find First and Last Position
  • LeetCode 69: Sqrt(x)
  • LeetCode 74: Search a 2D Matrix

Advanced

  • LeetCode 4: Median of Two Sorted Arrays
  • LeetCode 410: Split Array Largest Sum
  • LeetCode 875: Koko Eating Bananas

  • Sorting Algorithms | Complete Guide
  • Arrays and Lists | Essential Data Structures
  • Stack and Queue | LIFO and FIFO

Keywords

Binary Search, Algorithm, O(log n), Lower Bound, Upper Bound, Parametric Search, Sorted Array, Search Algorithm, Coding Interview, Time Complexity