Mastering Python Iteration: A Comprehensive Guide to Loops, Iterables, and Generators

Python’s iteration capabilities are among its most powerful features, enabling developers to write clean, efficient, and expressive code. Whether you’re processing data, building algorithms, or working with collections, understanding iteration is fundamental to writing Pythonic code. This comprehensive guide will take you from basic loop concepts to advanced iterator patterns, helping you master one of Python’s core strengths.

Table of Contents #

  1. Understanding Basic Loops
  2. Advanced Iteration Techniques
  3. List Comprehensions and Generator Expressions
  4. Custom Iterators and Generators
  5. The itertools Module
  6. Performance Optimization and Best Practices
  7. Common Pitfalls and How to Avoid Them
  8. Practical Examples and Exercises

Understanding Basic Loops #

Python’s iteration model is built on the concept of iterables—objects that can return their elements one at a time. Let’s explore the fundamental iteration constructs that every Python developer should master.
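To make the idea concrete, here is a minimal sketch of the protocol a for loop relies on: iter() asks an iterable for an iterator, and next() pulls one element at a time until StopIteration is raised. The variable names are illustrative only.

```python
fruits = ['apple', 'banana', 'cherry']

iterator = iter(fruits)   # ask the iterable for an iterator
first = next(iterator)    # 'apple'
second = next(iterator)   # 'banana'

# Roughly what a for loop does behind the scenes
collected = []
iterator = iter(fruits)
while True:
    try:
        item = next(iterator)
    except StopIteration:
        break             # the iterator has no more elements
    collected.append(item)

print(collected)
```

Any object that works with iter() in this way can be used in a for loop, a comprehension, or any function that accepts an iterable.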

The for Loop: Python’s Primary Iteration Tool #

The for loop is the most common way to iterate in Python. Unlike many other languages that use index-based iteration, Python’s for loop works directly with iterable objects, making code more readable and less error-prone.

# Iterating over a list
fruits = ['apple', 'banana', 'cherry', 'date']
for fruit in fruits:
    print(f"Processing {fruit}")
    # Output: Processing apple, Processing banana, etc.

# Iterating over a string (strings are iterable!)
word = "Python"
for character in word:
    print(character.upper())
    # Output: P, Y, T, H, O, N

# Iterating over a dictionary
person = {'name': 'Alice', 'age': 30, 'city': 'Beijing'}
for key, value in person.items():
    print(f"{key}: {value}")

# Iterating over dictionary keys only
for key in person:
    print(key)

# Iterating over dictionary values only
for value in person.values():
    print(value)

Understanding range(): Generating Numeric Sequences #

The range() function is a memory-efficient way to generate sequences of numbers. It doesn’t create a list in memory but instead generates numbers on-the-fly as you iterate.

# Basic range usage
for i in range(5):
    print(i)  # Prints: 0, 1, 2, 3, 4

# range with start and stop
for i in range(2, 7):
    print(i)  # Prints: 2, 3, 4, 5, 6

# range with start, stop, and step
for i in range(1, 10, 2):
    print(i)  # Prints: 1, 3, 5, 7, 9

# Counting backwards
for i in range(10, 0, -1):
    print(i)  # Prints: 10, 9, 8, 7, 6, 5, 4, 3, 2, 1

# Creating a list from range (if you really need it)
numbers = list(range(5))  # [0, 1, 2, 3, 4]

# Using range with len() for index-based iteration
# (Note: usually enumerate() is better for this)
items = ['a', 'b', 'c']
for i in range(len(items)):
    print(f"Index {i}: {items[i]}")
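Because a range object computes its values from start, stop, and step, it behaves like a read-only sequence while staying small. The sketch below illustrates this; the exact byte counts are not shown because sys.getsizeof() results vary by platform and Python version.

```python
import sys

big = range(1_000_000)
as_list = list(big)

# The range object stores only start, stop, and step
range_is_smaller = sys.getsizeof(big) < sys.getsizeof(as_list)

# It still supports the usual sequence operations
length = len(big)            # 1000000
last = big[-1]               # 999999
has_value = 999_999 in big   # True, computed arithmetically for ints
every_other = big[::2]       # slicing returns another range

print(range_is_smaller, length, last, has_value, every_other)
```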

The while Loop: Condition-Based Iteration #

While for loops are more common in Python, while loops are essential when you need to iterate based on a condition rather than a sequence.

# Basic while loop
count = 0
while count < 5:
    print(f"Count is {count}")
    count += 1

# while loop with break
while True:
    user_input = input("Enter 'quit' to exit: ")
    if user_input == 'quit':
        break
    print(f"You entered: {user_input}")

# while loop with continue
number = 0
while number < 10:
    number += 1
    if number % 2 == 0:
        continue  # Skip even numbers
    print(number)  # Only prints odd numbers

Advanced Iteration Techniques #

Once you’ve mastered basic loops, Python offers several powerful built-in functions that make iteration more elegant and efficient.

enumerate(): Access Both Index and Value #

The enumerate() function is essential when you need both the index and the value during iteration. It returns an iterator that yields an (index, value) tuple for each element.

colors = ['red', 'green', 'blue', 'yellow']

# Basic enumeration (index starts at 0)
for index, color in enumerate(colors):
    print(f"Color {index}: {color}")
    # Output: Color 0: red, Color 1: green, etc.

# Start enumeration at a custom number
for index, color in enumerate(colors, start=1):
    print(f"Color #{index}: {color}")
    # Output: Color #1: red, Color #2: green, etc.

# Practical example: finding positions of specific elements
text = "hello world"
vowels = 'aeiou'
vowel_positions = [(idx, char) for idx, char in enumerate(text) if char in vowels]
print(vowel_positions)  # [(1, 'e'), (4, 'o'), (7, 'o')]

# Creating a dictionary with enumeration
fruits = ['apple', 'banana', 'cherry']
fruit_dict = {idx: fruit for idx, fruit in enumerate(fruits)}
print(fruit_dict)  # {0: 'apple', 1: 'banana', 2: 'cherry'}

zip(): Parallel Iteration Over Multiple Sequences #

The zip() function allows you to iterate over multiple sequences simultaneously, pairing up elements from each sequence.

names = ['Alice', 'Bob', 'Charlie']
ages = [25, 30, 35]
cities = ['Shanghai', 'London', 'Paris']

# Zip two sequences
for name, age in zip(names, ages):
    print(f"{name} is {age} years old")

# Zip three or more sequences
for name, age, city in zip(names, ages, cities):
    print(f"{name} is {age} years old and lives in {city}")

# Create a dictionary from two lists
person_dict = dict(zip(names, ages))
print(person_dict)  # {'Alice': 25, 'Bob': 30, 'Charlie': 35}

# Important: zip stops at the shortest sequence
numbers = [1, 2, 3, 4, 5]
letters = ['a', 'b', 'c']
pairs = list(zip(numbers, letters))
print(pairs)  # [(1, 'a'), (2, 'b'), (3, 'c')] - stops at 3

# Using zip_longest for different length sequences
from itertools import zip_longest

numbers = [1, 2]
letters = ['a', 'b', 'c', 'd']
for num, letter in zip_longest(numbers, letters, fillvalue=None):
    print(f"Number: {num}, Letter: {letter}")
# Output includes: Number: None, Letter: c

# Unzipping: converting columns to rows
pairs = [(1, 'a'), (2, 'b'), (3, 'c')]
numbers, letters = zip(*pairs)
print(numbers)  # (1, 2, 3)
print(letters)  # ('a', 'b', 'c')

reversed(): Iterate in Reverse Order #

The reversed() function returns a reverse iterator without creating a new list in memory.

numbers = [1, 2, 3, 4, 5]

# Iterate in reverse
for num in reversed(numbers):
    print(num)  # Prints: 5, 4, 3, 2, 1

# Works with strings
word = "Python"
for char in reversed(word):
    print(char)  # Prints: n, o, h, t, y, P

# Create a reversed list if needed
reversed_list = list(reversed(numbers))
print(reversed_list)  # [5, 4, 3, 2, 1]
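One limitation to keep in mind: reversed() needs an object that knows its length and supports indexing (or defines __reversed__()), so it does not accept a generator directly. A small sketch of the failure and a workaround:

```python
numbers = [1, 2, 3]

# reversed() returns an iterator, not a list
rev = reversed(numbers)
is_list = isinstance(rev, list)  # False

# A generator has no length or indexing, so reversed() rejects it
gen = (n * 10 for n in numbers)
try:
    reversed(gen)
    error = None
except TypeError as exc:
    error = exc

# Workaround: materialize the generator first (this does use memory)
from_gen = list(reversed(list(gen)))

print(is_list, type(error).__name__, from_gen)
```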

sorted(): Iterate Over Sorted Sequences #

The sorted() function returns a new sorted list from any iterable, allowing you to iterate over elements in order.

numbers = [3, 1, 4, 1, 5, 9, 2, 6]

# Iterate over sorted sequence
for num in sorted(numbers):
    print(num)  # Prints in ascending order

# Sort in reverse
for num in sorted(numbers, reverse=True):
    print(num)  # Prints in descending order

# Sort with a custom key
words = ['apple', 'pie', 'a', 'longer']
for word in sorted(words, key=len):
    print(word)  # Sorts by length: a, pie, apple, longer

# Sort dictionary by values
scores = {'Alice': 85, 'Bob': 92, 'Charlie': 78}
for name, score in sorted(scores.items(), key=lambda x: x[1], reverse=True):
    print(f"{name}: {score}")  # Prints in descending score order
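Two properties of sorted() are worth knowing when the key produces ties: the sort is stable, so tied items keep their original relative order, and a tuple key lets you sort by several criteria at once. A sketch with made-up records:

```python
records = [('Dave', 92), ('Alice', 85), ('Bob', 92), ('Charlie', 85)]

# Stable sort: records with equal scores keep their original relative order
by_score = sorted(records, key=lambda r: r[1], reverse=True)

# Tuple key: score descending (negated), then name ascending
by_score_then_name = sorted(records, key=lambda r: (-r[1], r[0]))

print(by_score)
# [('Dave', 92), ('Bob', 92), ('Alice', 85), ('Charlie', 85)]
print(by_score_then_name)
# [('Bob', 92), ('Dave', 92), ('Alice', 85), ('Charlie', 85)]
```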

List Comprehensions and Generator Expressions #

Comprehensions are one of Python’s most elegant features, allowing you to create new lists, dictionaries, and sets in a single, readable expression.

List Comprehensions: Elegant List Creation #

List comprehensions provide a concise way to create lists based on existing sequences or ranges.

# Basic list comprehension
numbers = [1, 2, 3, 4, 5]
squares = [n ** 2 for n in numbers]
print(squares)  # [1, 4, 9, 16, 25]

# List comprehension with conditional
even_squares = [n ** 2 for n in numbers if n % 2 == 0]
print(even_squares)  # [4, 16]

# List comprehension with if-else
parity = ['even' if n % 2 == 0 else 'odd' for n in numbers]
print(parity)  # ['odd', 'even', 'odd', 'even', 'odd']

# Nested list comprehension
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flattened = [num for row in matrix for num in row]
print(flattened)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Creating a matrix with nested comprehension
matrix = [[i * j for j in range(1, 4)] for i in range(1, 4)]
print(matrix)  # [[1, 2, 3], [2, 4, 6], [3, 6, 9]]

# String manipulation with list comprehension
words = ['hello', 'world', 'python']
capitalized = [word.capitalize() for word in words]
print(capitalized)  # ['Hello', 'World', 'Python']

Dictionary and Set Comprehensions #

Python also supports comprehensions for dictionaries and sets.

# Dictionary comprehension
numbers = [1, 2, 3, 4, 5]
square_dict = {n: n ** 2 for n in numbers}
print(square_dict)  # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# Dictionary comprehension with conditional
even_square_dict = {n: n ** 2 for n in numbers if n % 2 == 0}
print(even_square_dict)  # {2: 4, 4: 16}

# Swapping keys and values
original = {'a': 1, 'b': 2, 'c': 3}
swapped = {value: key for key, value in original.items()}
print(swapped)  # {1: 'a', 2: 'b', 3: 'c'}

# Set comprehension
sentence = "hello world"
unique_letters = {char.lower() for char in sentence if char.isalpha()}
print(unique_letters)  # {'h', 'e', 'l', 'o', 'w', 'r', 'd'} (sets are unordered, so the order may vary)

# Set comprehension for filtering
numbers = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
unique_evens = {n for n in numbers if n % 2 == 0}
print(unique_evens)  # {2, 4}

Generator Expressions: Memory-Efficient Iteration #

Generator expressions look like list comprehensions but use parentheses instead of square brackets. They generate values on-the-fly rather than creating a list in memory.

# Generator expression
numbers = range(1000000)
squares_gen = (n ** 2 for n in numbers)

# Generators are memory efficient - they don't store all values
print(squares_gen)  # <generator object <genexpr> at 0x...>

# You can iterate over a generator
for square in (n ** 2 for n in range(5)):
    print(square)  # Prints: 0, 1, 4, 9, 16

# Generators can only be iterated once
gen = (n for n in range(3))
print(list(gen))  # [0, 1, 2]
print(list(gen))  # [] - generator is exhausted!

# Use generators with functions that accept iterables
sum_of_squares = sum(n ** 2 for n in range(1000000))
max_value = max(n for n in range(100) if n % 7 == 0)

# Generator expression with conditional
even_numbers = (n for n in range(20) if n % 2 == 0)
print(list(even_numbers))  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
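The lazy behaviour described above can be observed directly by recording when the work actually happens. In this sketch, traced_square and calls are illustrative names used only to log each evaluation:

```python
calls = []

def traced_square(n):
    """Square n and record that the computation ran."""
    calls.append(n)
    return n ** 2

# Creating the generator expression does no work yet
lazy = (traced_square(n) for n in range(5))
calls_before = list(calls)      # []

# Each next() computes exactly one value
first = next(lazy)              # 0
calls_after_one = list(calls)   # [0]

# Consuming the rest runs the remaining computations
rest = list(lazy)               # [1, 4, 9, 16]

print(calls_before, calls_after_one, rest)
```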

Custom Iterators and Generators #

Understanding how to create your own iterators and generators gives you powerful control over iteration behavior.

Creating Custom Iterator Classes #

To create a custom iterator, you need to implement the __iter__() and __next__() methods.

class CountDown:
    """Iterator that counts down from a starting number."""
    
    def __init__(self, start):
        self.current = start
    
    def __iter__(self):
        # Return the iterator object (self)
        return self
    
    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # Signal end of iteration
        
        result = self.current
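        self.current -= 1
        return result

Practical Examples and Exercises #

Example 1: Date Range Iterator #

from datetime import date, timedelta

# DateRange is sketched here as a minimal iterator class; it is assumed to
# include both endpoints, matching the generator version below.
class DateRange:
    """Iterator over the dates from start to end, inclusive."""

    def __init__(self, start, end):
        self.current = start
        self.end = end

    def __iter__(self):
        return self

    def __next__(self):
        if self.current > self.end:
            raise StopIteration  # Signal end of iteration

        result = self.current
        self.current += timedelta(days=1)
        return result

# Usage
start = date(2024, 1, 1)
end = date(2024, 1, 7)
for day in DateRange(start, end):
    print(day.strftime('%Y-%m-%d'))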

# Generator version (simpler!)
def date_range(start_date, end_date):
    """Generator function for date ranges."""
    current = start_date
    while current <= end_date:
        yield current
        current += timedelta(days=1)

# Usage
for day in date_range(date(2024, 1, 1), date(2024, 1, 7)):
    print(day.strftime('%Y-%m-%d'))

Example 2: Processing CSV-Like Data #

def parse_csv_lines(lines):
    """Generator that parses CSV lines efficiently."""
    for line in lines:
        yield line.strip().split(',')

# Simulating CSV data
csv_data = [
    'Name,Age,City',
    'Alice,25,Shanghai',
    'Bob,30,Beijing',
    'Charlie,35,Shenzhen'
]

# Process header and data separately
lines = iter(csv_data)
header = next(lines).split(',')

# Process each row
for row in parse_csv_lines(lines):
    person = dict(zip(header, row))
    print(person)
# Output: {'Name': 'Alice', 'Age': '25', 'City': 'Shanghai'}, etc.

Example 3: Batch Processing with Iterators #

def batch(iterable, batch_size):
    """
    Generator that yields batches of items from an iterable.
    Useful for processing large datasets in chunks.
    """
    batch_list = []
    for item in iterable:
        batch_list.append(item)
        if len(batch_list) == batch_size:
            yield batch_list
            batch_list = []
    
    # Don't forget the last partial batch
    if batch_list:
        yield batch_list

# Usage: process data in batches of 3
data = range(10)
for batch_items in batch(data, 3):
    print(f"Processing batch: {batch_items}")
    # Output: [0, 1, 2], [3, 4, 5], [6, 7, 8], [9]

# Alternative using itertools
from itertools import islice

def batch_itertools(iterable, batch_size):
    """Batch generator that lets islice pull each chunk from the iterator."""
    iterator = iter(iterable)
    while True:
        batch_list = list(islice(iterator, batch_size))
        if not batch_list:
            break
        yield batch_list

# Usage
for batch_items in batch_itertools(range(10), 3):
    print(f"Processing batch: {batch_items}")

Example 4: Sliding Window Iterator #

from collections import deque

def sliding_window(iterable, window_size):
    """
    Generator that yields sliding windows of items.
    Useful for analyzing sequences and time series data.
    """
    iterator = iter(iterable)
    window = deque(maxlen=window_size)
    
    # Fill the initial window
    for _ in range(window_size):
        try:
            window.append(next(iterator))
        except StopIteration:
            return
    
    yield list(window)
    
    # Slide the window
    for item in iterator:
        window.append(item)
        yield list(window)

# Usage: moving average calculation
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for window in sliding_window(numbers, 3):
    average = sum(window) / len(window)
    print(f"Window: {window}, Average: {average:.2f}")
# Output: Window: [1, 2, 3], Average: 2.00
#         Window: [2, 3, 4], Average: 3.00, etc.

Example 5: Filtering and Transforming Data Pipeline #

def read_numbers(filename):
    """Generator: read numbers from file."""
    with open(filename) as f:
        for line in f:
            try:
                yield int(line.strip())
            except ValueError:
                continue  # Skip invalid lines

def filter_positive(numbers):
    """Generator: filter positive numbers."""
    for num in numbers:
        if num > 0:
            yield num

def square(numbers):
    """Generator: square each number."""
    for num in numbers:
        yield num ** 2

def take_first_n(iterable, n):
    """Generator: take first n items."""
    for i, item in enumerate(iterable):
        if i >= n:
            break
        yield item

# Create a processing pipeline
# pipeline = take_first_n(square(filter_positive(read_numbers('data.txt'))), 5)

# Or using a more readable approach
def process_numbers(filename, n):
    """Process numbers through a pipeline."""
    numbers = read_numbers(filename)
    positive = filter_positive(numbers)
    squared = square(positive)
    result = take_first_n(squared, n)
    return list(result)

# This entire pipeline is memory-efficient because it uses generators!
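To try the same pipeline idea without a file on disk, the stages can be fed from an in-memory list and written as generator expressions. This is only a sketch: raw_lines is hypothetical sample input, parse_ints stands in for the file-reading stage, and islice plays the role of take_first_n.

```python
from itertools import islice

raw_lines = ['3', '-1', 'oops', '4', '0', '5', '6']  # hypothetical sample input

def parse_ints(lines):
    """Yield each line as an int, skipping lines that are not numbers."""
    for line in lines:
        try:
            yield int(line.strip())
        except ValueError:
            continue

# Each stage wraps the previous one; nothing runs until values are requested
positive = (n for n in parse_ints(raw_lines) if n > 0)
squared = (n ** 2 for n in positive)
result = list(islice(squared, 3))  # stop after the first three results

print(result)  # [9, 16, 25]
```

Because islice stops after three values, the final '6' in the input is never parsed.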

Example 6: Tree Traversal Iterator #

class TreeNode:
    """Simple binary tree node."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def traverse_inorder(node):
    """Generator for in-order tree traversal."""
    if node is not None:
        # Traverse left subtree
        yield from traverse_inorder(node.left)
        # Visit current node
        yield node.value
        # Traverse right subtree
        yield from traverse_inorder(node.right)

def traverse_preorder(node):
    """Generator for pre-order tree traversal."""
    if node is not None:
        yield node.value
        yield from traverse_preorder(node.left)
        yield from traverse_preorder(node.right)

def traverse_postorder(node):
    """Generator for post-order tree traversal."""
    if node is not None:
        yield from traverse_postorder(node.left)
        yield from traverse_postorder(node.right)
        yield node.value

# Build a sample tree
#       4
#      / \
#     2   6
#    / \ / \
#   1  3 5  7
root = TreeNode(4,
    TreeNode(2, TreeNode(1), TreeNode(3)),
    TreeNode(6, TreeNode(5), TreeNode(7))
)

# Traverse the tree
print("In-order:", list(traverse_inorder(root)))    # [1, 2, 3, 4, 5, 6, 7]
print("Pre-order:", list(traverse_preorder(root)))  # [4, 2, 1, 3, 6, 5, 7]
print("Post-order:", list(traverse_postorder(root))) # [1, 3, 2, 5, 7, 6, 4]

Example 7: Infinite Sequence Generators #

def fibonacci_infinite():
    """Generate Fibonacci numbers infinitely."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

def primes_infinite():
    """Generate prime numbers infinitely."""
    def is_prime(n):
        if n < 2:
            return False
        for i in range(2, int(n ** 0.5) + 1):
            if n % i == 0:
                return False
        return True
    
    n = 2
    while True:
        if is_prime(n):
            yield n
        n += 1

def collatz_sequence(n):
    """Generate Collatz sequence for a given number."""
    while n != 1:
        yield n
        if n % 2 == 0:
            n = n // 2
        else:
            n = 3 * n + 1
    yield 1

# Usage with islice to get finite results
from itertools import islice

# First 15 Fibonacci numbers
fibs = list(islice(fibonacci_infinite(), 15))
print(f"Fibonacci: {fibs}")

# First 10 primes
primes = list(islice(primes_infinite(), 10))
print(f"Primes: {primes}")

# Collatz sequence for 13
collatz = list(collatz_sequence(13))
print(f"Collatz(13): {collatz}")  # [13, 40, 20, 10, 5, 16, 8, 4, 2, 1]

Practice Exercises #

Challenge yourself with these exercises to reinforce your understanding:

Exercise 1: Chunk Text into Fixed-Length Pieces #

Write a generator that splits text into chunks of a specified length, preserving word boundaries when possible.

def chunk_text(text, chunk_size):
    """
    Split text into chunks of approximately chunk_size characters,
    preserving word boundaries.
    """
    words = text.split()
    current_chunk = []
    current_length = 0
    
    for word in words:
        word_length = len(word) + 1  # +1 for space
        
        if current_length + word_length > chunk_size and current_chunk:
            yield ' '.join(current_chunk)
            current_chunk = []
            current_length = 0
        
        current_chunk.append(word)
        current_length += word_length
    
    if current_chunk:
        yield ' '.join(current_chunk)

# Test
text = "Python iteration is powerful and elegant. It allows you to write clean and efficient code."
for i, chunk in enumerate(chunk_text(text, 30), 1):
    print(f"Chunk {i}: {chunk}")

Exercise 2: Flatten Nested Iterables #

Write a generator that flattens arbitrarily nested lists and tuples.

def flatten(iterable):
    """
    Recursively flatten nested lists and tuples.
    """
    for item in iterable:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item

# Test
nested = [1, [2, 3, [4, 5]], 6, [7, [8, 9]]]
flat = list(flatten(nested))
print(flat)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
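The version above only descends into lists and tuples. One possible generalization, sketched here under the hypothetical name flatten_any, checks for any iterable but treats strings and bytes as atoms, since iterating a one-character string yields the same string again and would recurse forever.

```python
from collections.abc import Iterable

def flatten_any(iterable):
    """Flatten any nested iterables, leaving str and bytes values intact."""
    for item in iterable:
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten_any(item)
        else:
            yield item

mixed = [1, (2, {3}), 'ab', [range(4, 6)]]
flat_mixed = list(flatten_any(mixed))
print(flat_mixed)  # [1, 2, 3, 'ab', 4, 5]
```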

Exercise 3: Running Statistics Generator #

Create a generator that yields running statistics (mean, min, max) as it processes numbers.

def running_stats(numbers):
    """
    Generator that yields running statistics for a sequence of numbers.
    """
    total = 0
    count = 0
    min_val = float('inf')
    max_val = float('-inf')
    
    for num in numbers:
        total += num
        count += 1
        min_val = min(min_val, num)
        max_val = max(max_val, num)
        mean = total / count
        
        yield {
            'count': count,
            'mean': mean,
            'min': min_val,
            'max': max_val,
            'current': num
        }

# Test
numbers = [10, 5, 20, 15, 8]
for stats in running_stats(numbers):
    print(f"After {stats['count']} numbers: "
          f"mean={stats['mean']:.2f}, "
          f"min={stats['min']}, "
          f"max={stats['max']}")

Exercise 4: Pairwise Iterator #

Create a generator that yields consecutive pairs from an iterable.

def pairwise(iterable):
    """
    Generate consecutive pairs from an iterable.
    s -> (s0,s1), (s1,s2), (s2, s3), ...
    """
    iterator = iter(iterable)
    try:
        prev = next(iterator)
    except StopIteration:
        return
    
    for item in iterator:
        yield prev, item
        prev = item

# Test
numbers = [1, 2, 3, 4, 5]
for pair in pairwise(numbers):
    print(pair)
# Output: (1, 2), (2, 3), (3, 4), (4, 5)

# Using itertools (Python 3.10+)
from itertools import pairwise as itertools_pairwise
pairs = list(itertools_pairwise(numbers))
print(pairs)

Exercise 5: Custom groupby Implementation #

Implement a simplified version of itertools.groupby.

def simple_groupby(iterable, key=None):
    """
    Simplified groupby that groups consecutive elements by key.
    """
    if key is None:
        key = lambda x: x
    
    iterator = iter(iterable)
    try:
        current_item = next(iterator)
    except StopIteration:
        return
    
    current_key = key(current_item)
    group = [current_item]
    
    for item in iterator:
        item_key = key(item)
        
        if item_key == current_key:
            group.append(item)
        else:
            yield current_key, group
            current_key = item_key
            group = [item]
    
    yield current_key, group

# Test
data = ['a', 'a', 'b', 'b', 'b', 'c', 'a', 'a']
for key, group in simple_groupby(data):
    print(f"{key}: {group}")
# Output: a: ['a', 'a']
#         b: ['b', 'b', 'b']
#         c: ['c']
#         a: ['a', 'a']

Summary and Best Practices #

As you’ve learned throughout this guide, Python’s iteration capabilities are both powerful and elegant. Here are the key takeaways to remember:

When to Use Each Iteration Tool #

  1. for loops: Default choice for most iteration tasks
  2. enumerate(): When you need both index and value
  3. zip(): For parallel iteration over multiple sequences
  4. List comprehensions: For creating new lists with transformations
  5. Generator expressions: For memory-efficient transformations
  6. Generator functions: For complex iteration logic or infinite sequences
  7. itertools: For advanced iteration patterns and combinations

Performance Guidelines #

  • Use generators for large datasets or when memory is a concern
  • Avoid modifying sequences while iterating over them
  • Prefer comprehensions over manual append loops for better performance
  • Use itertools functions for complex iteration patterns instead of manual loops
  • Batch process large datasets to balance memory and performance
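Performance claims like these are easy to check on your own machine with the timeit module. The sketch below compares an append loop with a comprehension; the function names are illustrative, and no timings are quoted because results depend on the Python version and hardware.

```python
import timeit

def with_append(n):
    """Build a list of squares with an explicit append loop."""
    result = []
    for x in range(n):
        result.append(x ** 2)
    return result

def with_comprehension(n):
    """Build the same list with a list comprehension."""
    return [x ** 2 for x in range(n)]

# Run each version 200 times on 1000 elements and report total seconds
append_time = timeit.timeit(lambda: with_append(1000), number=200)
comp_time = timeit.timeit(lambda: with_comprehension(1000), number=200)

print(f"append loop:   {append_time:.4f}s")
print(f"comprehension: {comp_time:.4f}s")
```

Measuring before optimizing keeps the focus on loops that actually matter for your workload.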

Code Quality Tips #

  • Write clear, readable code - Python’s iteration tools enable expressive solutions
  • Choose the most appropriate tool for each task rather than forcing one approach
  • Use meaningful variable names in iteration (avoid single letters unless in mathematical contexts)
  • Add docstrings to custom iterators and generators explaining their behavior
  • Consider memory implications when choosing between lists and generators

Common Patterns to Remember #

# Iterate with index
for i, item in enumerate(items):
    pass

# Parallel iteration
for x, y in zip(list1, list2):
    pass

# Iterate in reverse
for item in reversed(items):
    pass

# Iterate over sorted items
for item in sorted(items):
    pass

# Filter and transform
result = [transform(x) for x in items if condition(x)]

# Generator for memory efficiency
gen = (transform(x) for x in large_dataset if condition(x))

Python’s iteration model is one of the language’s greatest strengths, enabling you to write code that is both efficient and readable. By mastering these concepts and patterns, you’ll be well-equipped to handle any iteration challenge that comes your way.

Remember: the best code is not just correct, but also clear, maintainable, and efficient. Python’s iteration tools help you achieve all three goals simultaneously.


Additional Resources #

Happy coding, and may your iterations always be efficient and elegant!

# Using the custom iterator
for num in CountDown(5):
    print(num)  # Prints: 5, 4, 3, 2, 1

# More complex example: Fibonacci iterator
class Fibonacci:
    """Iterator that generates Fibonacci numbers."""

    def __init__(self, max_count):
        self.max_count = max_count
        self.count = 0
        self.a, self.b = 0, 1

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.max_count:
            raise StopIteration

        result = self.a
        self.a, self.b = self.b, self.a + self.b
        self.count += 1
        return result

# Generate first 10 Fibonacci numbers
for num in Fibonacci(10):
    print(num)  # Prints: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34
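An iterator class that returns self from __iter__() can be consumed only once per instance. When an object should support repeated loops, one common alternative is to make __iter__() a generator method, so every loop receives a fresh iterator. A minimal sketch; CountDownIterable is a hypothetical name chosen for this illustration:

```python
class CountDownIterable:
    """Iterable that counts down from start; can be looped over repeatedly."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # A generator method: each call creates a new, independent iterator
        current = self.start
        while current > 0:
            yield current
            current -= 1

countdown = CountDownIterable(3)
first_pass = list(countdown)
second_pass = list(countdown)  # works again, unlike a one-shot iterator

print(first_pass, second_pass)  # [3, 2, 1] [3, 2, 1]
```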


Generator Functions: The Elegant Approach #

Generator functions use the yield keyword to produce values one at a time, making them much simpler to write than iterator classes.

def countdown(start):
    """Generator function that counts down from start."""
    while start > 0:
        yield start
        start -= 1

# Using the generator
for num in countdown(5):
    print(num)  # Prints: 5, 4, 3, 2, 1

# Fibonacci generator
def fibonacci(n):
    """Generate first n Fibonacci numbers."""
    a, b = 0, 1
    for _ in range(n):
        yield a
        a, b = b, a + b

# Using the generator
fib_numbers = list(fibonacci(10))
print(fib_numbers)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Infinite generator (use with caution!)
def count_forever(start=0, step=1):
    """Generate an infinite sequence of numbers."""
    current = start
    while True:
        yield current
        current += step

# Using with itertools.islice to get finite results
from itertools import islice
first_ten = list(islice(count_forever(), 10))
print(first_ten)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Generator with input processing
def process_lines(filename):
    """Generator that reads and processes lines from a file."""
    with open(filename, 'r') as f:
        for line in f:
            # Process each line without loading entire file into memory
            yield line.strip().upper()

# Prime number generator
def primes():
    """Generate an infinite sequence of prime numbers."""
    def is_prime(n):
        if n < 2:
            return False
        for i in range(2, int(n ** 0.5) + 1):
            if n % i == 0:
                return False
        return True
    
    n = 2
    while True:
        if is_prime(n):
            yield n
        n += 1

# Get first 10 primes
first_ten_primes = list(islice(primes(), 10))
print(first_ten_primes)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

Generator Expressions vs Generator Functions #

Both create generators, but they have different use cases.

# Generator expression: concise, single-line
squares = (x ** 2 for x in range(10))

# Generator function: more control, complex logic
def squares_gen(n):
    for x in range(n):
        print(f"Generating square of {x}")
        yield x ** 2

# Generator functions can have setup and teardown code
def file_reader(filename):
    print(f"Opening {filename}")
    with open(filename) as f:
        for line in f:
            yield line.strip()
    print(f"Closing {filename}")
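One caveat with teardown code placed after the loop: it runs only if the generator is exhausted. If the consumer stops early, wrapping the body in try/finally lets the cleanup run when the generator is closed. The sketch below uses an in-memory list in place of a file; resource_reader and events are illustrative names.

```python
events = []

def resource_reader(items):
    """Yield items, recording setup and teardown in the events list."""
    events.append('open')        # setup runs on the first next() call
    try:
        for item in items:
            yield item
    finally:
        events.append('close')   # runs on exhaustion or on close()

gen = resource_reader([1, 2, 3])
first = next(gen)   # starts the generator and yields 1
gen.close()         # consumer stops early; the finally block still runs

print(first, events)  # 1 ['open', 'close']
```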

The itertools Module #

The itertools module provides a collection of fast, memory-efficient tools for working with iterators.

Infinite Iterators #

from itertools import count, cycle, repeat

# count: infinite counter
for i in count(10, 2):
    if i > 20:
        break
    print(i)  # Prints: 10, 12, 14, 16, 18, 20

# cycle: cycle through an iterable infinitely
colors = ['red', 'green', 'blue']
color_cycle = cycle(colors)
for i, color in enumerate(color_cycle):
    if i >= 7:
        break
    print(color)  # Prints: red, green, blue, red, green, blue, red

# repeat: repeat an object
for item in repeat('Hello', 3):
    print(item)  # Prints: Hello, Hello, Hello

Combinatoric Iterators #

from itertools import (
    combinations, combinations_with_replacement,
    permutations, product
)

items = ['A', 'B', 'C']

# combinations: all r-length combinations
for combo in combinations(items, 2):
    print(combo)
# Output: ('A', 'B'), ('A', 'C'), ('B', 'C')

# combinations_with_replacement: combinations allowing repeated elements
for combo in combinations_with_replacement(items, 2):
    print(combo)
# Output: ('A', 'A'), ('A', 'B'), ('A', 'C'), ('B', 'B'), ('B', 'C'), ('C', 'C')

# permutations: all r-length permutations
for perm in permutations(items, 2):
    print(perm)
# Output: ('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), ('C', 'B')

# product: cartesian product
for pair in product(['A', 'B'], [1, 2]):
    print(pair)
# Output: ('A', 1), ('A', 2), ('B', 1), ('B', 2)

# product with repeat
for triplet in product([0, 1], repeat=3):
    print(triplet)
# Output: All binary triplets from (0,0,0) to (1,1,1)
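Combinatoric iterators can produce very large outputs, so it helps to know how many results to expect before looping. The sketch below checks the counts for the three-item example against the standard counting formulas, using math.comb and math.perm (available since Python 3.8).

```python
from itertools import (
    combinations, combinations_with_replacement,
    permutations, product
)
from math import comb, perm

items = ['A', 'B', 'C']
n, r = len(items), 2

# Count the results each iterator actually produces
counts = {
    'combinations': len(list(combinations(items, r))),
    'combinations_with_replacement': len(list(combinations_with_replacement(items, r))),
    'permutations': len(list(permutations(items, r))),
    'product': len(list(product(items, repeat=r))),
}

# Closed-form counts for n items taken r at a time
expected = {
    'combinations': comb(n, r),                           # 3
    'combinations_with_replacement': comb(n + r - 1, r),  # 6
    'permutations': perm(n, r),                           # 6
    'product': n ** r,                                    # 9
}

print(counts == expected)  # True
```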

Filtering and Grouping Iterators #

from itertools import (
    chain, compress, dropwhile, filterfalse,
    groupby, islice, takewhile, tee
)

# chain: combine multiple iterables
list1 = [1, 2, 3]
list2 = [4, 5, 6]
for item in chain(list1, list2):
    print(item)  # Prints: 1, 2, 3, 4, 5, 6

# compress: filter based on boolean selectors
data = ['A', 'B', 'C', 'D']
selectors = [True, False, True, False]
result = list(compress(data, selectors))
print(result)  # ['A', 'C']

# dropwhile: drop elements while predicate is true
numbers = [1, 3, 5, 2, 4, 6]
result = list(dropwhile(lambda x: x % 2 == 1, numbers))
print(result)  # [2, 4, 6]

# takewhile: take elements while predicate is true
result = list(takewhile(lambda x: x % 2 == 1, numbers))
print(result)  # [1, 3, 5]

# filterfalse: opposite of filter
numbers = [1, 2, 3, 4, 5, 6]
odd_numbers = list(filterfalse(lambda x: x % 2 == 0, numbers))
print(odd_numbers)  # [1, 3, 5]

# islice: slice an iterator
numbers = range(100)
subset = list(islice(numbers, 5, 15, 2))
print(subset)  # [5, 7, 9, 11, 13]

# groupby: group consecutive elements by key
data = [('A', 1), ('A', 2), ('B', 3), ('B', 4), ('C', 5)]
for key, group in groupby(data, key=lambda x: x[0]):
    print(f"{key}: {list(group)}")
# Output: A: [('A', 1), ('A', 2)], B: [('B', 3), ('B', 4)], C: [('C', 5)]

# tee: create independent iterators from one
iterator = iter([1, 2, 3])
iter1, iter2 = tee(iterator, 2)
print(list(iter1))  # [1, 2, 3]
print(list(iter2))  # [1, 2, 3]
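Since groupby only groups consecutive elements, unsorted input produces repeated keys. Sorting by the same key first gives one group per key. A short sketch with made-up words:

```python
from itertools import groupby

words = ['apple', 'bob', 'avocado', 'banana', 'cherry']

def first_letter(word):
    return word[0]

# Unsorted input: the same key shows up more than once
unsorted_keys = [key for key, _ in groupby(words, key=first_letter)]

# Sorting by the grouping key first yields one group per key
grouped = {
    key: list(group)
    for key, group in groupby(sorted(words, key=first_letter), key=first_letter)
}

print(unsorted_keys)  # ['a', 'b', 'a', 'b', 'c']
print(grouped)
# {'a': ['apple', 'avocado'], 'b': ['bob', 'banana'], 'c': ['cherry']}
```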

Performance Optimization and Best Practices #

Understanding performance implications helps you write efficient iteration code.

Memory Efficiency: Generators vs Lists #

import sys

# Lists consume memory proportional to their size
large_list = [x ** 2 for x in range(1000000)]
print(f"List size: {sys.getsizeof(large_list)} bytes")

# Generators consume constant memory
large_gen = (x ** 2 for x in range(1000000))
print(f"Generator size: {sys.getsizeof(large_gen)} bytes")

# Example: Processing large files
# Bad: loads entire file into memory
def read_file_bad(filename):
    with open(filename) as f:
        lines = f.readlines()  # All lines in memory
    return [line.strip() for line in lines]

# Good: processes line by line
def read_file_good(filename):
    with open(filename) as f:
        for line in f:  # One line at a time
            yield line.strip()

# Example: Chaining operations efficiently
# Bad: creates intermediate lists
numbers = range(1000000)
squares = [x ** 2 for x in numbers]
even_squares = [x for x in squares if x % 2 == 0]
result = sum(even_squares)

# Good: uses generators throughout
numbers = range(1000000)
squares = (x ** 2 for x in numbers)
even_squares = (x for x in squares if x % 2 == 0)
result = sum(even_squares)
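
To see the difference between the two pipelines in numbers, one option is the standard library's tracemalloc module, which tracks memory allocated by Python code. The following is a minimal sketch, not a benchmark: the helper `peak_bytes` and the smaller input size `N` are assumptions chosen for illustration, and the exact byte counts will vary by Python version and platform.

```python
import tracemalloc

def peak_bytes(func):
    """Run a zero-argument callable and return (result, peak traced bytes)."""
    tracemalloc.start()
    try:
        result = func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak

N = 100_000  # smaller than the examples above so the sketch runs quickly

def with_lists():
    squares = [x ** 2 for x in range(N)]                # full intermediate list
    even_squares = [x for x in squares if x % 2 == 0]   # second full list
    return sum(even_squares)

def with_generators():
    squares = (x ** 2 for x in range(N))                # lazy
    even_squares = (x for x in squares if x % 2 == 0)   # lazy
    return sum(even_squares)

list_total, list_peak = peak_bytes(with_lists)
gen_total, gen_peak = peak_bytes(with_generators)

print(f"Same result: {list_total == gen_total}")
print(f"List pipeline peak:      {list_peak} bytes")
print(f"Generator pipeline peak: {gen_peak} bytes")
```

Both versions compute the same sum; only the peak memory differs, because the generator version never holds more than one value per stage at a time.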

Choosing the Right Iteration Tool #

# Use enumerate when you need indices
items = ['apple', 'banana', 'cherry']
for index, item in enumerate(items):
    print(f"{index}: {item}")

# Use zip for parallel iteration
names = ['Alice', 'Bob']
ages = [25, 30]
for name, age in zip(names, ages):
    print(f"{name}: {age}")

# Use map for simple transformations
numbers = [1, 2, 3, 4]
squared = list(map(lambda x: x ** 2, numbers))

# Use filter for simple filtering
evens = list(filter(lambda x: x % 2 == 0, numbers))

# Use comprehensions for complex transformations
result = [x ** 2 for x in numbers if x % 2 == 0]

# Use itertools for advanced iteration patterns
from itertools import accumulate
cumulative = list(accumulate(numbers))  # [1, 3, 6, 10]
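
One caveat when choosing zip: it stops at the shortest input without raising an error, which can silently drop data. A small sketch, using a hypothetical third name with no matching age, shows the behavior and one alternative from itertools:

```python
from itertools import zip_longest

names = ['Alice', 'Bob', 'Charlie']
ages = [25, 30]  # deliberately one item short

# zip stops as soon as the shortest input runs out
truncated = list(zip(names, ages))

# zip_longest keeps going and pads the missing values
padded = list(zip_longest(names, ages, fillvalue=None))

print(truncated)  # [('Alice', 25), ('Bob', 30)]
print(padded)     # [('Alice', 25), ('Bob', 30), ('Charlie', None)]
```

If mismatched lengths should be treated as a bug rather than padded, Python 3.10 and later also accept `zip(names, ages, strict=True)`, which raises ValueError instead.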

Loop Optimization Techniques #

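# Avoid recomputing loop-invariant values
# Note: range(len(my_list)) evaluates len() only once, before the loop starts,
# so hoisting it into a variable gains nothing. A while loop is different.
my_list = [1, 2, 3]

# Bad: len(my_list) is re-evaluated on every pass through the condition
doubled = []
i = 0
while i < len(my_list):
    doubled.append(my_list[i] * 2)
    i += 1

# Good: iterate over the elements directly; no index or len() call needed
doubled = []
for value in my_list:
    doubled.append(value * 2)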

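# Use local variables for frequently accessed values
# (a micro-optimization: measure before relying on it)
import math

items = [1.0, 4.0, 9.0]

# Bad
result = []
for item in items:
    result.append(math.sqrt(item))  # Looks up math.sqrt on every iteration

# Good
result = []
sqrt = math.sqrt  # Look up the attribute once
for item in items:
    result.append(sqrt(item))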

# Prefer list comprehensions over append loops
# Bad
squares = []
for x in range(100):
    squares.append(x ** 2)

# Good
squares = [x ** 2 for x in range(100)]
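
Claims like these are best checked on your own machine, since the gap depends on the Python version and the work done per item. The sketch below uses the standard timeit module; the function names and the repetition count are arbitrary choices for illustration, and no particular timing is promised.

```python
import timeit

def with_append():
    squares = []
    for x in range(1000):
        squares.append(x ** 2)
    return squares

def with_comprehension():
    return [x ** 2 for x in range(1000)]

# Run each version 200 times and report the total elapsed seconds
append_time = timeit.timeit(with_append, number=200)
comprehension_time = timeit.timeit(with_comprehension, number=200)

print(f"append loop:   {append_time:.4f}s")
print(f"comprehension: {comprehension_time:.4f}s")
```

Both functions return the same list, so the choice between them is about speed and readability, not results.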

Common Pitfalls and How to Avoid Them #

Learn from common mistakes to write more robust iteration code.

Don’t Modify Sequences While Iterating #

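# WRONG: Modifying a list while iterating over it
numbers = [1, 2, 3, 4, 5, 6]
for num in numbers:
    if num % 2 == 0:
        numbers.remove(num)  # Shifts later items left, so the loop skips the next element
# This input happens to end as [1, 3, 5] only because every skipped element is odd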

# Correct: Create a new list
numbers = [1, 2, 3, 4, 5, 6]
numbers = [num for num in numbers if num % 2 != 0]

# Or: Iterate over a copy
numbers = [1, 2, 3, 4, 5, 6]
for num in numbers[:]:  # Iterate over a copy
    if num % 2 == 0:
        numbers.remove(num)

# For dictionaries: iterate over a copy of the keys
# (deleting while iterating over the dict itself raises RuntimeError)
person = {'name': 'Alice', 'age': 30, 'city': 'Beijing'}
for key in list(person.keys()):  # Create a list copy
    if key.startswith('c'):
        del person[key]
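
The first list example in this section hides the problem, because the elements that get skipped are all odd. A sketch with two even numbers side by side (sample values chosen for illustration) makes the failure visible, and the second half shows what a dictionary does in the same situation:

```python
# Two even numbers in a row: 2 and 4
numbers = [1, 2, 4, 5, 6]
for num in numbers:
    if num % 2 == 0:
        numbers.remove(num)  # after removing 2, the 4 slides into its slot and is skipped

print(numbers)  # [1, 4, 5]  <- 4 survived

# Dictionaries refuse outright instead of skipping silently
person = {'city': 'Beijing', 'name': 'Alice'}
error_message = None
try:
    for key in person:
        if key.startswith('c'):
            del person[key]
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)  # dictionary changed size during iteration
```

The list case is the more dangerous of the two, since it produces a wrong answer without any error.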

Exhausted Generators #

# Generators can only be iterated once
gen = (x ** 2 for x in range(5))
print(list(gen))  # [0, 1, 4, 9, 16]
print(list(gen))  # [] - Generator is exhausted!

# Solution: recreate the generator or convert to a list
def make_gen():
    return (x ** 2 for x in range(5))

gen1 = make_gen()
print(list(gen1))  # [0, 1, 4, 9, 16]
gen2 = make_gen()
print(list(gen2))  # [0, 1, 4, 9, 16]

# Or convert to list if you need multiple iterations
gen = (x ** 2 for x in range(5))
values = list(gen)
print(values)  # [0, 1, 4, 9, 16]
print(values)  # [0, 1, 4, 9, 16] - List can be reused
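
A third option keeps the laziness and still allows repeated iteration: wrap the generator in a class whose `__iter__` returns a fresh generator each time. This is a minimal sketch; the class name `Squares` is hypothetical and not part of the examples above.

```python
class Squares:
    """Re-iterable container: every for loop gets its own new generator."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # A new generator is created on each call, so nothing stays exhausted
        return (x ** 2 for x in range(self.n))

squares = Squares(5)
first_pass = list(squares)
second_pass = list(squares)

print(first_pass)   # [0, 1, 4, 9, 16]
print(second_pass)  # [0, 1, 4, 9, 16]
```

The trade-off is that the values are recomputed on every pass, which is fine for cheap calculations but wasteful for expensive ones.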

Avoiding Unnecessary List Creation #

# Bad: Creating intermediate lists
# (process() is a placeholder for your own per-item function)
numbers = range(1000000)
result = list(map(str, numbers))  # Creates large list
for num_str in result:
    process(num_str)

# Good: Use generator or iterator directly
numbers = range(1000000)
for num_str in map(str, numbers):
    process(num_str)

# Bad: Using list() when not needed
total = sum(list(x ** 2 for x in range(1000)))

# Good: sum() accepts generators
total = sum(x ** 2 for x in range(1000))
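
Skipping the list also lets short-circuiting functions such as any() and next() stop early. The sketch below records which values were actually pulled from the source; the helper `checked` and the list `seen` exist only to make that visible and are assumptions for illustration.

```python
def checked(values, log):
    """Yield each value unchanged, recording it in log as it is consumed."""
    for value in values:
        log.append(value)
        yield value

seen = []
# any() stops at the first True result, so most of the range is never touched
found = any(value > 3 for value in checked(range(1_000_000), seen))

print(found)  # True
print(seen)   # [0, 1, 2, 3, 4]

# next() with a default returns the first match, or the default if there is none
first_big = next((value for value in range(1_000_000) if value > 3), None)
print(first_big)  # 4
```

Wrapping the same expressions in list(...) would force all one million values to be produced before any() or next() saw the first one.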

Dictionary Iteration Best Practices #

# Iterating over keys (default behavior)
scores = {'Alice': 85, 'Bob': 92, 'Charlie': 78}
for name in scores:
    print(name)  # Just the keys

# Iterating over values
for score in scores.values():
    print(score)

# Iterating over key-value pairs
for name, score in scores.items():
    print(f"{name}: {score}")

# Getting default values during iteration
for name in ['Alice', 'David']:
    score = scores.get(name, 0)  # 0 if name not found
    print(f"{name}: {score}")
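
Dictionaries iterate in insertion order (guaranteed since Python 3.7), so when a different order is needed, sort the items rather than the dictionary itself. A short sketch using the same sample scores, with `ranked` and `by_name` as illustrative variable names:

```python
scores = {'Alice': 85, 'Bob': 92, 'Charlie': 78}

# Highest score first: sort the (name, score) pairs by the score
ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score}")

# Alphabetical by key: sorted() over the dict iterates its keys
by_name = [(name, scores[name]) for name in sorted(scores)]

print(ranked)   # [('Bob', 92), ('Alice', 85), ('Charlie', 78)]
print(by_name)  # [('Alice', 85), ('Bob', 92), ('Charlie', 78)]
```

sorted() returns a new list and leaves `scores` untouched, so it is safe to use while other code still relies on the original order.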

Practical Examples and Exercises #

Let’s apply what we’ve learned to real-world scenarios.

Example 1: Custom Date Range Iterator #

from datetime import date, timedelta

class DateRange:
    """Iterator for date ranges."""
    
    def __init__(self, start_date, end_date):
        self.current = start_date
        self.end_date = end_date
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.current > self.end_date:
            raise StopIteration
        
        result = self.current
        self.current