Python Iterators

Learn about iterators in Python and how to create your own iterable objects.

An iterator is an object that represents a stream of values and hands them out one at a time.

An iterator is an object that can be iterated upon, meaning that you can traverse through all the values.

Technically, in Python, an iterator is an object that implements the iterator protocol, which consists of the methods __iter__() and __next__().

Iterator vs Iterable

Lists, tuples, dictionaries, and sets are all iterable objects. They are iterable containers which you can get an iterator from.

All these objects have an __iter__() method, which the built-in iter() function calls to get an iterator:

Example - Return an iterator from a tuple, and print each value:

mytuple = ("apple", "banana", "cherry")
myit = iter(mytuple)

print(next(myit))
print(next(myit))
print(next(myit))

Even strings are iterable objects, and can return an iterator:

Example - Strings are also iterable objects, containing a sequence of characters:

mystr = "banana"
myit = iter(mystr)

print(next(myit))
print(next(myit))
print(next(myit))
print(next(myit))
print(next(myit))
print(next(myit))
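
The difference between an iterable and an iterator can be checked directly. The following is a minimal sketch (the variable names are illustrative and not part of the examples above). It suggests that a tuple is iterable without being an iterator, that each iter() call returns an independent iterator, and that next() accepts an optional default to return instead of raising StopIteration at the end:

```python
from collections.abc import Iterable, Iterator

fruits = ("apple", "banana", "cherry")

# A tuple is iterable, but it is not itself an iterator
tuple_is_iterable = isinstance(fruits, Iterable)
tuple_is_iterator = isinstance(fruits, Iterator)

# Each call to iter() returns a new, independent iterator
first = iter(fruits)
second = iter(fruits)
next(first)                          # advances only `first`
second_value = next(second)          # `second` still starts at the beginning

# Calling iter() on an iterator returns the same object
same_object = iter(first) is first

# Drain `first`, then ask for one more value with a default
remaining = list(first)
after_end = next(first, "no more")   # default is returned, no StopIteration

print(tuple_is_iterable, tuple_is_iterator, second_value, same_object)
print(remaining, after_end)
```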

Looping Through an Iterator

We can also use a for loop to iterate through an iterable object:

Example - Iterate the values of a tuple:

mytuple = ("apple", "banana", "cherry")

for x in mytuple:
    print(x)

Example - Iterate the characters of a string:

mystr = "banana"

for x in mystr:
    print(x)

Behind the scenes, the for loop calls iter() on the object to get an iterator, then calls next() on it before each pass until StopIteration is raised.
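
As a rough illustration of what the for loop does, the tuple example can be written by hand with a while loop. This is a sketch of the equivalent steps, not the interpreter's actual implementation; the `collected` list is only there to record the values:

```python
mytuple = ("apple", "banana", "cherry")

collected = []
myit = iter(mytuple)          # the for loop calls iter() once, up front
while True:
    try:
        x = next(myit)        # then next() before each pass
    except StopIteration:
        break                 # raised when the values run out; the loop ends
    collected.append(x)       # loop body
    print(x)
```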

Create an Iterator

To make an object/class work as an iterator, you have to implement the methods __iter__() and __next__() in your class.

As you have learned in the Python Classes/Objects chapter, all classes have a function called __init__(), which allows you to do some initializing when the object is being created.

The __iter__() method works similarly: you can do operations (initializing etc.), but it must always return the iterator object itself.

The __next__() method also allows you to do operations, and must return the next item in the sequence.

Example - Create an iterator that returns numbers starting at 1, with each value one greater than the last (1, 2, 3, 4, 5, and so on):

class MyNumbers:
    def __iter__(self):
        self.a = 1
        return self

    def __next__(self):
        x = self.a
        self.a += 1
        return x

myclass = MyNumbers()
myiter = iter(myclass)

print(next(myiter))
print(next(myiter))
print(next(myiter))
print(next(myiter))
print(next(myiter))

StopIteration

The example above would continue forever if you made enough next() calls, or if it was used in a for loop.

To prevent the iteration from going on forever, we can raise the StopIteration exception.

In the __next__() method, we can add a terminating condition that raises StopIteration once the iteration has run a specified number of times:

Example - Stop after 20 iterations:

class MyNumbers:
    def __iter__(self):
        self.a = 1
        return self

    def __next__(self):
        if self.a <= 20:
            x = self.a
            self.a += 1
            return x
        else:
            raise StopIteration

myclass = MyNumbers()
myiter = iter(myclass)

for x in myiter:
    print(x)
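
In the class above, the object is its own iterator, so two loops over the same object share one position. One common alternative is to keep the iterable and the iterator in separate classes, so that every iter() call starts a fresh pass. The sketch below assumes two hypothetical classes, NumberRange and NumberRangeIterator, which are not part of the examples above:

```python
class NumberRange:
    """Iterable that hands out a fresh iterator on every iter() call."""

    def __init__(self, limit):
        self.limit = limit

    def __iter__(self):
        return NumberRangeIterator(self.limit)


class NumberRangeIterator:
    """Iterator that holds the position for one pass over the numbers."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 1

    def __iter__(self):
        return self

    def __next__(self):
        if self.current > self.limit:
            raise StopIteration
        value = self.current
        self.current += 1
        return value


numbers = NumberRange(3)

# Nested loops over the same object do not interfere with each other
pairs = [(a, b) for a in numbers for b in numbers]
print(pairs)

# The object can also be iterated more than once
first_pass = list(numbers)
second_pass = list(numbers)
print(first_pass, second_pass)
```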

Practical Iterator Examples

Fibonacci Iterator

class Fibonacci:
    def __init__(self, max_count):
        self.max_count = max_count
        self.count = 0
        self.current = 0
        self.next_val = 1
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.count < self.max_count:
            result = self.current
            self.current, self.next_val = self.next_val, self.current + self.next_val
            self.count += 1
            return result
        else:
            raise StopIteration

# Usage
fib = Fibonacci(10)
for num in fib:
    print(num, end=" ")
print()  # 0 1 1 2 3 5 8 13 21 34

Even Numbers Iterator

class EvenNumbers:
    def __init__(self, start=0, end=100):
        self.start = start if start % 2 == 0 else start + 1
        self.end = end
        self.current = self.start
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.current <= self.end:
            result = self.current
            self.current += 2
            return result
        else:
            raise StopIteration

# Usage
even_nums = EvenNumbers(10, 20)
for num in even_nums:
    print(num, end=" ")
print()  # 10 12 14 16 18 20

File Line Iterator

class FileLineIterator:
    def __init__(self, filename):
        self.filename = filename
        self.file = None
    
    def __iter__(self):
        self.file = open(self.filename, 'r')
        return self
    
    def __next__(self):
        if self.file:
            line = self.file.readline()
            if line:
                return line.strip()
            else:
                self.file.close()
                self.file = None  # later next() calls raise StopIteration, not ValueError
                raise StopIteration
        else:
            raise StopIteration

# Usage (assuming you have a file called 'data.txt')
# file_iter = FileLineIterator('data.txt')
# for line in file_iter:
#     print(line)

Built-in Iterator Functions

enumerate()

Returns an iterator that produces (index, value) tuples, counting from 0 unless a start value is given:

fruits = ['apple', 'banana', 'cherry']

# Using enumerate
for index, fruit in enumerate(fruits):
    print(f"{index}: {fruit}")

# Starting from a different number
for index, fruit in enumerate(fruits, start=1):
    print(f"{index}: {fruit}")

zip()

Returns an iterator of tuples, where each tuple holds the items at the same position in each iterable:

names = ['Alice', 'Bob', 'Charlie']
ages = [25, 30, 35]
cities = ['New York', 'London', 'Tokyo']

# Zip multiple lists
for name, age, city in zip(names, ages, cities):
    print(f"{name} is {age} years old and lives in {city}")

# Zip with different lengths (stops at shortest)
numbers = [1, 2, 3, 4, 5]
letters = ['a', 'b', 'c']
for num, letter in zip(numbers, letters):
    print(f"{num}: {letter}")  # Only prints 3 pairs

reversed()

Returns an iterator that accesses the sequence in reverse order:

numbers = [1, 2, 3, 4, 5]

# Reverse iteration
for num in reversed(numbers):
    print(num, end=" ")
print()  # 5 4 3 2 1

# Works with strings too
for char in reversed("hello"):
    print(char, end="")
print()  # olleh
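
reversed() needs either a sequence (an object with __len__() and __getitem__()) or an object that defines __reversed__(). A one-way iterator such as a generator cannot be reversed. The sketch below uses a hypothetical Countdown class to illustrate both points:

```python
class Countdown:
    """Counts from start down to 1 and supports reversed()."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))

    def __reversed__(self):
        # Called by reversed(); yields the values in the opposite order
        return iter(range(1, self.start + 1))


forward = list(Countdown(3))
backward = list(reversed(Countdown(3)))
print(forward, backward)

# A generator has no __reversed__() and is not a sequence
try:
    reversed(x for x in range(3))
    generator_reversible = True
except TypeError:
    generator_reversible = False
print(generator_reversible)
```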

Generator Functions

A generator function is a simpler way to create an iterator: it uses the yield keyword, and Python supplies __iter__() and __next__() automatically:

Example - Simple generator function:

def count_up_to(max_count):
    count = 1
    while count <= max_count:
        yield count
        count += 1

# Usage
counter = count_up_to(5)
for num in counter:
    print(num, end=" ")
print()  # 1 2 3 4 5
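
To see how a generator pauses and resumes, it can be stepped through by hand with next(). This sketch repeats the count_up_to function from above so that it runs on its own; the other variable names are illustrative:

```python
def count_up_to(max_count):
    count = 1
    while count <= max_count:
        yield count      # pause here and hand `count` to the caller
        count += 1       # execution resumes here on the next next() call


counter = count_up_to(2)

# A generator object is its own iterator
is_own_iterator = iter(counter) is counter

first = next(counter)
second = next(counter)

# When the function body finishes, StopIteration is raised automatically
try:
    next(counter)
    finished = False
except StopIteration:
    finished = True

print(is_own_iterator, first, second, finished)
```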

Example - Fibonacci generator:

def fibonacci_generator(max_count):
    a, b = 0, 1
    count = 0
    while count < max_count:
        yield a
        a, b = b, a + b
        count += 1

# Usage
fib_gen = fibonacci_generator(10)
for num in fib_gen:
    print(num, end=" ")
print()  # 0 1 1 2 3 5 8 13 21 34

Example - Reading large files efficiently:

def read_large_file(filename):
    """Generator to read large files line by line without loading entire file into memory"""
    with open(filename, 'r') as file:
        for line in file:
            yield line.strip()

# Usage
# for line in read_large_file('large_file.txt'):
#     process_line(line)  # Process one line at a time
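
A runnable variant of the idea above is sketched below. It assumes a throwaway temporary file in place of 'large_file.txt', and the read_lines name is hypothetical. It strips only the trailing newline, so leading whitespace in each line is preserved:

```python
import os
import tempfile


def read_lines(filename):
    """Yield the lines of a file one at a time, without the trailing newline."""
    with open(filename, 'r') as file:
        for line in file:
            yield line.rstrip('\n')


# Create a small temporary file so the example can run anywhere
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("first\n  second\nthird\n")
    path = tmp.name

try:
    lines = list(read_lines(path))   # consuming the generator also closes the file
finally:
    os.remove(path)

print(lines)
```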

Generator Expressions

Generator expressions provide a concise way to create generators:

# Generator expression for squares
squares = (x**2 for x in range(10))
for square in squares:
    print(square, end=" ")
print()  # 0 1 4 9 16 25 36 49 64 81

# Generator expression with condition
even_squares = (x**2 for x in range(10) if x % 2 == 0)
for square in even_squares:
    print(square, end=" ")
print()  # 0 4 16 36 64

# Memory efficient processing
def process_numbers(numbers):
    return (x * 2 for x in numbers if x > 5)

large_numbers = range(1000000)
processed = process_numbers(large_numbers)
# Only processes numbers as needed, not all at once
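
Generator expressions are often passed straight to a function that consumes an iterable, and like any iterator they can be consumed only once. A short sketch with illustrative variable names:

```python
# Passed directly as the only argument, no extra parentheses are needed
total = sum(x**2 for x in range(10))

# any() stops pulling values as soon as one is true
has_big = any(x > 5 for x in range(10))

# A generator expression is exhausted after one pass
squares = (x**2 for x in range(4))
first_pass = list(squares)
second_pass = list(squares)   # nothing left to produce

print(total, has_big, first_pass, second_pass)
```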

Iterator Tools

The itertools module provides many useful iterator functions:

import itertools

# count() - infinite counting
counter = itertools.count(10, 2)  # Start at 10, step by 2
for i, num in enumerate(counter):
    if i >= 5:
        break
    print(num, end=" ")
print()  # 10 12 14 16 18

# cycle() - infinite cycling through sequence
colors = itertools.cycle(['red', 'green', 'blue'])
for i, color in enumerate(colors):
    if i >= 7:
        break
    print(color, end=" ")
print()  # red green blue red green blue red

# repeat() - repeat a value
repeated = itertools.repeat('hello', 3)
for item in repeated:
    print(item, end=" ")
print()  # hello hello hello

# chain() - chain multiple iterables
list1 = [1, 2, 3]
list2 = [4, 5, 6]
list3 = [7, 8, 9]
chained = itertools.chain(list1, list2, list3)
for item in chained:
    print(item, end=" ")
print()  # 1 2 3 4 5 6 7 8 9

# combinations() - generate combinations
items = ['A', 'B', 'C', 'D']
for combo in itertools.combinations(items, 2):
    print(combo, end=" ")
print()  # ('A', 'B') ('A', 'C') ('A', 'D') ('B', 'C') ('B', 'D') ('C', 'D')
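
As an alternative to pairing enumerate() with break, itertools.islice() can take a fixed number of items from an infinite iterator. A minimal sketch, reusing the count() and cycle() arguments from above:

```python
from itertools import count, cycle, islice

# Take the first 5 values from an infinite counter
first_five = list(islice(count(10, 2), 5))
print(first_five)

# Take the first 7 values from an infinite cycle
seven_colors = list(islice(cycle(['red', 'green', 'blue']), 7))
print(seven_colors)
```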

Performance Benefits

Iterators and generators are lazy: they produce each value only when it is requested, so they avoid holding the whole sequence in memory at once:

Example - Memory comparison:

import sys

# List comprehension - creates entire list in memory
list_comp = [x**2 for x in range(1000000)]
print(f"List size: {sys.getsizeof(list_comp)} bytes")

# Generator expression - creates iterator, not full list
gen_exp = (x**2 for x in range(1000000))
print(f"Generator size: {sys.getsizeof(gen_exp)} bytes")

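# The generator object stays small no matter how large the range is.
# Note: sys.getsizeof() counts only the list's own pointer array, not the
# integer objects it refers to, so the list's real footprint is even larger.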

# Processing large datasets efficiently
def process_large_dataset():
    # Instead of loading all data into memory
    # data = load_all_data()  # Could use lots of memory
    # for item in data:
    #     process(item)
    
    # Use generator to process one item at a time
    for item in data_generator():
        yield process_item(item)

def data_generator():
    # Simulate reading data one piece at a time
    for i in range(1000000):
        yield f"data_item_{i}"

def process_item(item):
    return item.upper()

# Usage
processed_data = process_large_dataset()
# Only processes items as needed, not all at once
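
Lazy evaluation can be observed by recording which items are actually produced. The sketch below is a simplified version of the pipeline above, with a hypothetical `produced` list added purely for observation. It requests three results from a million-item pipeline, and only three items are generated:

```python
from itertools import islice

produced = []   # records every item the source generator actually creates


def data_generator(limit):
    for i in range(limit):
        produced.append(i)
        yield f"data_item_{i}"


def process_items(items):
    for item in items:
        yield item.upper()


pipeline = process_items(data_generator(1_000_000))

# Nothing has run yet; work starts only when values are requested
produced_before = len(produced)

first_three = list(islice(pipeline, 3))
print(first_three)
print(produced_before, len(produced))
```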