Generators are a special kind of function that returns a lazy iterator: an object that you can loop over like a list. Unlike lists, lazy iterators do not hold all of their contents in memory; they produce each value only when it is requested.
Before diving into generators, we first need to elaborate a bit on the concept of iterators and the yield keyword.
Iterators
An iterator is an object that can be iterated upon, meaning that we can traverse through all its values.
Technically, in Python, an iterator is an object that implements the iterator protocol, which consists of the methods:
- __next__()
- __iter__()
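To make the protocol concrete, here is a minimal sketch of a hand-written iterator. The class name Countdown and its start parameter are hypothetical, chosen only for illustration.

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        # Signal the end of iteration once the values run out.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown_values = list(Countdown(3))
print(countdown_values)  # [3, 2, 1]
```

Because the class implements both methods, it works anywhere Python expects an iterator, including for loops and list().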
Iterable objects, such as lists, sets, tuples, strings, and dictionaries, can provide us with an iterator through the built-in iter() function, and we can then traverse their values one at a time with the built-in next() function.
some_string = "hello"
string_iterator = iter(some_string)

some_list = [1, 2, 3, 4, 5]
list_iterator = iter(some_list)

# We can call next() to get the next value
# in the container
print(next(string_iterator))
print(next(list_iterator))
If we call next() after the iterator has reached the end of the container, a StopIteration exception is raised.
Note also that this is what a for loop does behind the scenes: it creates an iterator object from the iterable and calls next() on it for each iteration, stopping when StopIteration is raised.
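As a rough sketch of that behavior, a for loop over a list can be written by hand with iter(), next(), and a try/except around StopIteration. The variable names here are illustrative only.

```python
some_list = [1, 2, 3]

# Roughly what `for value in some_list:` does behind the scenes.
collected = []
list_iterator = iter(some_list)
while True:
    try:
        value = next(list_iterator)
    except StopIteration:
        # The iterator is exhausted, so the loop ends.
        break
    collected.append(value)

print(collected)  # [1, 2, 3]
```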
Yield keyword
The yield keyword is used only in the body of a function, and its presence is what makes that function a generator function. It pauses the execution of the function and sends a value back to the caller, while preserving the function's state so that execution can resume from the point where it left off.
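def some_generator():
    yield 1
    yield 2
    yield 3

# Simple for-each loop
for value in some_generator():
    print(value)

# Iterating with next(): keep a reference to the generator object,
# because every call to some_generator() creates a fresh one
generator = some_generator()
print(next(generator))  # 1
print(next(generator))  # 2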
The code above produces its values one at a time, as they are requested, instead of computing everything at once like a list.
Overall, we can think of yield as a smart return that preserves the function's state: it remembers where it stopped the last time and continues from there.
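A small sketch can show this state being preserved between calls. The generator name running_total is hypothetical; the local variable total keeps its value across each pause.

```python
def running_total():
    """Hypothetical generator that yields a running total of 1, 2, 3."""
    total = 0
    for number in (1, 2, 3):
        total += number  # `total` survives between calls to next()
        yield total


totals = running_total()
first = next(totals)   # runs until the first yield
second = next(totals)  # resumes right after the first yield
third = next(totals)
print(first, second, third)  # 1 3 6
```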
Generator Functions
Generator functions are defined like normal functions, except that they use the yield keyword wherever a value needs to be produced. If the body of a function contains the yield keyword, that function is automatically a generator function.
Generators are iterators, but we can iterate over them only once. They generate their values on the fly instead of keeping them in memory, which is very useful for large data sets.
Assume we want to write a function that generates the Fibonacci sequence (a series of numbers in which the first two numbers are 0 and 1 and each subsequent number is the sum of the two previous ones), and we do not know beforehand where it should end.
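The following sketch illustrates both points: a generator can be consumed only once, and the generator object itself stays small. The function name squares is hypothetical, and the size comparison assumes CPython, where exact byte counts vary by version.

```python
import sys


def squares(limit):
    """Hypothetical generator yielding the squares of 0 .. limit - 1."""
    for n in range(limit):
        yield n * n


gen = squares(5)
first_pass = list(gen)   # consumes every value
second_pass = list(gen)  # the generator is already exhausted
print(first_pass)   # [0, 1, 4, 9, 16]
print(second_pass)  # []

# A list holds all of its items, while the generator object only holds
# the state needed to produce the next value.
big_list = [n * n for n in range(100_000)]
big_gen = squares(100_000)
print(sys.getsizeof(big_list) > sys.getsizeof(big_gen))  # True
```

To iterate a second time, call the generator function again to get a fresh generator object.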
def fibonacci():
    first, second = 0, 1
    while True:
        yield first
        first, second = second, first + second
This is a generator function with an infinite loop. On each iteration it yields the value currently held in the variable first and then reassigns both first and second to the next pair of numbers in the sequence.
for value in fibonacci():
    if value > 100:
        break
    # end=" " keeps the numbers on one line, separated by spaces
    print(value, end=" ")
This is how we can call the generator function and print every number of the Fibonacci series up to 100. The break statement is essential here, because the generator itself never stops.
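As an alternative to breaking out of the loop by hand, the standard library's itertools module can bound an infinite generator. This is only a sketch of one possible approach; the fibonacci function is repeated so that the example runs on its own.

```python
from itertools import islice, takewhile


def fibonacci():
    first, second = 0, 1
    while True:
        yield first
        first, second = second, first + second


# Take a fixed number of values from the infinite generator.
first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Take values for as long as a condition holds.
up_to_100 = list(takewhile(lambda value: value <= 100, fibonacci()))
print(up_to_100)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```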
Generator Expressions
We have already seen the difference between normal functions and generator functions. Another way to create generators is with generator expressions, which are similar to list comprehensions.
list_comprehension_example = [n**2 for n in range(11)]
generator_expression_example = (n**2 for n in range(11))

print(list_comprehension_example)
# [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
print(generator_expression_example)
# <generator object <genexpr> at 0x01413EA0>

print(next(generator_expression_example))  # 0
print(next(generator_expression_example))  # 1
print(next(generator_expression_example))  # 4
Both list_comprehension_example and generator_expression_example describe the same sequence of squares, but the list is built immediately while the generator produces its values on demand. The other differences are:
- We surround a generator expression with parentheses instead of square brackets.
- Since generators produce one value at a time, we cannot perform operations such as slicing or indexing.
- Once a value has been retrieved from the generator with next(), it is consumed and cannot be obtained from that generator again.
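The last two differences can be sketched as follows. The variable names are illustrative, and itertools.islice is shown only as one possible way to take the first few values of a generator.

```python
from itertools import islice

squares_gen = (n**2 for n in range(11))

# Indexing a generator is not supported and raises TypeError.
try:
    squares_gen[0]
    indexing_worked = True
except TypeError:
    indexing_worked = False
print(indexing_worked)  # False

# islice() takes a "slice" by consuming values in order.
first_three = list(islice(squares_gen, 3))
print(first_three)  # [0, 1, 4]

# The values already consumed are gone; iteration continues from 9.
remaining = list(squares_gen)
print(remaining)  # [9, 16, 25, 36, 49, 64, 81, 100]
```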
Generators are an intimidating topic for many developers, but if you invest some time in understanding them, they can be a very useful tool in your arsenal. There are a lot of topics that we did not cover here, but hopefully this can get you going on the subject.