Building an iterator in Python involves a lot of overhead: we have to implement a class with __iter__() and __next__() methods, keep track of internal state, raise StopIteration when there are no more values to return, and so on. This is both lengthy and counterintuitive. Generators come to the rescue in such situations.

Python generators are a simple way of creating iterators. All the overhead mentioned above is handled automatically by generators in Python. Simply put, a generator is a function that returns an object (an iterator) which we can iterate over, one value at a time.

Creating a Generator in Python

It is fairly simple to create a generator in Python. It is as easy as defining a normal function with a yield statement instead of a return statement. If a function contains at least one yield statement (it may contain other yield or return statements), it becomes a generator function. Both yield and return hand a value back from a function. The difference is that while a return statement terminates the function entirely, a yield statement pauses the function, saving all its state, and later continues from there on successive calls. Here is how a generator function differs from a normal function.

  • A generator function contains one or more yield statements.
  • When called, it returns an object (an iterator) but does not start execution immediately.
  • Methods like __iter__() and __next__() are implemented automatically, so we can iterate through the items using next().
  • Once the function yields, it is paused and control is transferred to the caller.
  • Local variables and their states are remembered between successive calls.
  • Finally, when the function terminates, StopIteration is raised automatically on further calls.

Here is an example to illustrate all of the points stated above. We have a generator function named my_gen() with several yield statements.

An interactive run in the interpreter is given below.
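
The listing itself is not part of this text, so what follows is only a minimal sketch of what such a my_gen() and a run of it might look like. The printed messages and the counter starting at 1 are assumptions; the run is written as a script rather than an interpreter session so it can be executed as is.

```python
def my_gen():
    """Hypothetical generator with three yield statements."""
    n = 1
    print("This is printed first")
    yield n            # pause here; n stays alive

    n += 1
    print("This is printed second")
    yield n            # pause again

    n += 1
    print("This is printed last")
    yield n


a = my_gen()           # nothing is printed yet: the body has not started

first = next(a)        # runs the body up to the first yield
second = next(a)       # resumes right after the first yield
third = next(a)        # resumes right after the second yield
print(first, second, third)

try:
    next(a)            # the body has finished, so there is nothing left
except StopIteration:
    print("StopIteration raised: the generator is exhausted")
```

Each call to next() runs the body only as far as the next yield, which is why the three messages appear one call at a time.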

One interesting thing to note in the above example is that the value of the variable n is remembered between calls. Unlike in normal functions, the local variables are not destroyed when the function yields. Furthermore, a generator object can be iterated over only once. To restart the process, we need to create another generator object using something like a = my_gen().

One final thing to note is that we can use generators with for loops directly. This is because a for loop takes an iterator and iterates over it using the next() function. It ends automatically when StopIteration is raised.
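
To make this concrete, the sketch below drives the same generator twice: once with a for loop and once by hand with next(), which is roughly what the for loop does behind the scenes. The generator three_values() is a hypothetical helper introduced only for this illustration.

```python
def three_values():
    """Hypothetical generator used only for this illustration."""
    yield "a"
    yield "b"
    yield "c"


# The usual way: the for loop calls next() for us.
collected_by_for = []
for item in three_values():
    collected_by_for.append(item)

# Roughly what the for loop does internally.
collected_by_hand = []
iterator = iter(three_values())    # a generator is its own iterator
while True:
    try:
        item = next(iterator)
    except StopIteration:
        break                      # the loop ends quietly
    collected_by_hand.append(item)

print(collected_by_for)
print(collected_by_hand)
```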

Python Generators with a Loop

The above example is of little practical use; we studied it just to get an idea of what happens in the background. Normally, generator functions are implemented with a loop that has a suitable terminating condition. Let’s take an example of a generator that reverses a string.

In this example, we use the range() function to get the indices in reverse order. Here is a call to this function.
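
A minimal sketch of such a generator, followed by a call to it, might look like this. The names rev_str and my_str are assumptions, not taken from the text above.

```python
def rev_str(my_str):
    """Yield the items of my_str from last to first."""
    length = len(my_str)
    # range(start, stop, step): count down from the last index to 0
    for i in range(length - 1, -1, -1):
        yield my_str[i]


# The for loop pulls one character at a time from the generator.
for char in rev_str("hello"):
    print(char)
```

The loop prints the letters of "hello" in reverse order, one per line.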

It turns out that this generator function works not only with strings but also with other kinds of sequences, such as lists and tuples, since it relies only on their length and on indexing.

Python Generator Expression

Simple generators can be created on the fly using generator expressions. Just as a lambda creates an anonymous function, a generator expression creates an anonymous generator. The syntax of a generator expression is similar to that of a list comprehension in Python, but the square brackets are replaced with round parentheses.

The major difference between a list comprehension and a generator expression is that a list comprehension produces the entire list, while a generator expression produces one item at a time. Generator expressions are lazy: they produce items only when asked for. For this reason, a generator expression is much more memory efficient than an equivalent list comprehension.
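
The comparison can be sketched as follows; the input list and the variable names are chosen only for illustration.

```python
my_list = [1, 3, 6, 10]

# List comprehension: the whole list is built right away.
squares_list = [x**2 for x in my_list]

# Generator expression: same syntax, but with round parentheses.
squares_gen = (x**2 for x in my_list)

print(squares_list)         # [1, 9, 36, 100]
print(squares_gen)          # a generator object, not the values

# Values are computed only when we ask for them.
print(next(squares_gen))    # 1
print(next(squares_gen))    # 9
```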

We can see above that the generator expression did not produce the required result immediately. Instead, it returned a generator object that produces items on demand.

Generator expressions can be used as function arguments. When a generator expression is the only argument to a function call, its round parentheses can be dropped.
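
As a short illustration using the built-in sum() and max() (the list is again made up for the example):

```python
my_list = [1, 3, 6, 10]

# Sole argument: the generator expression needs no parentheses of its own.
total = sum(x**2 for x in my_list)
largest = max(x**2 for x in my_list)

# With a second argument, the parentheses are required again.
total_from_100 = sum((x**2 for x in my_list), 100)

print(total, largest, total_from_100)    # 146 100 246
```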

Why Are Generators Used in Python?

There are several reasons that make generators an attractive choice.

  1. Easy to Implement: Generators can be implemented in a clear and concise way compared to their iterator class counterparts. The following is an example that implements a sequence of powers of 2 using an iterator class.
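
One way such an iterator class could be written is sketched below. The class name PowTwo and the max_exp parameter are assumptions made for this sketch.

```python
class PowTwo:
    """Iterator over 2**0, 2**1, ..., 2**max_exp (names are illustrative)."""

    def __init__(self, max_exp=0):
        self.max_exp = max_exp
        self.n = 0                  # internal state we must track ourselves

    def __iter__(self):
        return self                 # an iterator returns itself

    def __next__(self):
        if self.n > self.max_exp:
            raise StopIteration     # we must signal the end explicitly
        result = 2 ** self.n
        self.n += 1
        return result


print(list(PowTwo(4)))    # [1, 2, 4, 8, 16]
```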

This was lengthy. Now let’s do the same using a generator function.
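
A generator function producing the same sequence might be sketched like this; pow_two_gen and max_exp are again illustrative names.

```python
def pow_two_gen(max_exp=0):
    """Yield 2**0, 2**1, ..., 2**max_exp."""
    n = 0
    while n <= max_exp:
        yield 2 ** n      # state (n) is saved automatically between calls
        n += 1


print(list(pow_two_gen(4)))    # [1, 2, 4, 8, 16]
```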

     Since generators keep track of these details automatically, the generator version is concise and much cleaner.
  2. Memory Efficient: A normal function that returns a sequence builds the entire sequence in memory before returning the result. This is overkill if the number of items in the sequence is very large. A generator implementation of such a sequence is memory friendly and is preferred, since it produces only one item at a time.
  3. Represent Infinite Stream: Generators are an excellent medium for representing an infinite stream of data. Infinite streams cannot be stored in memory, but since a generator produces only one item at a time, it can represent one. For example, a generator can produce all the even numbers (at least in theory).
  4. Pipelining Generators: Generators can be used to pipeline a series of operations. This is best illustrated using an example. Suppose we have a log file from a famous fast food chain. The log file has a column (the 4th column) that keeps track of the number of pizzas sold every hour, and we want to sum it to find the total number of pizzas sold in 5 years. Assume everything is stored as strings and numbers that are not available are marked as ‘N/A’. A generator implementation of this could be as follows.
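
The sketch below is one possible version of such a pipeline. The log format is not specified beyond its 4th column, so the sample data is made up: whitespace-separated columns held in an in-memory io.StringIO object that stands in for the real file. The function name total_pizzas is also an assumption.

```python
import io

# Made-up sample standing in for the real log file (assumed format:
# whitespace-separated columns, pizzas sold in the 4th column).
sample_log = io.StringIO(
    "2019-01-01 10:00 store1 12\n"
    "2019-01-01 11:00 store1 N/A\n"
    "2019-01-01 12:00 store1 30\n"
)


def total_pizzas(log_lines):
    """Sum the 4th column of each line, skipping 'N/A' entries."""
    # Stage 1: pick the 4th column from each line.
    pizza_col = (line.split()[3] for line in log_lines)
    # Stage 2: drop missing values and convert the rest to int.
    per_hour = (int(x) for x in pizza_col if x != "N/A")
    # Stage 3: sum() pulls items through both stages one at a time.
    return sum(per_hour)


print("Total pizzas sold =", total_pizzas(sample_log))
```

No stage builds an intermediate list; each line flows through the whole chain before the next one is read.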

This pipelining is efficient and easy to read (and yes, a lot cooler!).
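
Returning to the infinite-stream point in item 3, a generator for the even numbers could be sketched as below. The name all_even is an assumption, and itertools.islice is used here only to take a finite slice so that the example terminates.

```python
from itertools import islice


def all_even():
    """Yield 0, 2, 4, ... without ever stopping."""
    n = 0
    while True:
        yield n
        n += 2


# Never call list(all_even()) directly: it would not finish.
first_five = list(islice(all_even(), 5))
print(first_five)    # [0, 2, 4, 6, 8]
```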
