In the last post about Python iterables, we discussed what it means to be an iterable: being able to participate in a for loop and implementing the __iter__() method to create iterator objects. There is a related concept in Python that takes this a bit further: being an iterator. It is important to get this right, because people often confuse a Python iterable with a Python iterator.
When you implement the __iter__() method, you make an object an iterable, which lets it participate in Python for loops or be used to retrieve a stream of data. But that is not enough. As I showed you in the user-defined class in the last post, you need to implement one more method, __next__(), to complete the process. The reason is that __iter__() only returns an iterator object; it is __next__() that lets you access the elements one at a time, and it is what defines that object as an iterator. With this, we are ready to define what it means to be an iterator.
What it means to be a Python iterator
To be a Python iterator, an object needs to implement the __next__() method. This method returns the next value in the iteration, updates the object's state so that it points to the value after that, and signals that there are no more elements in the stream by raising the StopIteration exception. That is it. An iterator is just an object that remembers where it is while retrieving a stream of data.
The iterator protocol also requires any object that implements __next__() to implement __iter__() and, when doing so, to return the object itself. This means that Python iterators are also Python iterables. Remember that fact, because that is where many people get confused. We covered this in the post on iterables.
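A quick check of that last point, as a minimal sketch using a built-in list (the variable names are only for illustration):

```python
numbers = [10, 20, 30]          # a list is an iterable, not an iterator
numbers_iter = iter(numbers)    # iter() asks the list for an iterator

# The list has __iter__() but no __next__().
list_has_next = hasattr(numbers, "__next__")          # False
# The iterator has __next__() as well.
iter_has_next = hasattr(numbers_iter, "__next__")     # True
# Calling iter() on an iterator hands back the very same object.
returns_itself = iter(numbers_iter) is numbers_iter   # True

print(list_has_next, iter_has_next, returns_itself)
```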
In summary, iterators are iterables: they can participate in for loops and in functions like map and zip that need iterables. On top of that, an iterator remembers where it is while retrieving items.
Now that we have a definition, let’s look at some examples. Several built-in data types, like lists and dictionaries, support iteration, so we will use them.
See what happens when you call iter() (which invokes the __iter__() method) on a dictionary object, and then call next() (which invokes the __next__() method) on the iterator that iter() returns. The dictionary is our example for looping in Python.
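A minimal sketch of those calls, assuming a small dictionary named prices that is made up for illustration:

```python
# Hypothetical sample data, used only for illustration.
prices = {"apple": 3, "banana": 1, "cherry": 7}

keys_iter = iter(prices)     # calls prices.__iter__() behind the scenes

first = next(keys_iter)      # calls keys_iter.__next__()
second = next(keys_iter)
third = next(keys_iter)
print(first, second, third)  # apple banana cherry

# A fourth call has nothing left to return, so StopIteration is raised.
try:
    next(keys_iter)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)             # True
```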
The dictionary's iterator walks through its keys, returning one key each time it is passed to next().
You can do the same thing with any native Python iterable. They were built to hand out iterators when iter() is called on them.
Python automates the process of calling iter() on the object and then next() on the resulting iterator whenever you run a for loop, so you don't usually notice what is happening under the hood.
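To make that concrete, here is a rough sketch of a for loop next to a hand-written equivalent. The list colors is hypothetical, and the while loop is only an approximation of what the interpreter does:

```python
colors = ["red", "green", "blue"]   # hypothetical sample data

# What you normally write.
seen_by_for = []
for color in colors:
    seen_by_for.append(color)

# Roughly what Python does for you.
seen_by_hand = []
colors_iter = iter(colors)          # step 1: get an iterator
while True:
    try:
        color = next(colors_iter)   # step 2: ask for the next item
    except StopIteration:           # step 3: stop when the stream ends
        break
    seen_by_hand.append(color)

print(seen_by_for == seen_by_hand)  # True
```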
You should note that once an iterator raises the StopIteration exception, it must keep raising that exception on every subsequent call to next(). This is because the iterator is now exhausted and behaves like an empty container. To start over and get the stream again, you call iter() afresh if the source is a container object like a list or dictionary. If it is not a container, there is nothing to restart from, and this is where a Python generator comes in handy. This is why you do not often see hand-written Python iterators: generators are more convenient when you need multiple passes over a non-container stream, because you can simply call the generator function again to get a fresh stream. We will discuss Python generators in the next post because they are interesting Python functions, so watch out for it.
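Here is a small sketch of that behaviour with a list, using made-up data. The exhausted iterator keeps raising StopIteration, while a new call to iter() on the list starts from the top:

```python
letters = ["a", "b"]            # hypothetical sample data
letters_iter = iter(letters)

consumed = [next(letters_iter), next(letters_iter)]   # ['a', 'b']

# Every further call keeps raising StopIteration.
stop_count = 0
for _ in range(3):
    try:
        next(letters_iter)
    except StopIteration:
        stop_count += 1
print(stop_count)               # 3

# The list itself is untouched, so a new iterator starts from the top.
fresh_iter = iter(letters)
print(next(fresh_iter))         # a
```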
User-defined Python iterators
Iterators that you define yourself just need to implement __iter__(), which returns the iterator object, and __next__(), which lets you traverse the elements in the stream of data. That is all there is to it, as I have been saying all along. I touched on this in the iterables discussion. The same approach works for a user-defined iterator that is based on the list data type.
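One way such a class could look is sketched below. The name ListStream and its attributes are my own choices for illustration, not something from the earlier post:

```python
class ListStream:
    """Hypothetical iterator that walks over a copy of a list, one item at a time."""

    def __init__(self, items):
        self._items = list(items)   # the data to stream
        self._position = 0          # state: index of the next item

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        if self._position >= len(self._items):
            raise StopIteration     # signal the end of the stream
        value = self._items[self._position]
        self._position += 1         # remember where we are
        return value


stream = ListStream(["x", "y", "z"])
print(next(stream))                 # x
remaining = [item for item in stream]
print(remaining)                    # ['y', 'z']
```

Because __iter__() returns self, the same object works with next() directly and inside a for loop or comprehension.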
As I said before, one deficiency of iterators is that they only support one pass. If you attempt a second pass, they behave like empty containers. You can try it out and see for yourself. Because of this limitation, when I want to access the items in an object as a stream, I just use the object as an iterable in a Python for loop. But when I want to generate values, I use a generator.
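If you want to try it out, a sketch like this one shows the difference between looping over an iterator twice and looping over the underlying list twice (the data is made up):

```python
scores = [4, 8, 15]             # hypothetical sample data

# Looping over the iterator: the second pass finds nothing.
scores_iter = iter(scores)
first_pass = [s for s in scores_iter]
second_pass = [s for s in scores_iter]
print(first_pass, second_pass)  # [4, 8, 15] []

# Looping over the list: each loop gets a fresh iterator.
again_one = [s for s in scores]
again_two = [s for s in scores]
print(again_one == again_two)   # True
```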
Some things you can do with iterators are to materialize the iterator object as a tuple, list, etc., do sequence unpacking on it, or even use the max and min functions on it.
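A short sketch of each of these, with made-up data. Note that every operation consumes the iterator it is given, so a new one is created with iter() each time:

```python
temps = [21, 18, 25]                    # hypothetical sample data

as_list = list(iter(temps))             # [21, 18, 25]
as_tuple = tuple(iter(temps))           # (21, 18, 25)

first, second, third = iter(temps)      # sequence unpacking

highest = max(iter(temps))              # 25
lowest = min(iter(temps))               # 18

print(as_list, as_tuple, first, second, third, highest, lowest)
```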
Happy pythoning.