Python Shallow Copy And Deep Copy

Sometimes while programming, in order to prevent having side effects when we want to change an object, we need to create a copy of that object and mutate the copy so that we can later use the original. Python provides methods that we can use to do this. In this post, I will describe the shallow copy and deep copy methods of python that you can effectively use to copy objects even recursively.

Many programmers think that the assignment operator makes a copy of an object. It is really deceptive. When you write code like this:

object2 = object1

You are not copying but aliasing. That is, object2 is getting a reference to the objects which serve as the value of object1. Aliasing could seem intuitive to use, but the caveat there is that if you change the value of any one of the aliased objects, all the objects referencing that value also change. Let’s take an example.
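
The original listing is not reproduced here, so the following is a minimal sketch of aliasing. The variable names follow the discussion; the names in the list are made up for illustration.

```python
names = ['Michael', 'Rose', 'David']
second_names = names              # an alias, not a copy
second_names.append('Grace')      # mutate through the alias

print(names)                      # ['Michael', 'Rose', 'David', 'Grace']
print(second_names is names)      # True: both names refer to one list
```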

You could see that I made an aliasing between second_names and names in line 2 so that they both reference the same object. When I appended a name to second_names, it reflected in names because they are both referencing the same object.

Sometimes, we don’t want this behavior. When we make a copy, we want it to be independent of the original. That is where python shallow copy and deep copy operations come in. To make this work, we need to import the copy module: import copy.

How Python shallow copy works with examples

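The syntax for python shallow copy is copy.copy(x). The x in the argument is the original object you want to copy. I need to state here that copying only makes a difference for mutable objects such as lists, dictionaries and sets. For immutable built-in objects such as strings and tuples, copy.copy(x) simply returns x itself, since there is nothing to protect from mutation.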

Let’s take an example of how python shallow copy works on a list.
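
A minimal sketch of what such an example could look like, with illustrative data:

```python
import copy

names = ['Michael', 'Rose', 'David']
second_names = copy.copy(names)   # a new list holding the same items
names.append('Grace')             # change only the original

print(names)          # ['Michael', 'Rose', 'David', 'Grace']
print(second_names)   # ['Michael', 'Rose', 'David']
```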

You can see that the copy, second_names, remained unchanged even after we added items to the original.

Let’s take an example on how python shallow copy works on a dictionary.
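
A minimal sketch with a made-up dictionary of grades:

```python
import copy

grades = {'Michael': 'A', 'Rose': 'C'}
second_grades = copy.copy(grades)   # a new dict with the same keys and values
grades['David'] = 'B'               # add a key to the original only

print(grades)          # {'Michael': 'A', 'Rose': 'C', 'David': 'B'}
print(second_grades)   # {'Michael': 'A', 'Rose': 'C'}
```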

You can see in the dictionary also that the python shallow copy function operates on a dictionary as we expected.

You can also do copy on sets; they are mutable iterables. If you want a refresher on iterables, you can check this post.

There is one weakness of python shallow copy. As the name implies, it does not copy deep down. It copies only items at the surface. If in a list or dictionary we have nested items, it will reference them like in the aliasing operation rather than copy them.

Let’s use an example to show this.
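
The sketch below uses a hypothetical nested list of [name, grade] pairs to stand in for the original listing:

```python
import copy

students = [['Michael', 'A'], ['Rose', 'C']]
second_students = copy.copy(students)   # only the outer list is new

students[1][1] = 'A'                    # change Rose's grade in the original

print(second_students)                      # [['Michael', 'A'], ['Rose', 'A']]
print(second_students[1] is students[1])    # True: the inner list is shared
```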

Now, you can see that we changed Rose’s grade in the original from ‘C’ to an ‘A’ but the change was reflected in the copy. Too bad! That is behavior we don’t want. This is because python shallow copy does not go deep down or does not copy recursively. We need another type of copy to make both lists or dictionaries independent. That is where python deep copy comes in.

How python deep copy works

Python deep copy will create a new object from the original object and recursively will add all the objects found in the original object to the new object without passing them by reference. That’s cool. That makes our new nested objects copy effectively.

The syntax for deep copy is copy.deepcopy(x[, memo]). The x in the argument is the original object that has to be copied while memo is a dictionary that keeps a tab on what has already been copied. The memo is very useful in order to avoid recursive copying loops or for deep copy not to copy too much. I find myself using the memo often when I am implementing deep copy in my custom classes.

Now, let’s take an example of python deep copy on a list, a nested list precisely, and see how it performs.
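
A minimal sketch using the same kind of hypothetical nested list of [name, grade] pairs:

```python
import copy

students = [['Michael', 'A'], ['Rose', 'C']]
second_students = copy.deepcopy(students)   # inner lists are copied too

students[1][1] = 'A'                        # change Rose's grade in the original

print(students)          # [['Michael', 'A'], ['Rose', 'A']]
print(second_students)   # [['Michael', 'A'], ['Rose', 'C']]
```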

You can see now that the original nested list was changed without affecting the copy.

That goes to show you the power of python as a programming language.

We can take this concept further by showing how to implement shallow copy and deep copy in python using custom classes. All you really need to do is implement two special methods in your classes for whichever you need. If you need to use python shallow copy in your class, implement the __copy__() special method and if you need to use python deep copy, just implement the __deepcopy__() special method.

Let’s show this by examples.
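
The original listing is not available here, so this is only a sketch of how a Student class and a Faculty class with a __deepcopy__() method might be written. The constructor signatures and the sample data are assumptions, and the line numbers will not match the original.

```python
import copy

class Student:
    def __init__(self, name, grade, dept):
        self.name, self.grade, self.dept = name, grade, dept

    def __repr__(self):
        return f'Student({self.name!r}, {self.grade!r}, {self.dept!r})'

class Faculty:
    def __init__(self, name, students=None):
        self.name = name
        self.students = students if students is not None else []

    def __deepcopy__(self, memo):
        new_faculty = Faculty(self.name)
        # Record the copy in memo so that any reference back to this
        # object is not copied a second time.
        memo[id(self)] = new_faculty
        # Recursively copy the list and every Student inside it.
        new_faculty.students = copy.deepcopy(self.students, memo)
        return new_faculty

# Driver code with made-up data
science = Faculty('Science', [Student('Rose', 'C', 'Physics'),
                              Student('David', 'B', 'Chemistry')])
science_copy = copy.deepcopy(science)
science.students[0].grade = 'A'      # change the original only

for student in science_copy.students:
    print(student)
# Student('Rose', 'C', 'Physics')
# Student('David', 'B', 'Chemistry')
```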

In the code above we defined a Student class with each student having a name, grade and dept. Then we defined a Faculty class that aggregates a list of students. Then in the Faculty class we implemented the __deepcopy__() special method in order to be able to recursively copy the list of students. Finally in the driver codes, lines 25 to 37, we created the objects for the classes and then copied the faculty object to a new faculty object to see how it will run, printing out the students in the new faculty object.

That’s cool. Just love this code. I hope you enjoyed yourself. I would love to receive feedback as comments.

Happy pythoning.

Python Print() Function: How it works

One of the ubiquitous and most often used functions in python is the python print function. We use it for realizing the output of our codes and even for debugging. So, it is pertinent that we understand how it works.

In its essential form what the python print function does is to take a given object, convert it to a string object and print the value out to the standard output, or what is called the screen. It can even send the output to a file.

The python print syntax

The python print function, despite its wide ranging value, has a simple syntax. The syntax of the python print function is print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False). I will be explaining the sep, end and file arguments in this post. So, just take note of the syntax.

Usually when you want to print something to the screen you provide the python print function with an object or several object arguments. If you don’t specify other parameters, the function prints each of the arguments to the screen, separated by a single space, and after all the arguments are printed, it moves to a new line. Let’s illustrate this with an example and explain how it relates to the syntax.
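
A minimal sketch of such a call, passing five objects. The words and the value of age are made up for illustration.

```python
age = 30   # illustrative value
print('My', 'age', 'is', 'exactly', str(age) + '.')
# My age is exactly 30.
```

Converting age with str() and adding the period to it keeps print from putting a space between the number and the period.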

When you run the code above, you will see that it nicely prints out each of the objects to the python print function. Here is what happened. I passed it 5 objects and it prints out the five objects each separated by a space. The separation by a space comes about from the sep parameter in the syntax above. The sep means separator. By default its value is a space. Notice that I cast one of the objects to a string before printing it out. This is a trick to make the period adhere to the value of the string. Very cool. We can change the value of the separator. I will highlight it in the separator section below.

Now what happens if we print without passing an object. Let me give an example following from our example above.

You can see that I repeated the earlier code. But on line five I called print without giving it any argument or object. If you look at the output on the screen, you will see that this produced an empty line. Without any argument, the python print function has nothing to print except the value of the end parameter, and since the default is a newline, ‘\n’, it outputs a blank line.

Now let’s see how we can customize the working of the python print function using the keyword parameters outlined in the syntax.

Customizing python print with the sep keyword

The sep keyword separates each of the objects in the python print function based on its value. The default is a whitespace character. That means if you use the default, as outlined above, each of the objects when printed out will be separated by a whitespace.

What if we want another separator on python print, like we want a colon, :, to separate each of the objects to be printed. Here is code that could do it.
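
A minimal sketch with made-up names as the objects:

```python
print('Michael', 'Rose', 'David', sep=':')
# Michael:Rose:David
```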

If you watch the output to the screen, you could see that each of the objects that was passed to the python print function now has a colon separator between them.

You could create any separator of your imagination. Most times when I have specific ways to print an output it could call for my customizing the separator.

Customizing python print with the end keyword.

The end keyword is another parameter that we could use to customize the python print function. As I highlighted above where I called the print function without objects, the default for the end keyword is a newline, ‘\n’, which creates a new line after printing the objects. That means python print adds a newline after each call. Most times when I want python print without newlines, that is, I want subsequent calls to print on the same line, I customize the end keyword. You just replace the default with a space character, ‘ ’, so that the output of each subsequent call continues on the same line.

For example, you have code you want printed in the same line. Here is the code that could do it.
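
A minimal sketch with three separate print calls and made-up words:

```python
print('Python', end=' ')
print('is', end=' ')
print('fun')
# Python is fun
```

The last call keeps the default end so that the line is finished with a newline.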

You can see that by customizing the end parameter to a space, I have made all the objects in the python print function print without newline to the same line.

How to print to file using file keyword

Usually when you call the python print function, it prints to standard output, that is, the screen. That is the default. You can customize it to print to a file by passing a writable file object as the value of the file parameter. For details on how to open, read, and make files writable, see this blog post.

Now, let’s take an example. This time instead of printing to the screen we will be printing to a file or writing to a file. Here is the code:

    
text = ('I feel cool using python.'
        '\nIt is the best programming language')
with open('new_file.txt', 'w') as source_file:
    print(text, file=source_file)

You can run it on your machine. When you do, rather than getting the text message to your screen, it will print to the file, new_file.txt. If new_file.txt doesn’t exist, it will create one.

One thing to note about file objects passed to the file keyword – you cannot use binary mode file objects. This is because python print function converts all its objects to the str class (strings) before passing them to the file. So, note this and if you want to write to binary mode file objects, use the write methods that are built-in for file objects.

You must really be feeling empowered with all the cool features in python print function. I am. You can subscribe to my blog or leave a comment below. I feel happy when I believe I have made an impact.

Happy pythoning.

Simulating A Random Walk In Python

Deterministic processes are what every programmer is familiar with when starting out in their journey. In fact, most beginner books on programming will teach you deterministic processes. These are processes where for an input, you always get the same output. But when you get into industry, you find out that most times stochastic processes are the norm when finding solutions to problems. Stochastic processes give different results for the same input.

In this post, I will be simulating a stochastic process, a drunkard’s walk, which is an example of a random walk.

Python random walks are interesting simulation models for the following reasons:

  1. They are widely used in industry and interesting to study.
  2. They show us how simulation works in practice and can be used to demonstrate how to structure abstract data types.
  3. They usually involve producing plots which are interesting and as they say, a picture is worth a thousand words.

So, let’s go to the simulation exercise. It is interesting to find out how much distance a drunk would have made from his starting position if he takes a number of steps within a given space of time. Would he have moved farther after that time, would he still be close to the origin, or where would the drunk be? Such questions can only be simulated for us to get a general idea of the drunk’s position. We’ll imagine that for each movement, the drunk can take one step either in the north, south, east, or west direction. That means he has four choices to choose from for each step.

To model the drunk’s walk after some time, we will be using three classes representing objects that define his position relative to the origin: Location, Field, and Drunk classes.

The Location class defines his location relative to the origin. We could write code for the class this way:

    
class Location(object):

    def __init__(self, x, y):
        ''' x and y are numbers '''
        self.x, self.y = x, y

    def move(self, delta_x, delta_y):
        ''' delta_x and delta_y are numbers '''
        return Location(self.x + delta_x, self.y + delta_y)

    def get_x(self):
        return self.x

    def get_y(self):
        return self.y

    def dist_from(self, other):
        ox, oy = other.x, other.y
        x_dist, y_dist = self.x - ox, self.y - oy
        return (x_dist**2 + y_dist**2)**0.5

    def __str__(self):
        return '<' + str(self.x) + ', ' + str(self.y) + '>'

Each location has an x and y coordinate representing its position along the x and y axes. When the drunk moves and changes his location, we return a new Location object to signify this. Also using the Location class, we can calculate the distance of the drunk from another location, most often the origin.

The second class we need to define is the Field class. This class will allow us to add multiple drunks to the same field. It is a mapping of drunks to their locations. Code could be written this way for it:

    
class Field(object):

    def __init__(self):
        self.drunks = {}

    def add_drunk(self, drunk, loc):
        if drunk in self.drunks:
            raise ValueError('Duplicate drunk')
        else:
            self.drunks[drunk] = loc

    def move_drunk(self, drunk):
        if drunk not in self.drunks:
            raise ValueError('Drunk not in field')
        x_dist, y_dist = drunk.take_step()
        current_location = self.drunks[drunk]
        # use move method of Location to get new location
        self.drunks[drunk] = current_location.move(x_dist, y_dist)

    def get_loc(self, drunk):
        if drunk not in self.drunks:
            raise ValueError('Drunk not in field')
        return self.drunks[drunk]            

As you can see, the Field class is a mapping of drunks to locations. When we move a drunk, his location reflects this move and we take note of the current location. Also, we can use this class to find out the location of any drunk.

The last class of interest is the Drunk class. The Drunk class embodies all the drunks we will be playing with. It is a common class or parent class as all other drunks will inherit from this class.

    
import random

class Drunk(object):

    def __init__(self, name=None):
        ''' Assumes name is a string '''
        self.name = name

    def __str__(self):
        if self.name is not None:
            return self.name
        return 'Anonymous'

What the Drunk class does is give identity to each drunk object or subclass.

Now, we will create a drunk with our expected way of movement: that is take one step each time in the north, south, east, or west direction. We will call this drunk class, UsualDrunk. Here is the definition of the class.

    
class UsualDrunk(Drunk):

    def take_step(self):
        step_choices = [(0,1), (0, -1), (1,0), (-1,0)]
        return random.choice(step_choices)

The UsualDrunk class inherits from the Drunk class and the only method it defines is the random step it can take. From the take_step method you can see that it can only move one step to the east, west, north or south, and this in a randomized fashion.

So, now that we have our classes let us try to answer the question – where will the drunk be after taking a series of walks in a random fashion? Like taking 10 walks, or 100, or 1000? Normally, we would expect that when the number of walks increases, the distance from the origin should increase. But this might not be the case because you know how drunks walk – haphazardly. Some drunks can even retrace their steps back to where they started and go nowhere!

So, for our simulation, we will write code that makes use of these classes and run the code on the drunk taking a number of steps with different trials for each step. We are using different trials in order to balance out the randomized walk and get a mean of distances.

Here is the code:

When you run it there is one fact that stands out: The mean distance from the origin increases as the number of steps increases. That is the hypothesis we started with.

Some pertinent pieces of the driver code are the following:

    
def walk(f, d, num_steps):
    '''Assumes: f a field, d a drunk in f, 
    and num_steps an int >= 0.
    Moves d num_steps times; returns the distance between
    the final location and the location at the start 
    of the walk.'''
    start = f.get_loc(d)
    for _ in range(num_steps):
        f.move_drunk(d)
    return start.dist_from(f.get_loc(d))

The walk function returns the distance from the final location for a single trial based on the drunk taking a number of steps that is defined.

    
def sim_walks(num_steps, num_trials, d_class):
    '''Assumes num_steps an int >= 0, num_trials an int > 0,
    d_class a subclass of Drunk. 
    Simulates num_trials walks of num_steps steps each. 
    Returns a list of the final distance for each trial'''
    homer = d_class()
    origin = Location(0,0)
    distance = []
    for _ in range(num_trials):
        f = Field()
        f.add_drunk(homer, origin)
        distance.append(round(walk(f, homer, num_steps), 1))
    return distance

The sim_walks function (simulated walks) differs from the walk function in one aspect: it runs all the different trials for a specific number of steps. Say for a 10 step walk we did 100 trials so as to get the mean. So sim_walks returns a list of the distances for the trials. This is so that we can take the mean distance for each number of steps since we are randomizing the walk.

And finally, the drunk_test function.

    
def drunk_test(walk_lengths, num_trials, d_class):
    '''Assumes walk_lengths a sequence of ints >= 0,
    num_trials an int > 0, d_class a subclass of Drunk.
    For each number of steps in walk_lengths, runs sim_walks
    with num_trials walks and prints results '''
    for num_steps in walk_lengths:
        distances = sim_walks(num_steps, num_trials, d_class)
        print(d_class.__name__, 'random walk of', num_steps, 'steps')
        print('Mean:', round(sum(distances)/len(distances), 4))
        print('Max:', max(distances), 'Min:', min(distances))

This serves as the test of our code. It prints out the mean for each number of steps after doing the various trials and then the max and min for those trials in a specific step.

You could download the above code here, random_walk.py.

But a picture is worth a thousand words. Let us use a plotted graph to illustrate the variation in the number of steps to the distance from the origin.

drunkards walk python

 

You can see from the graph above that when the drunk takes ten steps in each of the 100 trials, he ends up closer to the origin than when he takes 100 or 1000 steps. The drunk seems more determined to walk farther away if he is given the opportunity to take more steps. Drunks really mean to get home it seems! A graph of the number of steps against the mean distance shows that this is truly the case: the more steps he is allowed to take, the farther he ends up, on average, from where he started. The graph below shows that information.

drunkards walk python


The scales in the graph are logarithmic, to clearly show the straight line relationship between number of steps and mean distance from the starting point. To see how the code for the plotted graphs was written you can download it here, random_walk_mpl.py.

Now, our simulation has dwelt on a drunk walking the way we expect: for each step one unit towards the east, west, north, or south.

What if we could make the drunkard’s walk somewhat biased by skewing it a little? That would involve creating different drunks with different steps and comparing them to our usual drunk.

A biased random walk simulation.

Let’s imagine a drunkard who hates the cold and moves twice as fast in a southward direction. We could make him a subclass of Drunk class and change his way of movement in the class, calling him ColdDrunk.

This could be his class definition:

    
class ColdDrunk(Drunk):
    def take_step(self):
        step_choices = [(0.0, 1.0), (0.0, -2.0), 
                       (1.0, 0.0), (-1.0, 0.0)]
        return random.choice(step_choices)

You can see that whenever he moves southwards, along the negative y axis, he takes a step of two units.

Now let’s also add another hypothetical drunk that moves only in the east-west direction. He really moves with the sun or is phototropic. We could define his class, EWDrunk, in the following way:

    
class EWDrunk(Drunk):
    def take_step(self):
        step_choices = [(1.0, 0.0), (-1.0, 0.0)]
        return random.choice(step_choices)

So, we have all our drunks ready. Now let’s write code that will run them and compare their mean distances for various number of steps.
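
The plotting code is not reproduced here. As a rough stand-in, the sketch below prints the mean distances instead of plotting them. To stay self-contained it does not use the Location, Field and Drunk classes; it only reuses the step choices from the three take_step methods. The mean_distance helper, the fixed seed and the choice of 100 trials are assumptions made for illustration.

```python
import random

# Step choices copied from the three take_step methods.
STEP_CHOICES = {
    'UsualDrunk': [(0, 1), (0, -1), (1, 0), (-1, 0)],
    'ColdDrunk': [(0.0, 1.0), (0.0, -2.0), (1.0, 0.0), (-1.0, 0.0)],
    'EWDrunk': [(1.0, 0.0), (-1.0, 0.0)],
}

def mean_distance(step_choices, num_steps, num_trials, rng):
    '''Returns the mean straight-line distance from the origin after
    num_steps random steps, averaged over num_trials walks.'''
    total = 0.0
    for _ in range(num_trials):
        x = y = 0
        for _ in range(num_steps):
            delta_x, delta_y = rng.choice(step_choices)
            x, y = x + delta_x, y + delta_y
        total += (x**2 + y**2) ** 0.5
    return total / num_trials

rng = random.Random(0)   # fixed seed so that repeated runs agree
walk_lengths = (10, 100, 1000)
results = {}
for name, choices in STEP_CHOICES.items():
    results[name] = [round(mean_distance(choices, n, 100, rng), 1)
                     for n in walk_lengths]
    print(name, dict(zip(walk_lengths, results[name])))
```

Plotting results against walk_lengths for each drunk would give one line per drunk, which is the kind of comparison the graph makes.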

If you run the code above you will get a plotted graph that shows number of steps against mean distance from the origin for the three drunks. You will get a graph that looks like the following:

drunkards walk python


You will notice that for both the UsualDrunk, who we highlighted earlier, and the phototropic drunk, EWDrunk, the mean distance grows slowly as the number of steps increases compared to the south loving or north hating drunk, ColdDrunk. That means the ColdDrunk is moving away from the origin faster than the other drunks. This is not surprising given that whenever he moves south, he moves twice as far. His steps no longer cancel out on average, so his walk has a steady southward drift that the other two lack.

We could extrapolate on this conjecture and build a scatter plot of the location of each drunk’s movement for each step but I think the point has already been made: simulating a random walk could give us insights into a model and could confirm or deny a hypothesis.

If you would like a copy of the code for the three drunks, you can download it here, random_walk_biased.py.

That’s it folks. I hope you enjoyed this post. I really enjoyed coding it. It was fun.

This helps us to see the insight that plotting a class or set of classes can give to a programmer.

Happy pythoning.

Object Oriented Programming (OOP) in Python: Polymorphism Part 3

Polymorphism is a concept you will often find yourself implementing in code, especially in python. Polymorphism means essentially taking different forms. For example, a function implementing polymorphism might accept different types and not just a specific type.

There are different ways to show polymorphism in python. I will enumerate some here.

1. Python polymorphism with functions and objects.

In this implementation of polymorphism, a function can take objects belonging to different classes and carry out operations on them without any regard for the classes. One common example of this is the python len function, or what is called the python length function. You can read my blog post on the python length function here.

Here is an example of how the python length function implements polymorphism.
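
A minimal sketch with a made-up string and list:

```python
word = 'python'
letters = ['a', 'b', 'c']

print(len(word))      # 6
print(len(letters))   # 3
```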

You can see from the above code that the python len function can take a str object as argument as well as a list object as argument. These two objects belong to different classes.

Now, not only can built-in functions exhibit polymorphism, we can use it in our custom classes.
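
The original listing is not reproduced here. The sketch below shows one way a print_talk function could work with unrelated Dog and Person classes; the class bodies and the messages are assumptions made for illustration.

```python
class Dog:
    def __init__(self, name):
        self.name = name

    def talk(self):
        return f'{self.name} says woof'

class Person:
    def __init__(self, name):
        self.name = name

    def talk(self):
        return f'{self.name} says hello'

def print_talk(obj):
    # Works with any object that has a talk method, whatever its class.
    print(obj.talk())

for obj in (Dog('Bingo'), Person('Michael')):
    print_talk(obj)
# Bingo says woof
# Michael says hello
```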

I believe you have followed through with the earlier posts on python OOP so the example classes are self-explanatory. If not, you can get the links at the end of this post. I will just dwell on the print_talk function. You can see that we invoked the function in the for loop after creating the dog and person objects. Each time the print_talk function is invoked, different objects are passed to it and it successfully invoked their talk methods. Different objects but same function and different behavior.

2. Polymorphism with inheritance

Polymorphism with inheritance has been discussed in the python OOP inheritance post. I will just briefly highlight it here. It involves a method of the child class overriding the methods of the parent class. The code above shows it but let me point it out again. The method of the child class has the same name and arguments as the method of the parent class but the implementation of the method is different.

    
class Animal:
    
    type = 'Animal'

    def __init__(self, name):
        ''' name is a string '''
        self.name = name

    def talk(self):
        print(f'My name is {self.name} and I can talk')

    def walk(self):
        print(f'My name is {self.name} and I can walk')        

class Dog(Animal):

    legs = 4

    def __init__(self, name, bark):
        Animal.__init__(self, name)
        self.bark = bark 

    def talk(self):
        print(f'My name is {self.name} and '
              f'I can bark: "{self.bark}"')

    def walk(self):
        print(f'My name is {self.name} and '
              f'I walk with {self.legs} legs')

From the code above you can see that the Dog class inherits from the Animal class. The Dog class overrides the talk and walk methods of the Animal class and implements them differently. That is what polymorphism in inheritance is all about.

As your skill with python increases, you will find yourself implementing polymorphism in a lot of instances. It is a feature of python that is commonplace.

I hope you enjoyed my three part series on how OOP is implemented in python. The first part was on OOP in python classes and objects, the second part was on OOP in python class inheritance and this is the third part. I hope you have learned something new about python and will be determined to implement these features in your coding.

Happy pythoning.

Object Oriented Programming (OOP) in Python: Inheritance. Part 2

Inheritance as a concept refers to the ability of python child classes to acquire the data attributes and methods of python parent classes. Inheritance occurs many times in programming. We might find two objects that are related but one of the objects has a functionality that is specialized. What we might do is take the common attributes and functions and put them in a class and then take the special attributes and functions and put them in another class, then the objects with the special functions inherit the common attributes and functions.

For example we have persons and dogs. We know that all persons and dogs are animals with names and they can walk but dogs bark and persons speak words. So the commonality here is being animals. We can create a different animal class for the common features and then create separate dog and person classes for the special features.

The python class which inherits from another python class is called the python child class or derived class while the class from which other classes inherit is called the python parent class or base class.

The syntax for a python class inheritance from a parent class is as stated below:

    
class ChildClassName(ParentClassName):

    statement 1

    statement 2

Let’s take an example from the dog and person objects above. We could create a class for dog objects and another class for person objects and then make both inherit from the animal class, since that class holds their common features. The code could run like the one below:
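
The original listing is not reproduced here, so this is a sketch of how the three classes might look. The object names bingo and Michael follow the text; the messages and the constructor arguments are assumptions. The methods return strings instead of printing so that the results are easy to inspect.

```python
class Animal:
    def __init__(self, name):
        ''' name is a string '''
        self.name = name

    def talk(self):
        return f'My name is {self.name} and I can talk'

    def walk(self):
        return f'My name is {self.name} and I can walk'

class Dog(Animal):
    legs = 4   # class variable shared by all Dog objects

    def __init__(self, name, bark):
        Animal.__init__(self, name)   # the parent sets the common name
        self.bark = bark

    def talk(self):
        return f'My name is {self.name} and I can bark: "{self.bark}"'

    def walk(self):
        return f'My name is {self.name} and I walk with {self.legs} legs'

class Person(Animal):
    legs = 2   # class variable shared by all Person objects

    def __init__(self, name, words):
        Animal.__init__(self, name)
        self.words = words

    def talk(self):
        return f'My name is {self.name} and I can say: "{self.words}"'

    def walk(self):
        return f'My name is {self.name} and I walk with {self.legs} legs'

bingo = Dog('Bingo', 'woof')
Michael = Person('Michael', 'hello')
print(bingo.talk())     # My name is Bingo and I can bark: "woof"
print(Michael.talk())   # My name is Michael and I can say: "hello"
print(Michael.walk())   # My name is Michael and I walk with 2 legs
```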

You can see from the code that the python child classes, Dog and Person, call the python parent class constructor, Animal.__init__(), to initialize their names because name is a common feature for both of them. But Dog and Person have special features like bark and words that they use to talk differently. Also, notice that there is a class variable, legs, for Dog and Person that has a different value in each class. This is to emphasize their different number of legs, and these class variables apply to all their objects.

When a derived class definition is executed, it is executed the same way as the parent class and when the python child class is constructed through the __init__() method, the parent class is also remembered. When attribute references are being resolved, the parent class will also be included in the hierarchy of classes to check for such attributes. Just like the self.name calls we had for Dog and Person objects above. When the attribute is not found in the child class, it searches for it in the parent class. The same process occurs for method references. Notice that when I called bingo.talk() and Michael.talk() above, the program first searched the class definitions of the objects to find if there was a talk method defined therein. If there were none, it would have gone on to search the parent classes until the specified method is found.

Another python class inheritance feature you have to notice is that the python child classes in this example override the methods of the python parent classes. This feature is allowed in python class inheritance mechanism. In the example above, the Animal class has a talk and walk method but the child classes also implement a talk and walk method, overriding the talk and walk methods of the parent classes.

You can access the python parent class attributes and functions in a python child class by calling parentclassname.attribute or parentclassname.function. Try it out yourself in code and see. That is what python class inheritance is all about. Child classes own the artefacts of their parents.

Also, another feature you have to notice is that overriding methods may not just want to override the parent methods but they want to extend the parent methods by adding more functionality to what the parent can do. Let me give an example of how python extends classes in inheritance.
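
A sketch of how a Worker class and a CEO class that extends its works() method might look. The messages and the sample data are made up for illustration.

```python
class Worker:
    def __init__(self, name, company):
        self.name = name
        self.company = company

    def works(self):
        print(f'{self.name} works at {self.company}')

class CEO(Worker):
    def __init__(self, name, company):
        Worker.__init__(self, name, company)

    def works(self):
        Worker.works(self)   # reuse what the parent does
        print(f'{self.name} also owns {self.company}')   # then extend it

boss = CEO('Michael', 'Acme')
boss.works()
# Michael works at Acme
# Michael also owns Acme
```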

You can see that the worker class defines a worker by name and the company he works in. The CEO class inherits from the worker class and also defines the company he works in. But in the works() method, the CEO class calls the works method of Worker using Worker.works(self) and then extends it by printing out that the worker CEO also owns the company. So, you can see how one can extend a method from a child class. You call the method of the parent and add further functionality.

Python functions used to check python class inheritance.

Python has two built-in functions that you can use to check for python class inheritance in your objects. They are isinstance() and issubclass().

isinstance(): The syntax is isinstance(object, classinfo) and it checks if object is an instance of classinfo. That is, it checks for the type of the object. It returns True if object is an instance of classinfo and False otherwise. Examine the code below and run it to see how it works for our inheriting classes.
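
A minimal sketch, with the class bodies left empty to keep the focus on the inheritance relationships:

```python
class Animal:
    pass

class Dog(Animal):
    pass

class Person(Animal):
    pass

bingo = Dog()
print(isinstance(bingo, Dog))      # True
print(isinstance(bingo, Animal))   # True
print(isinstance(bingo, Person))   # False
```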

You will notice that since a dog is a childclass of an animal, it is also an instance of an animal, or it is of type Animal. So, a dog object is an instance (type) of Dog and also an instance (type) of Animal. But a dog is not an instance of a person because it is not inheriting anything from class Person. Please, note these differences.

issubclass(): This function is used to check for python class inheritance. The syntax is issubclass(class, classinfo) where class and classinfo are classes. The function returns True if class is a subclass of classinfo but False otherwise. Let’s use our example classes to check for inheritance.
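
A minimal sketch, again with empty class bodies:

```python
class Animal:
    pass

class Dog(Animal):
    pass

class Person(Animal):
    pass

print(issubclass(Dog, Animal))      # True
print(issubclass(Person, Animal))   # True
print(issubclass(Dog, Person))      # False
print(issubclass(Person, Dog))      # False
```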

You will notice that since Dog and Person classes are inheriting from Animal class, they are subclasses of the Animal class, but Dog and Person are not subclasses of each other because there is no inheritance relationship between them.
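A minimal sketch of the same checks with issubclass(), again assuming empty Animal, Dog and Person classes for illustration:

```python
class Animal:
    pass


class Dog(Animal):
    pass


class Person(Animal):
    pass


print(issubclass(Dog, Animal))     # Dog inherits from Animal
print(issubclass(Person, Animal))  # Person inherits from Animal
print(issubclass(Dog, Person))     # no relationship between the two
print(issubclass(Person, Dog))
```

Note that issubclass() takes classes, not objects, as its arguments.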

Types of python class inheritance: multilevel inheritance and multiple inheritance.

Python multilevel inheritance: Here we inherit the classes at separate multiple levels. C inherits from B and B inherits from A. For example, consider a case where Person inherits from Animal and Student inherits from Person. If you use isinstance() and issubclass() functions, you can see that Student by this multilevel class inheritance mechanism acquires the data attributes and methods of Animal. Let’s show this with an example.

You can see that the Student has a name attribute that is inherited from Animal even though it is directly inheriting from Person. This is because in the inheritance tree it is also inheriting from Animal. We show this when we call the isinstance() function at the last line.

This is an example of multilevel inheritance.
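Here is a small sketch of that three-level chain. The class bodies and the name "Mary" are assumptions made for illustration.

```python
class Animal:
    def __init__(self, name):
        self.name = name


class Person(Animal):
    pass


class Student(Person):
    pass


mary = Student("Mary")
print(mary.name)                    # name comes from Animal.__init__
print(issubclass(Student, Person))  # direct parent
print(issubclass(Student, Animal))  # grandparent
print(isinstance(mary, Animal))
```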

Python multiple inheritance: In multiple inheritance, a python child class can inherit from more than one python parent class. The syntax for multiple inheritance is:

    
class ChildClass(ParentClass1, ParentClass2, etc):

    statement 1

    statement 2
    

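The way python works is that when an object of the child class is searching for an attribute, it first looks for it in its own namespace. When it is not found there, it looks in ParentClass1 and in all the parent classes of that class, and if it is still not found, it then goes to ParentClass2 and so on, in the order they are written. (When the parent classes share a common ancestor, python adjusts this order so that the shared ancestor is searched only once, after the classes that inherit from it.)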
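To see the search order in action, here is a minimal sketch. The Swimmer, Walker and Duck classes are hypothetical; the __mro__ attribute is the standard way to inspect the order python will follow.

```python
class Swimmer:
    def move(self):
        return "swims"


class Walker:
    def move(self):
        return "walks"

    def stroll(self):
        return "strolls"


class Duck(Swimmer, Walker):
    pass


duck = Duck()
print(duck.move())    # found in Swimmer first, so Walker.move is never reached
print(duck.stroll())  # not in Duck or Swimmer, so python moves on to Walker
print([cls.__name__ for cls in Duck.__mro__])
```

Swapping the parents to class Duck(Walker, Swimmer) would make duck.move() return "walks" instead.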

Benefits of using python class Inheritance.

Inheritance as an OOP concept has many advantages that it gives to the programmer.

1. It lets the programmer write less code and avoid repetition. Any code that is written in the parent class is automatically available to the child classes.

2. It also makes for more structured code. When code is divided into classes, the structure of the software is better as each class represents a separate functionality.

3. The code is more scalable.

You can check out my other posts about OOP concepts in python like that about classes and objects, as well as that on python polymorphism.

Happy pythoning.

Object Oriented Programming (OOP) In Python: Classes and Objects Part 1

Computer scientists are continually refining how they write programs. Therefore, they have resorted to several methodologies to do so. Object Oriented Programming (OOP) is a popular methodology and it is the methodology that python relies on. Other programming languages that rely on the OOP methodology include Java and C++.


Object oriented programming, as the name implies, relies on objects as its main concept, along with other related features like classes, inheritance, abstraction and encapsulation. This is a wide departure from functional programming which depends on functions as its main concept. In OOP, every operation is defined as an object and every attribute is based on that owned by the object.

Object oriented programming has become popular because it brings programming close to real life, to the things people could associate with, and not to mathematical functions that are most times the province of professional scientists and mathematicians.

In python, we will start by describing how python implements classes and objects in OOP before we relate it to other features of OOP in python.

Python Class in OOP.

A class is a blueprint for defining data and behavior of objects. It helps us to define similar objects and relate them to a class. For example if you have two dogs, one called “James” and another called “Bingo”, these dogs are both objects with similar behavior and properties. Therefore, we could create a Dog class for both objects.

When we create a python class, we are bringing together the data and behavior of an object and defining them in a bundle. Therefore, creating a new python class is the same thing as creating a new type of object in python and with the ability that new instances of that type can then be made. For example, if we create a Dog class from the example above, new instances of dogs named ‘James’ and ‘Bingo’ can then be made. These class instances are given the attributes defined in the class to maintain their state, and they can also have methods defined by the class for modifying their state.

It is through the python class definition that we can then implement all the features of object oriented programming in python such as class inheritance allowing for multiple child classes, the overriding of the methods of a parent class by a child class, and a child class having methods that have the same name as the parent class. Note that the amount and kinds of data that objects can contain that are derived from a class is unlimited. Just like modules, classes in python are dynamic in nature since they are created at runtime and can be modified after creation.

The following is the syntax of a python class definition:

    
class ClassName:

    statement 1
    statement 2
    

To define a python class you need to use the keyword class to precede the name of the class. Then the name of the class is followed by a colon. The colon delimits the block of statements that represents what goes into a class like the attributes and methods that the class defines for its objects.

Before a python class definition can have any effect, it must be executed like function definitions. The moment you call a python class, you are creating an object, called class instantiation. You can create and call a class this way:

    
# class definition
class Dog:

    def method1(self):
        # the statements of the method go here
        pass

# class execution. Creates an object
james = Dog()
        
        

When a class is called, a new namespace is formed which includes all the attributes and methods of the class.

Most times when you are instantiating or creating an object, you define the instantiation special method, __init__(). This special method sets up the attributes that object instances need when they are created. Therefore, when this method exists, the moment you invoke the class to create an object, the object creation process automatically calls the __init__() special method and executes any statements that are contained within it.

For example, let’s take a class Animal that specifies that whenever an Animal object is created, it has to be given a name that would be bound to the object. We could write the code with the __init__() special method this way.

    
# class definition
class Animal:
    
    def __init__(self, name):
        ''' name is a string '''
        self.name = name

# class execution. Creates an object
james = Animal('James')

With the code above any Animal object that is created will be supplied a name. Here we gave the animal the name, James. That name is bound to the python object, james, throughout its lifetime and is unique to it.

Python classes also contain data attributes and methods. The data attributes are of two types: data attributes that are generalized for the class and are applicable to all objects (called class variables) and data attributes that are specific to each instance of a class or each object (called instance variables). I will show how class variables are distinguished from instance variables in another section. But note that the ‘name’ attribute for our Animal class above is an instance variable because it pertains to each specific object of the class.

Python class methods are operations that will be performed by the objects of the class. A method is a function that “belongs to” an object. We add the self parameter as the first parameter when defining methods in a class. When a method is called on an object, python passes that object in as self, so the method can reach the object’s own attributes. The name self is only a convention and python does not enforce it, although many editors will flag a method whose first parameter is not named self. Let us illustrate an example of an animal that walks and talks below.

Notice that every method of the class has self as the first parameter. The methods walk and talk define what each object of the animal can do.
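A minimal sketch of such a class is shown here. The wording of the messages is made up, and the methods return strings so they are easy to check.

```python
class Animal:
    def __init__(self, name):
        ''' name is a string '''
        self.name = name

    def walk(self):
        # self is the object the method was called on
        return f"{self.name} is walking."

    def talk(self):
        return f"{self.name} is talking."


james = Animal("James")
print(james.walk())
print(james.talk())
```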

Python Objects and OOP.

Objects are the physical entities of the world. They serve as a realization of classes which are the logical entities. An object can be anything: a book, a student, an animal etc. Objects are defined by three important characteristics: (A). An identity or a piece of information that can be used to identify the object e.g. name, number. (B). Properties. The attributes of the object. (C). Behavior. This refers to the operations that are done on the object or the functions it can perform. E.g. a student can write, a car can move, an animal can walk.

Objects of a python class support two types of operations: attribute reference and instantiation. I have outlined these two operations in the embedded code above but will bring them out again for emphasis.

In python, the standard syntax for attribute references is the dot syntax. In the above code, when referring to the method walk and talk, we used objectname.walk() and objectname.talk(). Also, in the walk and talk methods when referring to the name data attribute we used self.name. Valid attribute names are all the names that are in the class’s namespace when the python object was created following from the class definition.

Class instantiation that creates a python object uses function notation. Class instantiation results in the creation of an object for the class. We denoted class instantiation above with the code: james = Animal('James') where Animal refers to the class Animal. Animal here is being used as a function to create the object james. The class instantiation function can have an argument or no argument based on how you defined your __init__() method. In our __init__() function above we specified that on object creation we need to supply a name for the object.

Python Class and instance variables in OOP.

As we said above, class variables refer to data and methods that are shared by all instances of the class, while instance variables refer to data and methods that are specific and unique to each instance of a class. If you want an attribute to be a class variable, you should not initialize it in the __init__() method but if you want it to be an instance variable you should initialize it in the __init__() special method. Let’s denote these with examples below.

From the example above you can see that we created two Animal objects, james and frog. In the class definition we defined the type attribute outside the __init__() method and therefore when we call it for both objects we have the same value or reply. But we defined the name attribute inside the __init__() method and then when we called the name attribute we received different values for both objects. Always remember this difference between class variables and instance variables in your code so you don’t get code that doesn’t work as you expect when creating classes and objects.
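For reference, a sketch along those lines could look like this. The value "living thing" for the type attribute is an assumption, since the original value is not shown.

```python
class Animal:
    type = "living thing"  # class variable: shared by every Animal object

    def __init__(self, name):
        self.name = name   # instance variable: unique to each object


james = Animal("James")
frog = Animal("Frog")

print(james.type, "|", frog.type)  # same value for both objects
print(james.name, "|", frog.name)  # different value for each object
```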

Data Hiding in Python.

Other object oriented programming languages like Java give you the ability to hide data so that users cannot access it from outside the class. In this way, they make data private to the class. The makers of python do not want any data hiding in the language implementation. They state that they want everything in python to be transparent. But there are occasions where you may want to implement data hiding. This can be achieved by prefixing the attribute name with a double underscore, __. When this is done you cannot directly access the attribute by that name outside the class.

For example, let’s make the type attribute hidden in the Animal class.

You can see now that to reference the type attribute we get an AttributeError. But there is a workaround that we can use to get the attribute. When you call objectname._Classname__attributename you can get back that attribute. So nothing is hidden in python. Let’s show this with an example. Take note of line 16 in the code below.
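Both behaviours can be sketched in a few lines. This is not the post's original listing, so its line numbers will not match, and the value "living thing" is assumed.

```python
class Animal:
    __type = "living thing"  # the double underscore triggers name mangling

    def __init__(self, name):
        self.name = name


james = Animal("James")

try:
    print(james.__type)          # direct access fails outside the class
except AttributeError as error:
    print("AttributeError:", error)

print(james._Animal__type)       # the workaround: _Classname__attributename
```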

It is beneficial to understand how python implements OOP extensively because when you are working in python, you not only use the built in types but have to create your own types. I hope this post helped you along that line. The subsequent posts will treat other OOP concepts like python class inheritance in OOP and OOP in python - polymorphism that will build on this knowledge. I hope you also enjoy them.

Happy pythoning.

The 0/1 Knapsack Problem with Dynamic Programming In Python Simplified

This post is related to two previous posts. First, the original post on the 0/1 knapsack problem where I explained how this can be solved using a greedy algorithm that doesn’t promise to be optimal and using a brute force algorithm that makes use of the python combinations function. After going through that post, a reader asked me to explore solving the knapsack problem using dynamic programming because the brute force algorithm might not scale to a larger number of items. In the other related post, which was on dynamic programming, I highlighted the main features of dynamic programming using a Fibonacci sequence as example. Now, drawing on these two related posts, I want to show how one can code the knapsack problem with dynamic programming in python that can scale to very large values.


But first, you have to realize that this problem is good for a dynamic programming approach because it has both overlapping sub-problems and optimal substructure. That is the reason why memoization can be easily applied to it.

Let’s start by analyzing the nature of the problem. If you read the post on the knapsack problem, I said that a burglar has a knapsack with a fixed weight capacity and he is trying to choose which items to steal from several items such that he maximizes the total value of the items without exceeding the capacity of the knapsack. Now, to show you how we can do this with dynamic programming in python, I will diverge a little from the example I gave in the former post and use a much simpler example in order to show the graphs that would help us do this dynamically.

Let’s say the burglar has 4 items to choose from: a, b, c, and d with values and weights as outlined below.

Name Value Weight
A 6 3
B 7 3
C 8 2
D 9 5

To model the choice process for the burglar, we will use an acyclic binary tree. An acyclic binary tree is a binary tree without cycles and it has a root node with each node having at most two child nodes. The end nodes of the tree are known as leaf nodes and they have no children. For our binary tree, we will denote each branch of a node as the decision to pick an item or not to pick an item. The root node contains all the items with none picked. Then the levels below the root node denote the decision to pick an item, starting from the first, ‘a’. The left branch denotes the decision to pick an item and the right branch the decision not to pick an item.

We create each decision node with four sets of features or a quadruple in this order:

1. The set of items that are taken for that level

2. The list of items for which a decision has not yet been made

3. The total value of the items in the set of items taken

4. The available weight or remaining space in the knapsack.

Modeling this in a binary tree we will get a tree looking like this. Each node is numbered.

[Figure: binary decision tree for the knapsack example, with numbered nodes]


You can see from the above diagram that each node has the four features outlined above in order. For Node 0, the root, the set of items that are taken are the empty set, the list of items to make a decision is all the items, the total value of items taken is 0, and the remaining space in the knapsack is 5. At the next level, when we choose item ‘a’, the quadruple changes in node 1. We chose the left branch, node 1, because the weight of item ‘a’, 3, is less than the remaining space which is 5. Choosing the left branch, item ‘a’, then reduces the available space to 2, and changes the other items for that node. The right branch to the root node is node 6 which involves the decision not to choose item ‘a’. If we decide not to choose item ‘a’ but remove it from the list, then we see that the set of items taken remain empty, the list of items has reduced to only [b,c,d], the total value of items taken is still 0 and the remaining space is still 5. We do this decision structure, making the decision to choose or not to choose an item and updating the node accordingly until we get to a leaf node.

Characteristics of a leaf node in our graph: a leaf node is a node that either has the list of items not taken to be empty, or the remaining space in knapsack to be 0. Note this because we will reflect this when writing our recursive algorithm.

Now, this problem has overlapping sub-problems and optimal substructures which are features any problem that can be solved with dynamic programming should have. We will refer to the graph above in enumerating these two facts.

Overlapping sub-problems: If you look at each level, you will notice that the list of items to be taken for each level is the same. For example in level 1 for both the left and right branch from the root node, the list of items to be taken is [b,c,d] and each node depends on the list of items to be taken as well as the remaining weight. So, if we can find a solution for the list of items to be taken at each level, and store the result (remember memoization) we could reuse this result anytime we encounter that same list of items in the recursive solution.

Optimal substructure: If you look at the graph, you can see that the solutions to nodes 3 and 4 combine to give the solution for node 2. Likewise, the solutions for nodes 8 and 9 combine to give the solution for node 7. And each of the sub-problems has its own optimal solution that would be combined. So, the binary tree representation has optimal substructure.

Having noted that this problem has overlapping sub-problems and an optimal substructure, we can now write an algorithm using the memoization technique for the knapsack problem. If you want a refresher on the dynamic programming paradigm with memoization, see this blog post which I have earlier referenced.

So, we will take a pass at the code for the technique, using the 4 items outlined above. We will be using a knapsack with a weight of 5kg as the maximal capacity of the knapsack. Run the code below to see how it works. I will explain the relevant parts later.

Now let me explain the relevant parts of the code.

Lines 1 – 20: We created the items class with attributes name, value, and weight. Then a __str__() method so we can print each item that is taken.

Lines 22 – 52: This is the main code that implements the dynamic programming paradigm for the knapsack problem. The name of the function is fast_max_val and it accepts three arguments: to_consider, avail, and a dictionary, memo, which is empty by default. to_consider is the list of items that are yet to be considered or that are pending. Remember, this list, along with the avail variable which refers to the weight remaining in the knapsack, is what creates our overlapping sub-problems feature. The code uses a recursive technique with memoization, implemented as a decision control structure of if, elif and else statements.

The first if statement checks, when we come to a node, whether the pair of ‘length of items to consider’ and ‘weight remaining’ is already in the dictionary. If it is in the dictionary, memo, the stored tuple is taken as the result for that node.

If not, we move to the next elif statement, lines 30-31, which checks whether this is a leaf node. If this is a leaf node, then either the list of items to consider is empty or the remaining weight is zero (knapsack full), so we assign an empty result, with no items taken, to result.

Then the next elif statement considers the area of branching. Notice from the graph above that the decision tree involves a left branch (take the next item if weight is within limits) and a right branch (don’t take the next item). This elif statement considers the case where the left branch cannot be taken and only the right branch is explored. That means for that level, the right node is the optimal result for our parent node. What we do here is call the fast_max_val function recursively to check for child nodes and store the final result as tuples.

Then finally, the last else statement. This considers the case where the two branches can be approached. It recursively calls the child nodes of the branches and stores the final optimal result in the variables with_val and with_to_take for the left branch and without_val and without_to_take for the right branch. Then it evaluates which of these two branches has the higher value (remembering we are looking for the highest value in each parent node) and assigns that branch to the result.

After the recursion is complete for each node, it stores the result in our dictionary, before calling for any other recursion. Storing this in dictionary is to ensure that when a similar subproblem is encountered, we already have a solution to that problem, thereby reducing the recursive calls for each node.

Finally, the function returns the result as a tuple of optimal values and list of items that were taken.
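For readers who cannot run the embedded code, here is a sketch that follows the description above. It is not the original listing, so the line numbers quoted in this post do not apply to it. The names fast_max_val, to_consider, avail, memo, with_val, with_to_take, without_val and without_to_take come from the description; the Item class layout and the use of memo=None in place of an empty dictionary default are assumptions.

```python
class Item:
    def __init__(self, name, value, weight):
        self.name = name
        self.value = value
        self.weight = weight

    def __str__(self):
        return f"<{self.name}: value={self.value}, weight={self.weight}>"


def fast_max_val(to_consider, avail, memo=None):
    """Return a tuple (best total value, tuple of items taken)."""
    if memo is None:  # a fresh memo for every top-level call
        memo = {}
    key = (len(to_consider), avail)
    if key in memo:
        # This sub-problem was solved before: reuse the stored result.
        result = memo[key]
    elif not to_consider or avail == 0:
        # Leaf node: nothing left to consider or the knapsack is full.
        result = (0, ())
    elif to_consider[0].weight > avail:
        # Next item does not fit: only the right branch is explored.
        result = fast_max_val(to_consider[1:], avail, memo)
    else:
        next_item = to_consider[0]
        # Left branch: take the next item.
        with_val, with_to_take = fast_max_val(
            to_consider[1:], avail - next_item.weight, memo)
        with_val += next_item.value
        # Right branch: do not take the next item.
        without_val, without_to_take = fast_max_val(
            to_consider[1:], avail, memo)
        if with_val > without_val:
            result = (with_val, with_to_take + (next_item,))
        else:
            result = (without_val, without_to_take)
    memo[key] = result
    return result


items = [Item("a", 6, 3), Item("b", 7, 3), Item("c", 8, 2), Item("d", 9, 5)]
best_value, taken = fast_max_val(items, 5)
print("Total value of items taken =", best_value)
for item in taken:
    print(item)
```

For the four items in the table and a capacity of 5, the best choice is items b and c, with a total value of 15.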

Lines 55-66: These lines contain the items builder code. It builds the items and then calls the dynamic programming function. Finally when the function returns, it prints out the items that were taken for the optimal solution to the knapsack problem.

If you would like the code to look at it in-depth and maybe run it on your machine, you can download it here.

Now, the beauty of dynamic programming is that it makes recursion scale to high numbers. I want to show that this is possible with the knapsack problem.

I provide another code that implements a random items generator. The test code uses 250 items generated randomly. You can try it on higher number of random items with random values and weights.
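A rough, self-contained sketch of such a test is given here. The helper name build_random_items, the value and weight ranges, the capacity of 100 and the fixed seed are all assumptions. This variant represents each item as a (value, weight) pair and memoizes on (start index, weight remaining) instead of slicing the list.

```python
import random


def build_random_items(num_items, max_value, max_weight, seed=None):
    """Return a list of (value, weight) pairs of random positive integers."""
    rng = random.Random(seed)
    return [(rng.randint(1, max_value), rng.randint(1, max_weight))
            for _ in range(num_items)]


def fast_max_val(items, avail, start=0, memo=None):
    """Best (total value, indices taken) using items[start:] within avail."""
    if memo is None:
        memo = {}
    key = (start, avail)
    if key in memo:
        return memo[key]
    if start == len(items) or avail == 0:
        result = (0, ())  # leaf node
    else:
        value, weight = items[start]
        # Right branch: skip this item.
        result = fast_max_val(items, avail, start + 1, memo)
        if weight <= avail:
            # Left branch: take this item, since it fits.
            with_val, with_taken = fast_max_val(
                items, avail - weight, start + 1, memo)
            if with_val + value > result[0]:
                result = (with_val + value, (start,) + with_taken)
    memo[key] = result
    return result


items = build_random_items(250, max_value=90, max_weight=20, seed=0)
best_value, taken = fast_max_val(items, 100)
print("Best value:", best_value, "using", len(taken), "items")
```

With 250 items the recursion goes about 250 calls deep, which is within python's default recursion limit; much longer item lists would need a higher limit or a bottom-up table.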

The explanation for the codes is similar to the smaller sample above. If you want to take an in-depth look at the code for the higher number of random items, you can download it here.

Happy pythoning.

Python Pow() function For Python Power

The python pow() function, most times called python power function, is a built-in function for calculating the power of a number when raised to an exponent. It comes in very handy several times while doing mathematical operations.


The syntax of the python power function is pow(base, exp[, mod]) where base is the number whose power you are looking for, exp is the exponent to which you will raise the base or number, and the optional mod is a modulus integer you might wish to use for the result. In simple terms, it returns the base to the power of the exp.

This is a very simple function to use. In fact, it is one of the simplest I have found in the built-in python functions.

Let’s illustrate its usage with examples.

1. When base is positive and exp is positive.

This is simply the act of raising the base, or number, to the exponent, exp. In literal terms, base ** exp. Consider the example below:

The base is 4 and the exponent is 2. So 4 raised to power 2 gives 16. Very easy to read.
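In code, that case looks like this small sketch:

```python
base = 4
exp = 2
n = pow(base, exp)
print(n)              # 4 raised to the power of 2
print(n == 4 ** 2)    # pow(base, exp) matches the ** operator
```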

2. When base is negative and the exp is positive.

This is similar to the above. Just raise the base to the exponent.
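A quick sketch with a negative base follows; the numbers are chosen only for illustration.

```python
even_power = pow(-4, 2)  # an even exponent gives a positive result
odd_power = pow(-4, 3)   # an odd exponent keeps the negative sign
print(even_power, odd_power)
```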

3. When base is positive or negative but the exponent is negative.

In this case, when the exponent is negative, the result is no longer an int type but a floating point type. Note this difference.

This corresponds to raising fractions to an exponent. That is why I so love python. You can use it to do so many different things. It gives just the right results for any calculation you assign to it.
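As a small sketch of the negative exponent case (values chosen for illustration):

```python
positive_base = pow(4, -2)   # same as 1 / (4 ** 2)
negative_base = pow(-4, -3)  # same as 1 / ((-4) ** 3)
print(positive_base, type(positive_base))
print(negative_base, type(negative_base))
```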

4. When you use it with the three arguments, pow(base, exp, mod)

When you use the third optional argument, mod, which means modulus, you are doing an operation which takes the modulus of the result from raising the base by the exponent. Let us illustrate with some examples:

In the above code this is what is happening. The pow() function first raises 4 to the power of 2, the exponent. The result is 16. Then it does 16 modulus 5 which is 1. That’s it.
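The three argument form can be sketched like this, using the same numbers as the explanation:

```python
base = 4
exp = 2
modulus = 5
n = pow(base, exp, modulus)
print(n)
print(n == (base ** exp) % modulus)  # same answer as doing it in two steps
```

The three argument form is usually preferred for large numbers because it does not need to build the full power first.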

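Do you know that you can even combine a negative exponent with a modulus? Yes, this function gives you the power to do that, provided base, exp and mod are all integers and base is relatively prime to mod. Let’s illustrate by raising 4 by negative 2, -2, and getting the result modulo 5.

    
base = 4
exp = -2
modulus = 5
n = pow(base, exp, modulus)
print(n)

This new power was introduced in python 3.8. The embedded python interpreter is upgraded only to python 3.6. So, you can run it on your machine and see that the result is 1. Note that this is not 0.0625 modulus 5. With a negative exponent, pow() first finds the modular inverse of the base, that is, the number which when multiplied by 4 leaves a remainder of 1 on division by 5. That number is 4, since 4 * 4 = 16 and 16 modulus 5 is 1. It then raises this inverse to the power of 2 and takes the modulus: 16 modulus 5 gives 1.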

That’s it. I had a swell time introducing you to this powerful python power function. Play and use it to your heart’s delight. Leave a comment about your findings. I would love to see some comments about this powerful function.

Happy pythoning.

Parsing HTML Using Python

HTML, also known as Hypertext Markup Language, is used for creating web pages and is a standard markup language. If you have ever seen the code of any website or blog, you were most probably reading the HTML code of the page. In this post, we want to parse some HTML strings. By parsing, I mean analyzing the strings syntactically for specific symbols based on the components in the string.


We will be using the python class HTMLParser that is contained in the module html.parser for this activity.

All instances or objects of the HTMLParser class are able to parse the markup of text that contains HTML and identify the tags and references that are contained in the text.

To use the HTMLParser class we need to first import it from html.parser module like this: from html.parser import HTMLParser. After that, you have to inherit from the HTMLParser class and then override any method you want to use.

But before we start showing that in code, I will first describe some of the methods of the HTMLParser class we will be overriding.

Methods of the HTMLParser class

The HTMLParser class has several methods which are divided into methods that belong to every instance of the class and methods that are called when HTML markup is encountered.

1. Methods of HTMLParser instance.

The two methods that are of interest to us for this post are the HTMLParser.feed() method and HTMLParser.close() method.

The HTMLParser.feed method has the syntax HTMLParser.feed(data) where data is the text containing markup you want to parse. Processing takes place as far as data consists of complete elements; incomplete data is buffered until more data is fed or the close() method is called.

The HTMLParser.close method processes all buffered data as if the end-of-file marker has been encountered, thereby closing processing. This method can be redefined in user-defined classes, but the redefined version should always call the close() method of the HTMLParser base class.

So, that’s it. Let’s move on to the methods that are used for handling encountered markup.

2. Methods for operating on HTML markup.

The HTMLParser class has several methods for operating on HTML markup, but I will deliberate on only those of interest in the code I will be writing in this blog. They are the common ones. When you know the basics, you can apply other methods to your choosing.

The methods are:

a. Method to handle start tags.

The syntax for this method is HTMLParser.handle_starttag(tag, attrs) and when a start tag in the markup is encountered, it is called with the tag and its corresponding attributes as arguments. So, in your handler code you can directly reference the tag. The attributes are denoted as a tuple of name, value like (name,value) enclosed in a list. So, you can use your for loop or any other method to extract the items in the tuple.

b. Method to handle end tags.

The syntax is HTMLParser.handle_endtag(tag) and the tag is always converted to lower case when markup is encountered.

c. Method to handle tags that have no corresponding end tags, or empty tags.

There are some HTML tags that do not have corresponding end tags like the <br /> tag. This method is designed to handle those when they are written in XHTML style, that is, with a closing slash. The syntax for the method is HTMLParser.handle_startendtag(tag, attrs). The structure of the tag and attrs arguments is similar to that of the method to handle start tags.

d. Method to handle comments.

HTML markup can contain comments which can be parsed. This method is used to handle them. The syntax is HTMLParser.handle_comment(data) where data is any text that is contained within the <!-- data --> comment.

e. Method to handle data between tags

When there is text to be rendered between start and end tags, this method is used to handle that text. The syntax is HTMLParser.handle_data(data) where data is the text that is contained in between start and end tags.

Now that the methods are outlined, let me show how to apply them.

Let’s write a simple code that handles start and end tags when they are encountered, as well as tags that do not have corresponding end tags, or empty tags. The code will ignore comments when encountered in the markup. You can run the code to see the output.

Let me explain relevant parts of the code based on their lines. On line 1 I imported HTMLParser and then on line 3 I created a class, MyHtmlParser, that inherits from HTMLParser. Then inside the class I overrode the methods to handle start tags, empty tags, and end tags, along with comments. For the comments methods, I told the code to ignore all comments. For the end tag handler, I just printed the name of the end tag but preceded by the text, End. For the start tag and empty tags, I first printed out the tag and then printed out the attributes and value, taking a cue from the fact that the attribute and values are stored as pair of tuples.

That’s it. The driver code starts from lines 30 to 35 where I created an instance of MyHtmlParser class called parser and then fed the instance the html text in the variable, text, through the instance’s feed method.
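The listing itself is not reproduced here, so the following is a sketch that follows the description; the line numbers mentioned above will not match it. The class name MyHtmlParser and the variables parser and text come from the description, while the sample markup, the record helper and the events list are assumptions added so the result can be inspected.

```python
from html.parser import HTMLParser


class MyHtmlParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.events = []  # assumed extra: keeps a record of what was seen

    def record(self, kind, tag, attrs=()):
        self.events.append((kind, tag, list(attrs)))
        print(kind, ":", tag)
        for name, value in attrs:  # attrs is a list of (name, value) tuples
            print("->", name, ">", value)

    def handle_starttag(self, tag, attrs):
        self.record("Start", tag, attrs)

    def handle_endtag(self, tag):
        self.record("End", tag)

    def handle_startendtag(self, tag, attrs):
        self.record("Empty", tag, attrs)

    def handle_comment(self, data):
        pass  # ignore all comments


text = ('<html><head><title>Title</title></head><body>'
        '<!-- a comment --><p class="intro">Hello<br /></p></body></html>')
parser = MyHtmlParser()
parser.feed(text)
parser.close()
```

The text between the tags is skipped because handle_data is not overridden and the default version does nothing.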

That code was cool but plain simple.

Let’s take another simple code. This time a code that would count the tags and their frequencies in a markup. This promises to be more cool. Just run it to see how it goes.

This time in the code, since we are counting tags, we leave out the end tags handler since when we have identified a start tag that is enough so as not to double count. What the code does is that when any tag is identified, it checks to see if it is in the dictionary, parser_dict, and if it is not in it, it adds it and increments the count by 1 but if it is already in the dictionary, it only increments the count by 1. Just cool.
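A minimal sketch of such a counter is shown here. The dictionary name parser_dict comes from the description; the class name TagCounter, the count helper and the sample markup are assumptions.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.parser_dict = {}

    def count(self, tag):
        # Add the tag on first sight, then increment its count.
        if tag not in self.parser_dict:
            self.parser_dict[tag] = 0
        self.parser_dict[tag] += 1

    def handle_starttag(self, tag, attrs):
        self.count(tag)

    def handle_startendtag(self, tag, attrs):
        self.count(tag)  # empty tags such as <br /> are counted once


text = '<div><p>One</p><p>Two<br /></p><img src="a.png"></div>'
parser = TagCounter()
parser.feed(text)
parser.close()

for tag, frequency in parser.parser_dict.items():
    print(tag, frequency)
```

The standard library's collections.Counter could replace the manual dictionary check if you prefer.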

I hope you enjoyed my code for today. I promise to be bringing you more interesting code that shows how python works and how you can use python to its fullest. Just subscribe to the blog to get updates.

Happy pythoning.

Dynamic Programming In Python: Using Fibonacci Sequence

Dynamic programming is a programming method that aims to optimize solutions to problems. It was developed by Richard Bellman in the 1950s and has since become popular. In dynamic programming, we aim to break down a complicated bigger problem into simpler sub-problems in a recursive manner. If in breaking down the problem into sub-problems we encounter the same sub-problems again, the problem is said to have overlapping sub-problems. Then we try to find the optimal solutions to the sub-problems, which will then be combined to get the overall solution. Such a problem is said to have an optimal substructure.


Most times when we have solved the smaller sub-problems we store the results of the solution so that when similar sub-problems are encountered, the results can be reused. This concept in dynamic programming is called memoization. Memoization not only speeds up computing, it also saves on computer resources.

In dynamic programming, we will be using the memoization technique a lot.

To illustrate how dynamic programming is used in practice let us take a problem that has both overlapping sub-problems and an optimal substructure. That is the problem of finding the Fibonacci sequence. We will be using the Fibonacci sequence as an example of dynamic programming in python.

The first ten Fibonacci numbers are: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34. Each item in the sequence is the sum of the two previous items, except when n is 0 or 1. In those instances, the item for n = 0 is 0 and the item for n = 1 is 1.

Traditionally, the Fibonacci sequence can simply be solved using a recursive algorithm. Just like the one below:

In the code, lines 4 and 5 handle the cases where n is either 0 or 1. If n is neither of these, we recursively compute the sum of the previous two Fibonacci numbers.
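A minimal sketch of such a recursive function is shown here. The name fib is an assumption, and the line numbers mentioned in the text refer to the author's own listing, so they may not line up with this sketch.

```python
def fib(n):
    """Return the n-th Fibonacci number using plain recursion."""
    # Base cases: fib(0) is 0 and fib(1) is 1.
    if n == 0 or n == 1:
        return n
    # Every other item is the sum of the two previous items.
    return fib(n - 1) + fib(n - 2)


print([fib(i) for i in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```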

This solution is simple but not efficient, because the complexity of the algorithm is exponential. Therefore, it does not scale to large Fibonacci numbers. If you call it with n being 120, it would take more than 250,000 years to finish. Quite some age. Therefore, it could do with some optimization.

Now, let’s see how dynamic programming with python can help.

First, recall that dynamic programming is applicable when a problem has overlapping sub-problems and each sub-problem has an optimal solution that can be combined with the others to give the overall solution. That is the situation with the recursive calls above. Let’s illustrate it with a picture. For example, look at the number of calls that are generated when we are looking for the Fibonacci of 6.

python fibonacci sequence diagram


Each call is a sub-problem. You can see that fib 2 was called 5 times while fib 3 was called 3 times before getting the final result. Since we can generate a solution for each of the sub-problems, it would be nice to generate that solution just once and reuse the result every time the sub-problem is called. Storing the result and reusing it for subsequent calls is called memoization: creating a memo of previously encountered problems that have already been solved.
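The call counts can be checked with a small sketch that records every call made while computing the Fibonacci of 6. The function name fib and the Counter named calls are illustrative choices, not taken from the original listing.

```python
from collections import Counter

calls = Counter()  # n -> how many times fib(n) was called


def fib(n):
    calls[n] += 1
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


fib(6)
for n in sorted(calls):
    print(f"fib({n}) was called {calls[n]} time(s)")
print("total calls:", sum(calls.values()))
```

For n = 6 this reports 5 calls of fib(2), 3 calls of fib(3), and 25 calls in total, all to produce a single number.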

Using this concept, we can now write a second code for the Fibonacci sequence based on dynamic programming using python.

In the dynamic programming code, we are using memoization with a dictionary, fib_dict, to store the already known results of sub-problems, and these results are looked up when the same sub-problems are encountered again. Lines 1 to 7 contain the code for the Fibonacci computing function, fast_fib. In line 9 we created the dictionary and gave it the initial values of the Fibonacci sequence. Then in line 11 we called fast_fib. In the function, we first check whether the key, n, is already in the dictionary. If it is, we return its value; if it is not, we compute the Fibonacci number recursively and store the result so that it can be used in subsequent calls. This is done for every n that is needed. Finally, the function returns the value for the key, n, in the dictionary, where n is the position of the Fibonacci number we are looking for.
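A minimal sketch of the memoized version described above could look like this. The names fast_fib and fib_dict come from the text; passing the dictionary in as a second argument is an assumption of this sketch, and the line numbers in the text refer to the author's own listing.

```python
def fast_fib(n, fib_dict):
    """Return the n-th Fibonacci number, caching results in fib_dict."""
    # Reuse the stored result if this sub-problem was already solved.
    if n in fib_dict:
        return fib_dict[n]
    # Otherwise solve it once and store the result for later calls.
    fib_dict[n] = fast_fib(n - 1, fib_dict) + fast_fib(n - 2, fib_dict)
    return fib_dict[n]


# Initial values of the Fibonacci sequence.
fib_dict = {0: 0, 1: 1}

print(fast_fib(120, fib_dict))
```

Each n is computed only once, so the call for n = 120 returns immediately. For very large n this recursive form can run into Python's recursion limit, in which case filling the dictionary in a loop from 2 up to n is an alternative.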

This dynamic programming implementation runs in linear time, so it scales far better than the earlier recursive Fibonacci code. I just so love it.

I hope you now know how dynamic programming works and how to implement it in other problem spaces.

Happy pythoning.

Python Range() Function: The Complete Guide

Imagine you want to loop over some numbers that are defined by a sequence. What do you do? Start creating the list of numbers by hand? Nope, not in python. Python provides a convenient data type that does just that for you: the python range type, which produces a sequence when you call the python range function.

python range function

 

What is the python range function?

The python range function creates a range object, which is an immutable sequence of numbers. The syntax of the python range function is of two types: range(stop) and range(start, stop[, step]). The first, with only the stop argument, is used when you want consecutive numbers counting up from 0, while the second is used when you want to explicitly define the sequence with a starting integer and, optionally, the step to take.

The python range function produces a range object which is a sequence. Therefore, you can extract the elements by casting it to a list or using it in a for loop.

Let’s take the syntax with examples.

1. range(stop) examples

With the python range function specifying only the stop argument, you are asking the function to create an immutable sequence that starts from 0 and ends at the number just before the stop integer, creating integers consecutively. That is, the stop integer is not included in the range object that is created but acts as a boundary.

For example, run this code to see how it works.

You can see from the code that I first created the range object by specifying the stop to be 20. Then I extracted the numbers in the range object by casting it to a list, which prints out a sequence that runs from 0 to 19; remember the stop, here 20, is not included in the sequence that is created.
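A short sketch of that example, with r as an illustrative variable name:

```python
r = range(20)

print(r)        # range(0, 20)
print(list(r))  # the integers 0 through 19; 20 itself is left out
```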

Note that if the stop argument is zero or negative, the range object is still created but it is empty, so it produces no numbers.
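A quick check of that boundary case, as a small sketch you can run in any Python 3 interpreter:

```python
print(list(range(0)))   # []
print(list(range(-5)))  # []
print(len(range(0)))    # 0
```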

We can bring out the elements of the sequence using a python range for loop but that would be for the next examples on the second syntax.

2. range(start, stop[, step]) examples

This is the second way of using the python range function, and you use it when you want to customize the sequence that is produced. The arguments to the function in this syntax are start, stop, and an optional step. All the arguments must be integers, and step cannot be zero. The default for the start argument is 0 and the default for the step argument is 1. By changing the defaults, we can customize the sequences we want to create.

For example, suppose we have the arithmetic sequence: 1, 4, 7, 10, 13, 16. How do you create this sequence using the range function? Simple. You can see that the sequence starts from 1. So in the range function, we denote start as 1. Also, you can see that the last item in the sequence is 16, and since stop is not included in the sequence but acts as a boundary, in the range function we denote the stop as 17. Then we can see that each item differs from the one before it by 3. So, we denote the step in our range function as 3. Therefore, the range function we will use is: range(1, 17, 3). This is how the code could be written:

When you run it, you can see that it reproduces just our arithmetic sequence.
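One way the code could be written, as a minimal sketch using a for loop (the variable name sequence is illustrative):

```python
sequence = range(1, 17, 3)

for number in sequence:
    print(number)  # prints 1, 4, 7, 10, 13, 16, one per line
```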

Now there are some important points you need to note about this syntax using three arguments.

You can have both positive steps and negative steps. That means, as you can create a sequence going forwards, you can also create the sequence going backwards by just making the step a negative value. The example arithmetic sequence I gave above is a sequence that goes forwards. Now, we can make it go backwards by changing the start, stop, and step values.

Notice that the last item is 16 and, going backwards, we want to start from it. So, now our start will be 16. The first item above was 1, and going backwards it will be our last item, so our stop will be the boundary just past 1, which is 0. Then our step is going backwards three steps, and therefore, -3. So, we would create our range function as: range(16, 0, -3). Run the code below to see it.

What I did was switch the start and stop to reflect the backwards movement and made step to be negative.
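A minimal sketch of the backwards range, with backwards as an illustrative variable name:

```python
backwards = range(16, 0, -3)

print(list(backwards))  # [16, 13, 10, 7, 4, 1]
```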

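If you are confused, just notice that the function uses the formula: r[i] = start + step*i, where i starts from 0 and r[i] denotes the item at index i. Items are produced as long as r[i] < stop for a positive step, or r[i] > stop for a negative step.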
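To see the formula at work, this small sketch recomputes every item of the backwards range from start and step and compares the result with what range itself gives. The variable names are only illustrative.

```python
start, stop, step = 16, 0, -3
r = range(start, stop, step)

# Recompute each item from the formula r[i] = start + step * i.
computed = [start + step * i for i in range(len(r))]

print(computed)             # [16, 13, 10, 7, 4, 1]
print(computed == list(r))  # True
```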

Ranges are sequences of integers

Yes, just as I said before, a python range object is just a sequence of integers. That means you can carry out sequence operations on range objects. If you want a refresher on what sequences are, see this post on iterables and sequences. But there are some sequence operations that python ranges do not support.

First, let me outline some of the sequence operations python ranges support:

  1. You can do slicing on range objects.
  2. You can get the length of range objects.
  3. You can use them in for loops (as I showed above).
  4. You can use them in ‘in’ and ‘not in’ operations
  5. You can call the max and min functions on them.

Now, let us buttress the operations above with examples. Suppose we want to produce a python range object that denotes all the odd numbers from 1 to 10. Here is code that does it and that illustrates all the sequence operations I outlined above.

You can see that the operations I outlined above can be done with range objects.
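A minimal sketch that walks through the five operations on the odd numbers from 1 to 10 (the names odds and middle are illustrative):

```python
odds = range(1, 10, 2)        # odd numbers from 1 to 10

print(list(odds))             # [1, 3, 5, 7, 9]

# 1. Slicing returns another range object.
middle = odds[1:3]
print(middle, list(middle))   # range(3, 7, 2) [3, 5]

# 2. Length.
print(len(odds))              # 5

# 3. For loop.
for number in odds:
    print(number)

# 4. Membership tests.
print(3 in odds)              # True
print(4 not in odds)          # True

# 5. max and min.
print(max(odds), min(odds))   # 9 1
```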

But two sequence operations python range objects cannot carry out are concatenation and repetition. These two operations go against the concept of ranging, because the result would have to be a new range built out of the original one, and in general such a result can no longer be described by a single start, stop, and step, thereby distorting the range itself.
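The sketch below shows what happens if you try these two operations, along with one possible workaround: converting the ranges to lists first. The variable names are illustrative.

```python
r = range(3)

try:
    r + range(3, 6)          # concatenation is not supported
except TypeError as err:
    print("concatenation failed:", err)

try:
    r * 2                    # repetition is not supported
except TypeError as err:
    print("repetition failed:", err)

# Lists support both operations, so convert first if you need them.
combined = list(r) + list(range(3, 6))
repeated = list(r) * 2
print(combined)  # [0, 1, 2, 3, 4, 5]
print(repeated)  # [0, 1, 2, 0, 1, 2]
```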

Advantages of using a range object or the range function

While for small sequences you can hand code a list or tuple, for large sequences this might not be feasible, so using a range object with set start, stop and step parameters would be more appropriate.

Also, a range object has a much smaller memory footprint than a list or tuple, because it stores only the start, stop and step values and computes each item when it is needed. So, it is more efficient to use a range object for sequences of numbers that you only need to produce on demand.
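A rough way to see the difference is with sys.getsizeof, as in the sketch below. The exact numbers depend on the Python version and platform, and for the list getsizeof counts only the list object itself, not the integers it refers to, so treat the output as an indication rather than a precise measurement.

```python
import sys

big_range = range(1_000_000)
big_list = list(big_range)

# The range stays small no matter how many numbers it covers,
# while the list grows with the number of items it holds.
print(sys.getsizeof(big_range))
print(sys.getsizeof(big_list))
```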

That’s all folks. Now you have all you need to start using range objects. Use them to your heart’s delight.

Happy pythoning.
