Classes For Graphs and Directed Graphs In Python: Graph Theory

In computer science and mathematics, graphs are ubiquitous. We use them to solve many problems that involve relationships. Ever since 1735, when the Swiss mathematician Leonhard Euler used what we now know as graph theory to solve the Seven Bridges of Königsberg problem, graphs have become a household name of sorts. That is why I decided to write a post on graphs and to explore them further in subsequent posts.

What are graphs?

In simple terms, graphs are structures used to represent the relationship between objects (called vertices or nodes) where two objects (or nodes) have an edge connecting them if they are related. Diagrammatically, they are depicted with a set of dots or circles for the objects or nodes, and related objects are joined by lines called edges.

The graph below has 6 nodes or vertices, and 7 edges.


 

The edges of a graph may be directed or undirected.

I will be writing code for both directed and undirected graphs. What attracted me to writing code on graphs is that they are used in almost every area of life. Scientists and businesses alike use graphs to model solutions to problems.

So, let’s start with writing classes for graphs and we will implement them.

First, we’ll create a class for a Node and an Edge.

A node is just an object in a graph. One attribute every object has is a Name. So, we’ll give our node a name attribute to start with. Here is the code for the Node class.

    
class Node(object):

    def __init__(self, name):
        ''' assumes name a string '''
        self.name = name

    def get_name(self):
        return self.name

    def __str__(self):
        return self.name

Every instance of a Node, as you can see from the code, has a name and each instance has a method, get_name, which you can use to retrieve the name.

An edge is a relation connector between objects. If two objects are connected to each other by a relationship, they will have an edge between them. Edges can be directed or non-directed. Let’s model the Edge class to start with.

    
class Edge(object):

    def __init__(self, src, dest):
        '''assume src and dest are nodes '''
        self.src = src
        self.dest = dest

    def get_source(self):
        return self.src

    def get_destination(self):
        return self.dest

    def __str__(self):
        return self.src.get_name() + '-->' + \
            self.dest.get_name()

From the code you can see that each instance of an Edge has a source node, self.src, and a destination node, self.dest. When an edge is created, the source and destination nodes have to be passed as arguments to the constructor, __init__. I also added a special method, __str__(), that represents an Edge as a string made of the names of its source and destination nodes. This makes for easy printing.

Now that we have the Node and Edge classes, let us go on to model the directed graphs and undirected graphs.

Directed graphs in Python code

A directed graph is a graph in which edges have orientations. A single edge goes one way only, from its source to its destination. The edges are drawn as arrows.

A simple class for a directed graph might be written in the following way:

    
class Digraph(object):
    # nodes is a list of the nodes in the graph
    # edges is a dict mapping each node to 
    # a list of its children 
    def __init__(self):
        self.nodes = []
        self.edges = {}

    def add_node(self, node):
        if node in self.nodes:
            raise ValueError('Duplicate Node')
        else:
            self.nodes.append(node)
            self.edges[node] = []

    def add_edge(self, edge):
        src = edge.get_source()
        dest = edge.get_destination()
        if not (src in self.nodes and dest in self.nodes):
            raise ValueError('Node not in graph')
        self.edges[src].append(dest)

    def children_of(self, node):
        return self.edges[node]

    def has_node(self, node):
        return node in self.nodes

    def __str__(self):
        result = ''
        for src in self.nodes:
            for dest in self.edges[src]:
                result = result + src.get_name() + \
                    '-->' + dest.get_name() + '\n'
        return result[:-1] # remove last newline

The Digraph class models a directed graph. In the constructor, all the nodes in the graph are kept in a list, while the edges are kept as a mapping from each node to its child nodes. The mapping goes only one way, so a dictionary was used for it.

A graph needs nodes. To add one, we use the add_node method, which takes a node as its sole argument. It first checks whether the node already exists in the list of nodes and, if it does, raises a ValueError. If not, it appends the node to the list and creates an entry for it in the dictionary with an empty list as its value; that list is populated later when edges are added.

To add an edge to the graph, we use the add_edge method. We first retrieve the source and destination nodes of the edge and check that both are already in the list of nodes for the graph. If either is missing, we raise a ValueError. Otherwise, the directed edge is recorded by appending the destination node, dest, to the list of children of the source node, src.

The class also has two helper methods: children_of, which returns the list of nodes that a given node points to, and has_node, which returns a Boolean value telling whether the node in question is in the graph. Finally, __str__ lists every edge on its own line.

That sums up our directed graph, Digraph, class. Now, let’s show that the code works. Let’s implement it by creating an instance of a directed graph, or Digraph. The Digraph instance I will be creating will be based on the directed graph below with 5 nodes and 6 edges. 

The only new code is the driver code that creates a Digraph instance. To keep things brief, I recommend that you read the driver code from line 67 down to line 106. It is really exciting code. I hope you appreciate it.
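As a rough sketch of what such a driver could look like, the following builds a Digraph with five nodes and six edges. The node names A to E, the edge list and the build_digraph helper are assumptions made up for illustration, not necessarily the graph pictured in the post, and the classes are trimmed copies of the ones above so that the snippet runs on its own.

```python
class Node(object):
    def __init__(self, name):
        self.name = name

    def get_name(self):
        return self.name

    def __str__(self):
        return self.name


class Edge(object):
    def __init__(self, src, dest):
        self.src = src
        self.dest = dest

    def get_source(self):
        return self.src

    def get_destination(self):
        return self.dest


class Digraph(object):
    def __init__(self):
        self.nodes = []
        self.edges = {}

    def add_node(self, node):
        if node in self.nodes:
            raise ValueError('Duplicate Node')
        self.nodes.append(node)
        self.edges[node] = []

    def add_edge(self, edge):
        src, dest = edge.get_source(), edge.get_destination()
        if not (src in self.nodes and dest in self.nodes):
            raise ValueError('Node not in graph')
        self.edges[src].append(dest)

    def children_of(self, node):
        return self.edges[node]

    def __str__(self):
        lines = []
        for src in self.nodes:
            for dest in self.edges[src]:
                lines.append(src.get_name() + '-->' + dest.get_name())
        return '\n'.join(lines)


def build_digraph():
    # Hypothetical graph: five nodes named A to E and six directed edges.
    names = ['A', 'B', 'C', 'D', 'E']
    nodes = {name: Node(name) for name in names}
    g = Digraph()
    for name in names:
        g.add_node(nodes[name])
    pairs = [('A', 'B'), ('A', 'C'), ('B', 'D'),
             ('C', 'D'), ('D', 'E'), ('E', 'A')]
    for src, dest in pairs:
        g.add_edge(Edge(nodes[src], nodes[dest]))
    return g, nodes


g, nodes = build_digraph()
print(g)  # one 'X-->Y' line per edge
print([n.get_name() for n in g.children_of(nodes['A'])])  # ['B', 'C']
```

Adding the same Node object twice, or an edge whose endpoints were never added, raises a ValueError, which is a quick way to check the guards in add_node and add_edge.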

Now the question I asked myself is: why create a class for a graph and not just write the code directly? The answer is that I will be reusing the code. We will be coming back to it when solving problems involving graphs in future posts, so you may want to bookmark the code for the Digraph class. You can download the file here, directed_graph.py.

Undirected graphs or simple graphs in Python code.

An undirected graph, or graph for short, is a set of nodes joined by edges that have no orientation. Each edge goes both ways, which distinguishes it from a directed graph, whose edges have orientations.

While writing the code for undirected graphs, I ran into a dilemma about inheritance: which graph should inherit from which? Should a directed graph inherit from an undirected graph, or the other way round? I decided that it was best for a graph to inherit from a directed graph. An instance of a graph can stand in for an instance of a digraph and simply adds one more behavior, making each relationship go the other way as well. An instance of a digraph cannot stand in for an instance of a graph, because its relationships go only one way. Therefore, I made Digraph the superclass and Graph the subclass.

Now, this is the code for the class graph.

    
class Graph(Digraph):

    def add_edge(self, edge):
        Digraph.add_edge(self, edge)
        rev = Edge(edge.get_destination(), edge.get_source())
        Digraph.add_edge(self, rev)                                            

Notice that the class, Graph, inherits from Digraph, so it shares the same attributes and methods as Digraph instances and overrides only the add_edge method. In the add_edge method of Graph, I made the relationship go both ways, i.e. every node in an edge is both a source and a destination node for that edge.
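To see the effect of that override, here is a small sketch that adds the same single edge to a Digraph and to a Graph and then compares the children of the destination node. The two-node example and the build helper are assumptions for illustration; the classes are trimmed copies of the ones above so the snippet is self-contained.

```python
class Node(object):
    def __init__(self, name):
        self.name = name

    def get_name(self):
        return self.name


class Edge(object):
    def __init__(self, src, dest):
        self.src = src
        self.dest = dest

    def get_source(self):
        return self.src

    def get_destination(self):
        return self.dest


class Digraph(object):
    def __init__(self):
        self.nodes = []
        self.edges = {}

    def add_node(self, node):
        if node in self.nodes:
            raise ValueError('Duplicate Node')
        self.nodes.append(node)
        self.edges[node] = []

    def add_edge(self, edge):
        src, dest = edge.get_source(), edge.get_destination()
        if not (src in self.nodes and dest in self.nodes):
            raise ValueError('Node not in graph')
        self.edges[src].append(dest)

    def children_of(self, node):
        return self.edges[node]


class Graph(Digraph):
    def add_edge(self, edge):
        # Store the edge, then store its mirror image as well.
        Digraph.add_edge(self, edge)
        rev = Edge(edge.get_destination(), edge.get_source())
        Digraph.add_edge(self, rev)


def build(graph_type):
    # Hypothetical helper: two nodes, A and B, joined by one edge A -> B.
    a, b = Node('A'), Node('B')
    g = graph_type()
    g.add_node(a)
    g.add_node(b)
    g.add_edge(Edge(a, b))
    return g, a, b


digraph, da, db = build(Digraph)
graph, ua, ub = build(Graph)

print([n.get_name() for n in digraph.children_of(db)])  # []
print([n.get_name() for n in graph.children_of(ub)])    # ['A']
```

In the Digraph, B has no children because the edge only runs from A to B. In the Graph, the reversed edge was added too, so A shows up as a child of B.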

So, for a little implementation that creates an instance of a Graph, I will be modeling the Graph pictured below:

Run the code and notice the differences between this instance of a Graph and instances of a Digraph. The driver code that creates the Graph instance starts at line 73. Alternatively, you can download the graph.py script and run it on your own machine, and bookmark it for later.

I hope to see you in the future when we begin solving problems with graphs like the traveling salesman problem. To receive updates when I post new articles, just subscribe to my blog.

Happy pythoning.

How To Reverse String In Python: 3 Techniques

After writing the post on reversing lists in python, I got such a good response that I decided it would be good to cover a similar concept: how to reverse a string in python.

One thing that makes people find this subject difficult is that python does not have a string method built specifically for reversing strings, as it does for lists. If there was one, our task would have been easier because it would be optimized. So you have to be creative in looking for ways to reverse python strings. In this post, I outline three methods you can use to reverse strings in python, along with their timing characteristics.

First, using the python slice technique

The python slice technique, as I explained in the post about reversing lists through slicing, involves taking a slice of a sequence (and strings are also sequences) through definite start, stop, and step parameters. The slice notation in python is [start:stop:step], where start is the index at which the slice begins, stop is the index at which it ends (the item at stop itself is not included), and step is the stride taken through the indices. To reverse a string using the python slice technique, you just need to use this code on the string: string_name[::-1].

What the code above does is copy the string, string_name, walking through it backwards from the end to the beginning.

Here is an example so you can see it for yourself.
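A minimal sketch of the slice technique follows; the sample string is an arbitrary choice.

```python
text = 'graph theory'

# start and stop are omitted, and a step of -1 walks the string backwards
reversed_text = text[::-1]

print(reversed_text)  # yroeht hparg
print(text)           # graph theory (the original is untouched)
```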

Nice and easy, isn't it? This method does not change the original string. In fact, as you must know, strings are immutable.

This is the method I prefer for reversing python strings.

Second, using the built-in reversed function.

I explained the reversed function in the post on reversing lists, but a brief recap is useful here. The syntax for the reversed function is: reversed(seq). As you can see, the function takes any sequence and returns an iterator over its items in reverse order. A string is also a sequence, as I explained in the iterables post. What you do next is turn the iterator into a string with the str.join(iterable) method of the string class, for example ''.join(reversed(string_name)). I think that conversion is what gives this technique its overhead; otherwise it is very good and fast.

Here is code that shows how to use the function and also cast it to a string.
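A minimal sketch, with an arbitrary sample string, could look like this:

```python
text = 'python'

iterator = reversed(text)          # a lazy iterator, not yet a string
reversed_text = ''.join(iterator)  # join the characters with an empty separator

print(reversed_text)  # nohtyp
```

Note that calling str() on the iterator does not work for this purpose: it returns a description of the iterator object rather than the reversed text, which is why join is used.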

This method is very succinct and readable across all levels of python skill.

Third, using a for loop and storing the characters backwards.

With this method all you need to do is use a for loop to iterate through the characters in the string, then you store the characters backwards in a new string. Very convenient and easy to understand.

Here is the code:
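One way to write it, as a sketch (the function name reverse_with_loop is my own choice for illustration):

```python
def reverse_with_loop(text):
    result = ''
    for char in text:
        # Prepend each character so that later characters end up first.
        result = char + result
    return result


print(reverse_with_loop('python'))  # nohtyp
```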

I hope you found the code to be fun.

Now, the question most persons ask is: Which is faster or which uses less system memory?

Let’s answer that question by timing the execution of each of the string reverse techniques in python and decide on which is faster or more optimized.

Timing all three techniques.

Try running the following code:
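A sketch of such a timing script, using the standard timeit module, is shown here. The function names, the 1,000-character sample string and the repeat count are assumptions; the absolute numbers depend on your machine, so none are quoted.

```python
import timeit


def reverse_slice(text):
    return text[::-1]


def reverse_join(text):
    return ''.join(reversed(text))


def reverse_loop(text):
    result = ''
    for char in text:
        result = char + result
    return result


sample = 'abcdefghij' * 100  # assumed test string of 1,000 characters

for func in (reverse_slice, reverse_join, reverse_loop):
    # Time 1,000 calls of each function on the same input.
    seconds = timeit.timeit(lambda: func(sample), number=1000)
    print(f'{func.__name__}: {seconds:.4f} s')
```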

When you run it, you will notice that the python slice technique takes considerably less time than the others, while the for loop takes the most. That is why I use the slice technique to reverse my strings.

Someone once told me that, during an interview, the interviewer told him the slice technique is not optimized for reversing strings. I laughed. I have not yet seen anything better than the python slice technique for reversing strings. For now it is the fastest of all the methods I know of, and I have outlined the three I find useful here. I use the for loop when I am lazy and just want to be unpythonic.

So, thanks for reading. I believe that now you have techniques you can use to reverse your strings. Leave a comment below if you have any. Also, do not fail to subscribe to my blog so you can be receiving useful updates the moment I post new articles.

Happy pythoning.

Breakthrough 3D Printing Of Heart For Treating Aortic Stenosis

Aortic valve stenosis is a condition in which a narrowed aortic valve fails to open properly, obstructing the flow of blood from the heart to the aorta. It is one of the most common cardiovascular conditions in the elderly and affects about 2.7 million adults over the age of 75 in North America. If the doctors decide that the condition is severe, they may carry out a minimally invasive heart procedure to replace the valve. This procedure is called transcatheter aortic valve replacement (TAVR). It is less invasive than open heart surgery to repair the damaged valve, but it is not without risks, which might include bleeding, stroke, heart attack or even death. That is why it is important that the doctors take all care to reduce the risks.

In a new paper published in Science Advances, a peer-reviewed scientific journal published by the American Association for the Advancement of Science (AAAS), researchers from the University of Minnesota and their collaborators describe a new technique for 3D printing lifelike models of the aortic valve and its surrounding structures, models that mimic the look and feel of the real valve. These 3D-printed models could help doctors reduce the risks of carrying out a TAVR procedure on a patient.

Specifically, they 3D printed a model of the aortic root. The aortic root is the section of the aorta that is closest to, and attached to, the heart. Its components include the aortic valve, which is prone to aortic stenosis in the elderly, and the openings of the coronary arteries. The left ventricle muscle and the ascending aorta, which are close to the aortic root, are also included in the model.

The models include 3D-printed soft sensor arrays built into the structure, and each model is printed for a specific patient using a customized 3D printing process. The authors believe that this organ model will be used by doctors all over the world to improve the outcomes for patients who undergo invasive procedures to treat aortic stenosis.

Before the models are produced, CT scans of the patient’s aortic root are taken so that the printing will match the exact shape of the patient's organ. Then specialized silicone-based inks are used to do the actual printing in order to match the feel of the patient's heart. These inks were specially developed for this process because commercial printers on the market can print 3D shapes but cannot reproduce the real feel of the heart’s soft tissue. The heart tissue samples used in the initial tests were obtained from the University of Minnesota's Visible Heart Laboratory. The researchers found that the specialized 3D printers produced the models they wanted, models that mimic the shape and the feel of the aortic valve at the heart.

To watch a video of how the 3D printers work, I encourage you to play the video below. You would find it interesting.


The researchers are happy with what they have achieved.

“Our goal with these 3D-printed models is to reduce medical risks and complications by providing patient-specific tools to help doctors understand the exact anatomical structure and mechanical properties of the specific patient’s heart,” said Michael McAlpine, a University of Minnesota mechanical engineering professor and senior researcher on the study. “Physicians can test and try the valve implants before the actual procedure. The models can also help patients better understand their own anatomy and the procedure itself.”

These models will surely help physicians, who can use them to practice how they will carry out the catheterization procedure on the real heart. Physicians will soon be able to rehearse the sizing and placement of the catheter device before carrying out the real procedure, thereby reducing the risks involved. One good thing about the sensors integrated into the 3D models is that they give physicians electronic pressure feedback, which guides them in selecting the optimal position for the catheter when it is placed into the patient's aorta.

But the researchers do not think these are the only use cases for their findings or the models. They aim to go beyond that.

“As our 3D-printing techniques continue to improve and we discover new ways to integrate electronics to mimic organ function, the models themselves may be used as artificial replacement organs,” said McAlpine, who holds the Kuhrmeyer Family Chair Professorship in the University of Minnesota Department of Mechanical Engineering. “Someday maybe these ‘bionic’ organs can be as good as or better than their biological counterparts.”

I think these are laudable futuristic goals. If they achieve their ambition, McAlpine and his colleagues will have solved a problem that gives sleepless nights to many physicians who have to operate on elderly patients with weak aortic valves.

Because this is an innovative solution to a challenging problem, I decided to include it in my blog. I hope you enjoyed reading about the achievements of McAlpine and his colleagues. I hope they go further than just giving physicians 3D models and are one day able to make those models replace weak natural organs.

In addition to McAlpine, the team included University of Minnesota researchers Ghazaleh Haghiashtiani, co-first author and a recent mechanical engineering Ph.D. graduate who now works at Seagate; Kaiyan Qiu, another co-first author and a former mechanical engineering postdoctoral researcher who is now an assistant professor at Washington State University; Jorge D. Zhingre Sanchez, a former biomedical engineering Ph.D. student who worked in the University of Minnesota’s Visible Heart Laboratories who is now a senior R&D engineer at Medtronic; Zachary J. Fuenning, a mechanical engineering graduate student; Paul A. Iaizzo, a professor of surgery in the Medical School and founding director of the U of M Visible Heart Laboratories; Priya Nair, senior scientist at Medtronic; and Sarah E. Ahlberg, director of research & technology at Medtronic.

This research was funded by Medtronic, the National Institute of Biomedical Imaging and Bioengineering of the National Institutes of Health, and the Minnesota Discovery, Research, and InnoVation Economy (MnDRIVE) Initiative through the State of Minnesota. Additional support was provided by University of Minnesota Interdisciplinary Doctoral Fellowship and Doctoral Dissertation Fellowship awarded to Ghazaleh Haghiashtiani.

You can read the full research paper, entitled "3D printed patient-specific aortic root models with internal sensors for minimally invasive applications," at the Science Advances website.

How To Reverse List In Python: 4 Techniques

Very often I get people asking me to write a post on how to reverse a list in python. This is because this question often comes up in interviews. So, to oblige them, I have decided to write on four good and tested ways you can reverse a list in python. I also show their timing so that you can choose the best for your needs.

 

The built-in python reversed function

The syntax of the built-in python reversed function is reversed(seq). You can see that it takes any sequence as its argument. Sequences are lists, strings, tuples, etc. For a refresher on sequences, see my post on iterables. The function returns an iterator. Remember that an iterator is an object that yields its elements one at a time; to get the elements out, you either cast it to a list or call next on it repeatedly. Most times, you cast it to a list. That cast is an overhead cost for this method, because it builds a full copy of the list. Without the cast, the method is easy and uses substantially less memory. This makes it ideal when you are dealing with very large lists and just want to use the elements in reverse order as needed rather than all at once. In that case, you would loop over the iterator with a python for loop.

Let’s take an example:
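A minimal sketch with an arbitrary list of numbers shows both ways of consuming the iterator:

```python
numbers = [1, 2, 3, 4, 5]

# Cast the iterator to a list when all the reversed items are needed at once.
reversed_numbers = list(reversed(numbers))
print(reversed_numbers)  # [5, 4, 3, 2, 1]

# Or loop over the iterator directly, without building a second list.
for number in reversed(numbers):
    print(number)
```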

You can see from the above that I had to cast the iterator from the python reversed function to a list to get out the results. That could be an overhead as we’ll see later.

The python slice technique

Slicing is one of the most common techniques you will meet with python lists and other sequences. The syntax of slicing is [start:stop:step], where start is the index at which the slice begins, stop is the index at which it ends (the item at stop itself is not included), and step is the stride taken through the list to create the slice. To reverse a list using the python slice technique, you just need this expression: list_name[::-1], which is shorthand for walking through the whole list backwards and copying each item into a new list.

Here is an example:
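A short sketch with arbitrary sample data, including a tuple to show that the same slice works on other sequences:

```python
numbers = [1, 2, 3, 4, 5]
reversed_numbers = numbers[::-1]  # a new list; numbers itself is not changed

print(reversed_numbers)  # [5, 4, 3, 2, 1]
print(numbers)           # [1, 2, 3, 4, 5]

letters = ('a', 'b', 'c')
print(letters[::-1])     # ('c', 'b', 'a')
```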

The advantage of this technique is that it works for any sequence and not just for lists. Some people claim that it is not readable, but I find that argument unconvincing; slicing is common in python even for the beginner. The only disadvantage I see with the python slice technique is that it uses up memory if you have a large list, because the reversed list is a full copy of the original. But when you want the original list to remain unchanged, this technique is good and recommended.

The python list reverse method

The python list reverse method is a method of the list class. The syntax is list.reverse(). In my opinion, it is the easiest since it is built for lists, and it seems to be the fastest, but we will consider that in the timing section below. Unlike the built-in python reversed function, it needs no cast to a list afterwards, and unlike the slicing technique, it does not require a large chunk of extra memory even when working with large lists. It is more optimized for reversing python lists.

The advantage is that it reverses the list in place. But if you still need the original order after reversing, then this technique is not for you.

Here is an example:
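A minimal sketch with an arbitrary list:

```python
numbers = [1, 2, 3, 4, 5]

result = numbers.reverse()  # reverses in place and returns None

print(numbers)  # [5, 4, 3, 2, 1]
print(result)   # None
```

Because the method returns None, writing numbers = numbers.reverse() would throw the list away, which is a common slip.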

I usually use this technique whenever I am reversing lists, but if I need the original, I use the slice technique.

Reverse list programmatically by swapping items

Now the last method you can use is to reverse the list programmatically by swapping items in place. You can write your own code that iterates through the elements of the list and swaps the elements in place. Here is a sample code:
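One possible version, as a sketch, uses two indices that move towards each other; the function name reverse_in_place is my own choice for illustration.

```python
def reverse_in_place(items):
    left, right = 0, len(items) - 1
    while left < right:
        # Swap the outermost pair, then move both indices inwards.
        items[left], items[right] = items[right], items[left]
        left += 1
        right -= 1


numbers = [1, 2, 3, 4, 5]
reverse_in_place(numbers)
print(numbers)  # [5, 4, 3, 2, 1]
```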

This code can run fast on small lists but is not optimized for large ones. It swaps the elements in place and modifies the original list. It is slower than all the python built-in methods.

Timing the methods.

Most times when we are dealing with large lists, we want something that works very fast and doesn’t use much memory. The timeit module cannot measure memory usage, but with it we can find out how long each of the techniques above takes to run.

Here is a sample code for all three built-in methods except the programmed reverse list swapping method. The swapping technique takes a longer time for large lists that is why it is excluded.
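A sketch of such a timing script with the timeit module is shown here. The list size of 10,000 and the repeat count of 1,000 are assumptions, and the timings vary from machine to machine, so no numbers are quoted.

```python
import timeit

setup = 'data = list(range(10000))'  # assumed list size

statements = {
    'reversed() cast to a list': 'list(reversed(data))',
    'slice [::-1]': 'data[::-1]',
    'list.reverse()': 'data.reverse()',
}

results = {}
for label, stmt in statements.items():
    # Each statement runs 1,000 times against a fresh list built by setup.
    results[label] = timeit.timeit(stmt, setup=setup, number=1000)
    print(f'{label}: {results[label]:.4f} s')
```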

When you run the code above, you will see that the list reverse method takes the shortest time of all three methods. Overall, its running time is about 12 times shorter than that of the reversed function. The reversed function took longer because of the list casting overhead. If we had used a for loop instead, it would have taken less time. The slicing technique comes in second place. That is why I often use the python list reverse method when reversing lists.

The list reverse method works better because the developers of python have optimized it for lists. I believe they pay attention to these things, which is why it was made a method of the list class.

So, now you have the options you can choose from while reversing lists. Use any to your heart’s desire.

Happy pythoning.

The 0/1 Knapsack Problem In Python Simplified

When I wrote a post about an idea I had at a bank about making change from $5 and $8, a reader wrote me that this was the knapsack problem. I decided to research it. I discovered that although my post was not a knapsack problem, it was somewhat similar because it involved resource optimization. So, intrigued by the knapsack problem, I decided to write code on the 0/1 knapsack problem.

What is the 0/1 knapsack problem?

The knapsack problem is usually stated in terms of a burglar, and of how difficult it is to be a burglar, especially a greedy one. Imagine a burglar breaks into a home carrying a knapsack with a fixed capacity. What he can steal is limited by the capacity of the knapsack. The problem is that there are more valuable items in the home than his knapsack can hold. So, he has a decision problem: what should he take that would fit into his knapsack? In this problem, he cannot take pieces of items; he either takes an item completely or leaves it (that is why it is called the 0/1 knapsack problem). How does the burglar decide on what to choose?

Let’s take an example of the items the burglar might find in the home. For example, the items he might need to choose from might be a clock, painting, radio, vase, book and computer with values and weight like in the table below:

Name        Value    Weight (kg)    Value/Weight ratio
Clock       175      10             17.5
Painting    90       9              10
Radio       20       4              5
Vase        50       2              25
Book        10       1              10
Computer    200      20             10

Which items would he decide to take? We are going to analyze what options the burglar can make.

Why being greedy might not be optimal here.

The burglar has three dimensions of greed in making a choice. He would choose the best item first, then the next best, and so on until he reaches the limit of his knapsack. But what would best mean for him? Could it mean the most valuable, the least weight, or the highest value/weight ratio? Let’s fix the capacity of his knapsack and say he can only take items up to 20 kg. So, given a knapsack of 20 kg, what combination of items can he take?

A stupid burglar or one in a hurry would just pick the computer and say that would give him the best value. But that is not the case. An intelligent burglar would go through his value system and decide what is best for him. But he doesn’t have all the time in the world. So, on the spur of the moment, he chooses what is best based on his greed and picks accordingly. He has three dimensions of greed to choose from: based on value, based on least weight, and based on the density or value to weight ratio of the items.

I have written a little code based on the greedy algorithm. It is named my_greed.py. To make the decision, the burglar sorts the items, ranking them in order, and picks each item that fits until his knapsack is full. You can download the code, run it on your machine and test it to see the solutions he would get for his greed.

Applied to the table above with a 20 kg knapsack, the three strategies give the following results. Based on value, he picks the computer first; it fills the knapsack on its own, so he leaves with items worth 200. Based on weight, he picks the book, vase, radio and painting, which are worth 170 in all. Based on density, or value to weight ratio, he picks the vase, clock, book and radio, which are worth 255 in all. So the best the burglar can do on the spur of the moment is to pick by density, for a total value of 255. Unfortunately, none of the three strategies finds the clock, painting and book, which together weigh exactly 20 kg and are worth 275.
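The three greedy strategies can be sketched in a few lines. This is not the my_greed.py file mentioned above, just a compact stand-in that uses plain tuples copied from the table, with the computer valued at 200 as the table states; the greedy function and the variable names are my own assumptions.

```python
# (name, value, weight) rows copied from the table above
ITEMS = [
    ('Clock', 175, 10),
    ('Painting', 90, 9),
    ('Radio', 20, 4),
    ('Vase', 50, 2),
    ('Book', 10, 1),
    ('Computer', 200, 20),
]


def greedy(items, max_weight, key):
    """Take items in descending order of key(item) while they still fit."""
    taken, total_value, total_weight = [], 0, 0
    for name, value, weight in sorted(items, key=key, reverse=True):
        if total_weight + weight <= max_weight:
            taken.append(name)
            total_value += value
            total_weight += weight
    return taken, total_value


by_value = greedy(ITEMS, 20, key=lambda item: item[1])
by_weight = greedy(ITEMS, 20, key=lambda item: -item[2])  # lightest first
by_density = greedy(ITEMS, 20, key=lambda item: item[1] / item[2])

print(by_value)    # (['Computer'], 200)
print(by_weight)   # (['Book', 'Vase', 'Radio', 'Painting'], 170)
print(by_density)  # (['Vase', 'Clock', 'Book', 'Radio'], 255)
```

With these numbers, none of the three orderings reaches 275, the value of the clock, painting and book together, which is why a greedy choice is not guaranteed to be optimal.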

But choosing based on greed does not always give the optimal solution, because the burglar does not consider all the possible combinations. That is why I am not explaining the greedy code in this post; you can study it to understand it yourself. I want to dwell on the optimal solution.

The optimized algorithm for the knapsack problem.

I thought carefully about this problem and decided that the best and optimal solution for the burglar is to do the following (but he wouldn’t have all the time in the world):

1. Enumerate all the possible combinations of items.

2. Remove all the combinations whose weight exceeds the capacity of the knapsack.

3. From the remaining combinations, choose any whose value is the highest.

So that is the focus of this post: using an optimal algorithm to get a solution to the 0/1 knapsack problem. We will go through all the steps above one after the other using code. I hope you ride along.

First, we need to create the class for the items.


# create the items class
class Items(object):

    def __init__(self, name, value, weight):
        self.name = name
        self.value = value
        self.weight = weight

    def get_name(self):
        return self.name

    def get_value(self):
        return self.value

    def get_weight(self):
        return self.weight

    def __str__(self):
        # Adjacent f-strings in parentheses are joined into one string.
        result = (f'{self.name}: Value = {self.value}. '
                  f'Weight = {self.weight}.')
        return result

In the constructor, each item instance has a name, value and weight. Then there are the getters for the attributes. Then the last method is the __str__() method that specifies how to represent each object on the output.

Then the next thing to do is to build the Items into a list. With the list of items, we can create a powerset later.


# build the items in a list
def build_items():
    names_list = ['Clock', 'Painting', 'Radio', 'Vase', 
                                   'Book', 'Computer']
    values_list = [175, 90, 20, 50, 10, 200]
    weight_list = [10, 9, 4, 2, 1, 20]
    collection = []
    for i in range(len(names_list)):
        collection.append(Items(names_list[i], values_list[i], 
                                              weight_list[i]))
    return collection

I created a separate list for each attribute and then in the for loop I created each item instance with their data. Then returned the list, collection.

Now that we have all data items in a list, we now need to enumerate all possible combinations of the items.


# create the powerset
from itertools import combinations, chain

def powerset(iterable):
    s = list(iterable)
    # powerset removes the empty set
    return list(chain.from_iterable(combinations(s, r) 
                         for r in range(1, len(s)+1)))

Notice that I imported the combinations and chain functions from itertools to create the powerset. It’s very easy. If you want an explanation of what was involved in creating the powerset, just read this blog post on combinations.
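To get a feel for what the function returns, here is a small sketch that repeats the powerset function from above and runs it on a three-item list. The letters are arbitrary sample data.

```python
from itertools import chain, combinations


def powerset(iterable):
    s = list(iterable)
    # r starts at 1, so the empty combination is left out
    return list(chain.from_iterable(combinations(s, r)
                                    for r in range(1, len(s) + 1)))


subsets = powerset(['a', 'b', 'c'])
print(subsets)
# [('a',), ('b',), ('c',), ('a', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'b', 'c')]
print(len(subsets))  # 7, that is 2**3 - 1
```

For the six items in the table this gives 2**6 - 1 = 63 combinations, which is small enough to check one by one. The count doubles with every extra item, so this brute-force approach only suits short lists.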

Then the last two things to do are to remove all combinations whose total weight exceeds the capacity of the knapsack and, from the remaining combinations, which all fit into the knapsack, to find the one that gives the maximum value. This code does both.


# the main code
def choose_best(pset, max_weight):
    best_val = 0.0
    best_set = None
    for items in pset:
        items_val = 0.0
        items_weight = 0.0
        for item in items:
            items_val += item.get_value()
            items_weight += item.get_weight()
        if items_weight <= max_weight and items_val > best_val:
            best_val = items_val
            best_set = items
    return (best_set, best_val)

Now let me explain the code a little. The arguments to the choose_best function are pset, the powerset, and max_weight, the capacity of the sack. I initialized best_val to track the highest value found so far and best_set to track the combination that produced it. In the outer for loop, I reset items_val and items_weight, which hold the total value and total weight of the current combination. The inner for loop iterates through the items in that combination and adds up their values and weights. Next, the code checks two things: whether the total weight, items_weight, is less than or equal to max_weight, and whether the total value, items_val, is greater than the best value seen in earlier combinations. If both are true, it stores items_val as best_val and the current combination as best_set. Finally, it returns the chosen combination, best_set, and the best value, best_val, as a tuple. If no combination fits in the sack, best_set stays None.

Now the code to test the main code.


# code to test
def test_best(max_weight=20):
    items = build_items()
    pset = powerset(items)
    taken, val = choose_best(pset, max_weight)
    print('Total value of items taken is', val)
    for item in taken:
        print(item)

Since the capacity of the knapsack is 20 kg, I set max_weight to a default of 20. You can specify something else when you run the code. In the body of the function, I collect the items into a list and then create the powerset of all the items. After that, I call the choose_best function to find the optimal combination. In the final part of the code, I use a for loop to print each of the items in the chosen combination.

Here now is the complete program so you can run it here.
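For readers who want everything in one place, here is a condensed sketch that puts the pieces above together. The Items class shown here is a minimal stand-in: its constructor order (name, value, weight) is an assumption based on how build_items calls it, and I used zip and sum to keep the listing short, so treat it as an illustration and not the exact downloadable file.

```python
from itertools import chain, combinations


class Items(object):
    # Minimal stand-in for the Items class; the (name, value, weight)
    # constructor order is an assumption based on build_items().
    def __init__(self, name, value, weight):
        self.name = name
        self.value = value
        self.weight = weight

    def get_value(self):
        return self.value

    def get_weight(self):
        return self.weight

    def __str__(self):
        return (f'{self.name}: Value = {self.value}. '
                f'Weight = {self.weight}.')


def build_items():
    names = ['Clock', 'Painting', 'Radio', 'Vase', 'Book', 'Computer']
    values = [175, 90, 20, 50, 10, 20]
    weights = [10, 9, 4, 2, 1, 20]
    return [Items(n, v, w) for n, v, w in zip(names, values, weights)]


def powerset(iterable):
    # every non-empty combination of the items
    s = list(iterable)
    return list(chain.from_iterable(combinations(s, r)
                                    for r in range(1, len(s) + 1)))


def choose_best(pset, max_weight):
    best_val, best_set = 0.0, None
    for items in pset:
        items_val = sum(item.get_value() for item in items)
        items_weight = sum(item.get_weight() for item in items)
        # keep the combination only if it fits and beats the best so far
        if items_weight <= max_weight and items_val > best_val:
            best_val, best_set = items_val, items
    return (best_set, best_val)


taken, val = choose_best(powerset(build_items()), 20)
print('Total value of items taken is', val)
for item in taken:
    print(item)
```

With a capacity of 20, the chosen combination is the clock, the painting and the book, which together weigh exactly 20 and are worth 275.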

If you desire to study it in-depth and also run it on your own terminal, here is the download file, my_items.py.

That’s it. It’s fun coding the 0/1 knapsack problem. I hope you enjoyed it. I did.

You can leave comments below. Also, subscribe to my blog so that you can receive latest updates.

Happy pythoning.

Python Combinations Function – The Power To Choose

Let’s imagine this scenario. You are a fund manager who is in charge of several stocks. Your company has given you about 20 stocks to evaluate and asks you to find out what 5 stocks from the 20 you can include in your portfolio this year. You have a choice of selecting the 5 stocks which have equal probability of success. How many different selections can you make?

Ever seen a problem like this in college mathematics? Yes, it is an example of a combination problem. We see it all the time in life. In choosing what clothes to wear for the week, what combination of food to choose from a menu, or what combination of channels to watch for the week. We cannot do without combinations.

python combinations function

 

In simple terms, a combination is a selection of items from a collection where the order of selection does not matter. Combinations are different from permutations because in permutations the order of selection matters.

Let me not bore you with the mathematical details. Let’s go straight to how python allows you to use the power of combinations.

How Python combinations work

To carry out combinations in python you need to import one function, the combinations function from the python itertools module. You can use the code: from itertools import combinations. Very simple. With that you are good to go.

The syntax for the python combinations function is: itertools.combinations(iterable, r) where iterable is the collection you want to select from and r is the number of items in each selection. Note that r should not be greater than the length of the iterable; otherwise the python combinations function returns an iterator that yields nothing. When you call the combinations function, it returns a combinations object, which is an iterator. You can cast the iterator to a list or set to extract the selections, each of which is a tuple.

Now that the syntax is done, let’s solve the fund manager’s problem we started with.

The problem the fund manager is faced with is that out of 20 stocks he has to select 5 without order since they are all equally probable of success. How many selections or arrangements can he make?
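Here is a minimal sketch of that calculation, laid out so that the line numbers match the walkthrough. The variable names (stocks, selections, selection_list) are my own choices for illustration, and the stocks are simply labelled 1 to 20.

```python
from itertools import combinations         # import the combinations function

stocks = range(1, 21)                      # a sequence of 20 stocks, labelled 1 to 20
selections = combinations(stocks, 5)       # iterator over every 5-stock selection
selection_list = list(selections)          # cast the iterator to a list
print(len(selection_list))                 # 15504
```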

I have included comments in the code above so you can follow along on the logic behind how it was applied. On line 1, we imported combinations from the itertools module. That means we are good to go. On line 3, using the range function, we created a collection or sequence of 20 items. So easy. On line 4 we called the combinations function and passed the collection or sequence as its first argument and then 5, the number of stocks in each selection, as its second positional argument. The python combinations function returned a combinations object, which is an iterator, and in the next line we cast the iterator to a list so we can extract the items in it. But not to worry, we are not investing in any of the stocks yet; we just want to know how many selections the fund manager can make. So, on the last line we called the len function on the list and it gave us the answer: 15504 possible selections of the stocks. I bet the fund manager needs more than a voodoo priest to decide on what selection of stocks to choose.

I believe that right now you understand how the python combinations function works. But I am not going to leave you without one more example. I so love this one because I use it on a weekly basis.

For example, Michael loves eating 5 types of food but he can only choose three of them every day. If the order in which he chooses each meal is not important, what are his choices? Also, how many choices can he make? This is just easy, right? Let’s do it.
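A minimal sketch of the food problem follows. Apart from rice, the food names are hypothetical; any five items would do.

```python
from itertools import combinations

# five foods; the names other than rice are made up for illustration
foods = ['rice', 'beans', 'yam', 'plantain', 'bread']

# every way of choosing 3 foods when order does not matter
choices = list(combinations(foods, 3))
for choice in choices:
    print(choice)
print(len(choices))  # 10
```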

It’s so easy, isn’t it? I believe you can read and follow along with the code above. It’s one of the easiest pieces of code I’ve written this week. If you look at the food choices, you will notice that rice stands out prominently. Well, because order doesn’t matter, it makes no difference whether rice is at the beginning or the end of a choice for each day. You get this particular printout because combinations emits the selections in the order it finds the items in the sequence or collection. If you want a different order, you can sort the food list first. Try it out on your machine and see.

Now, let me give you a bonus tip. The combinations function in python makes it easy to do a calculation that used to take me a lot of code to carry out: calculating the powerset of a set. Before I discovered the combinations function, I used to calculate the powerset with a lengthy hand-written algorithm. Keep in mind that a set of n items has 2^n subsets, so any method takes exponential time on large sets, but with the combinations function the code is short and all that stress was put to rest.

How to calculate powerset of a set using python combinations function.

The powerset of a set, S, can be defined as the set consisting of all subsets of S, including the empty set and S itself. So, that’s the mathematical definition and that is the result we expect to have in our code.

To get the powerset of any iterable from the combinations function, we will use the following code:


from itertools import combinations, chain

def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) 
                      for r in range(len(s)+1))

Notice that this time we are not only importing combinations but also the chain function. The meat of the code lies in the last line of the powerset function. Using a generator expression, we create combinations with the selection size, r, going from 0, 1, 2… up to the length of the iterable. This makes sure we cover every subset in the powerset. The generator expression yields one combinations object, which is an iterator, for each value of r. The chain.from_iterable function joins these iterators into a single chain object, which is also an iterator, so you cast the result of the function to a list or any other collection to extract the elements. The casting to a list was done in the example below. Note that the elements will be tuples since they are combinations of sometimes more than one object. It’s so elegant. No more lengthy and time consuming code.

Let’s try it with a working example.
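Here is a small sketch that reuses the powerset function from above on a list called num. The three values in num are just sample data.

```python
from itertools import combinations, chain

def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r)
                               for r in range(len(s) + 1))

num = [1, 2, 3]                 # sample data
result = list(powerset(num))    # cast the chain iterator to a list
print(result)
# [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
```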

The code above will print out the powerset of the list, num. Cool, right?

Experiment with these functions to your heart’s delight. They demonstrate the power of python.

Happy pythoning.

First Walking Microscopic Robots (Nanobots) To Change The World

Although it has been said several times that the future of nanoscale technology with nanobots is immense, each day researchers continue to expand it. Recently, in a first of its kind, a Cornell University-led collaboration has manufactured the first microscopic robot that can walk. The details seem like a plot from a science fiction story.

microscopic robots or nanorobots

 

The collaboration is led by Itai Cohen, professor of physics, Paul McEuen, the John A. Newman Professor of Physical Science – both in the College of Arts and Sciences – and their former postdoctoral researcher Marc Miskin, who is now an assistant professor at the University of Pennsylvania. The engineers are not new to producing nanoscale creations. To their name they already have a microscopic nanoscale sensor along with graphene-based origami machines.

The microscopic robots are made with semiconductor components that allow them to be controlled and made to walk with electronic signals. The robots have a brain, a torso, and legs. They are 5 microns thick, 40 microns wide, and 40-70 microns in length. A micron is 1 millionth of a metre. The torso and the brain were the easy part. They are made of simple circuits manufactured from silicon photovoltaics. But the legs were completely innovative and they consist of four electrochemical actuators.

According to McEuen, the technology for the brains and the torso already existed, so they had no problem with it except for the legs. “But the legs did not exist before,” McEuen said. “There were no small, electrically activatable actuators that you could use. So we had to invent those and then combine them with the electronics.”

The legs were made of strips of platinum. They were deposited by atomic layer deposition and lithography, with the strips being just some dozen atoms thick. These strips of platinum are then capped by layers of titanium. So, how did they make these legs walk? By applying a positive charge to the platinum. When this is done, negative ions from the surrounding solution are adsorbed onto the surface of the platinum and neutralize the charge. Neutralization makes the platinum expand and the strips bend. Because the strips are ultrathin, they can bend on neutralization without breaking. To enable three-dimensional motion control, rigid polymer panels were patterned on top of the strips. The panels were made to have gaps, and these gaps made the legs function like knees or ankles, enabling the legs to move in a controlled manner and generate motion.

A paper describing this technology titled: “Electronically integrated, mass-manufactured, microscopic robots,” has been published in the August 26 edition of Nature.

The future applications of this technology are immense. Since the size of the electronically controlled microscopic robots is that of a paramecium, one day when they are more sophisticated, they could be inserted into the human body to carry out some functions like cleaning up clogged veins and arteries, or even analyzing the human brain. Also, this first production will become a template for the production of even more complex versions in the future. This initial microscopic robot is just a simple machine, but imagine how sophisticated and computationally complex it will be when it is fitted with complicated electronics and onboard computers. Furthermore, producing the robots does not take much in terms of time and resources because they are silicon-based and the technology already exists. So we could see the possibility of mass-produced robots like this being used in technology and medicine to the benefit of the human race. In fact, the benefits are immense when one calculates the economics involved.

“Controlling a tiny robot is maybe as close as you can come to shrinking yourself down. I think machines like these are going to take us into all kinds of amazing worlds that are too small to see,” said Miskin, the study’s lead author.

The frontiers of nanobot technology are expanding by the day. With these mass-produced robots in the market, I see a solution in the offing for various medical and technological challenges. This is an innovative nanobot.

Material for this post was taken from the Cornell University Website.

Python Map Function And Its Components

Very often, we want to apply a function to an iterable without using a for loop. We saw an example in the python reduce function, but the python reduce function doesn’t really fit what we want to do because the reduce function successively accumulates the results. We want the function to be applied to each element of the iterable and the results to be stored separately. To do that, we would use a python map. It is just the right function for the job. In this blog post, I will show you how to use a python map and also its relative, the python starmap, along with usage.

python map function

 

What is a python map.

A python map is just a function that takes iterables and applies a function to the elements of the iterables. If there is more than one iterable, it applies the function to the corresponding elements of the iterables in succession. If the iterables are of different lengths, it stops at the end of the shortest iterable. The result of a python map function is an iterator object. That means to get out the results you have to apply another function to the object. Most times, you would cast it to a list or set.

The syntax of the python map function is map(function, iterable, ...) where function is the function you want to apply to the items of the iterable and iterable is the object which contains the items.

Visit this link if you want to refresh yourself on iterators, and this other one on python iterables. They are important concepts in python. I explained them in depth.

Now, let’s take some examples.

We’ll show examples using a single python iterable and then when more than one python iterable is used. First using a single python iterable.

Supposing we have a list of numbers and we have a function that raises a given number by a power of 3. We could use python map to apply the function to each of the items in the list of numbers.
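A minimal sketch of this, using the names power_func, num_list and items_raised from the discussion, could look like the following. The numbers in num_list are sample values, and the listing is laid out so that the line numbers match the walkthrough.

```python
def power_func(x):
    return x ** 3                          # raise the number to the power of 3

num_list = [1, 2, 3, 4, 5]                 # sample data
items_raised = map(power_func, num_list)   # apply power_func to every item
print(items_raised)                        # a map object, not the values

print(list(items_raised))                  # [1, 8, 27, 64, 125]
```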

You will notice from the code above that I used the python map function to apply power_func to each of the items of the num_list in line 5. The first time we printed out the object, items_raised, what we get is a map object. The map object is an iterator. So, to get out the elements in the iterator we cast to a list in line 8 and it then extracted each of the items raised.

Now let’s show an example with more than one iterable. This time, we will use two iterables and add their items together.
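A minimal sketch with two iterables, using the names adder, num_list1 and num_list2 from the discussion, might look like this. The values in the two lists are sample data.

```python
def adder(x, y):
    return x + y

num_list1 = [1, 2, 3, 4, 5]                   # sample data
num_list2 = [10, 20, 30, 40, 50]              # sample data
added = map(adder, num_list1, num_list2)      # pairs items at the same index
print(list(added))                            # [11, 22, 33, 44, 55]
```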

What we did above is to provide two iterables, num_list1 and num_list2, to the python map function, and then use the function, adder. What map does is take the items at corresponding indices and pass them to adder which adds them together and then provides the result to the map iterator. Then using list function, we extract each of the elements in the map iterator which is then printed out in line 7.

There is also another case I want you to consider if using more than one iterable and the iterables are not of equal length. What a python map does is that it applies the function to the items of the iterable successively until it comes to the end of the shorter length iterable and then it stops. Let’s take an example, this time letting num_list2 be longer than num_list1.
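Here is a sketch of that case. The lists are sample data: num_list1 has five items (indices 0 to 4) and num_list2 has eight.

```python
def adder(x, y):
    return x + y

num_list1 = [1, 2, 3, 4, 5]                        # 5 items
num_list2 = [10, 20, 30, 40, 50, 60, 70, 80]       # 8 items
added = list(map(adder, num_list1, num_list2))     # stops with the shorter list
print(added)                                       # [11, 22, 33, 44, 55]
```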

You can see this time that it stops short at the items at index 4 in both lists and ignores the rest of the items in num_list2 because num_list2 is longer.

One thing you need to know is that the python map function falls short when the items of the iterable are tuples that should be unpacked into separate arguments. This is because map passes each item to the function as a single argument. But not to worry, we have a relative of map, the python starmap function, that helps us to deal with tuples as elements.

What is the python starmap function?

Just like the python map function, the python starmap function returns an iterator based on the operation of a provided function to the elements of an iterable but this time it works when the elements are tuples. Since you know the basic concepts behind python maps, it also applies to python starmaps. The python starmap function is included with the itertools module. So to use it, you first need to import it from the itertools module.

The syntax for starmap function is itertools.starmap(function, iterable) where function is the function carrying out the operation on each of the elements of the iterable. The elements of the iterable are arranged in tuples.

As an illustration, let’s take an iterable of tuples for example.
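A minimal sketch, laid out so that the line numbers match the walkthrough, could look like this. The function name power comes from the discussion; the list name pairs and its values are my own sample data.

```python
from itertools import starmap

def power(x, y):
    return x ** y

pairs = [(2, 3), (3, 2), (4, 2), (5, 3)]   # sample (base, exponent) tuples
raised = starmap(power, pairs)             # each tuple is unpacked into x and y
results = list(raised)                     # extract the items from the iterator
print(results)                             # [8, 9, 16, 125]
```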

As you can see from above, in line 1 we first imported the python starmap function from itertools. Then we defined a function, power, that takes in two arguments and raises the first argument to the power of the second argument at lines 3 and 4. Then at line 7 we used the starmap function to apply the power function to each of the elements of the iterable, this time a list, which are tuples. Python starmap unpacks the tuple when sending them to the power function such that the first item in the tuple becomes bound to x and the second item of the tuple becomes bound to y, and then the function is applied to them such that x is raised to power y and the result is added to the starmap object. Then in line 8 we call list function on the starmap object which is an iterator to extract each of the items in the iterator. Then finally, we print them out.

We can use an iterable with a tuple that has any number of items as long as the function to which they would be applied can accept that number of arguments when the tuple is unpacked by the python starmap function. Let’s take another example of a starmap being used with an iterable that has a tuple of three items.
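Here is a sketch of such an example, arranged to roughly match the line numbers in the walkthrough. The names is_pythagoras and triangles come from the discussion; the side lengths are sample data I picked.

```python
from itertools import starmap

def is_pythagoras(a, b, c):
    if a ** 2 + b ** 2 == c ** 2:
        return True
    return False

# each tuple holds three sides, with the longest side last
triangles = [(3, 4, 5), (5, 6, 8), (5, 12, 13), (7, 8, 12)]
results = list(starmap(is_pythagoras, triangles))
for index in range(len(results)):
    if results[index]:
        triangle = triangles[index]
        print(triangle, 'obeys the Pythagoras rule')
```

With this sample data only (3, 4, 5) and (5, 12, 13) are printed.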

In the code above, the function, is_pythagoras, is based on the Pythagoras rule that the sum of the squares of the two shorter sides of a right-angled triangle is equal to the square of the longest side. What the is_pythagoras function does is take three positional arguments and check whether they satisfy the Pythagoras rule. If they obey the Pythagoras rule, it returns True, but if not it returns False. Then in line 9 we created a list of tuples, each structured with 3 items representing the sides of a triangle, with the third item being the longest side. Then we passed the is_pythagoras function and the triangles list to the starmap function to check which triangles obey the Pythagoras rule. You will notice that line 10 produces a list having either True or False as entries. Then in lines 11 to 14, we checked which of the entries in the list has the True value and then printed the corresponding entry from the triangles list of tuples as the tuple obeying the Pythagoras rule.

I hope you enjoyed yourself as I did. These functions show you the powerful abilities of python. Use it to your enjoyment. To keep receiving updates from me on how to use python, you can subscribe to my blog. Thanks for reading.

Happy pythoning.

Python List Comprehension and Generator Expression Toolbox

Python list comprehensions, sometimes called listcomps, and generator expressions (genexps), are notations inspired by the programming language Haskell. Their aim is to make code more compact, faster, and optimized provided the author does not make the code unreadable. Many a programmer has found these two notations extremely useful when they want to write pythonic code.

python list comprehension and generator expressions

 

To give you an idea behind the inspiration of python generator expressions and list comprehensions along with the syntax for forming them, let’s take a python for loop that iterates through a list of items and appends those items to another list based on a Boolean expression.
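A sketch of such a for loop, using the same fruits data as the list comprehension version, would be:

```python
fruits = [(1, 'mango'), (2, 'apple'), (1, 'orange'),
          (1, 'pineapple'), (2, 'melon'), (1, 'banana')]

weighted_list = []
for item in fruits:
    # keep only the fruits whose weight (first element) is 1
    if item[0] == 1:
        weighted_list.append(item[1])
print(weighted_list)  # ['mango', 'orange', 'pineapple', 'banana']
```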

The for loop above iterates through a list where fruits are weighted 1 or 2 and placed in a tuple. It then filters all fruits that have a weight of 1 and appends them to the weighted list. To introduce you to the syntax of list comprehensions and generator expressions, we will rewrite the code using list comprehension:


fruits = [(1, 'mango'), (2, 'apple'), (1, 'orange'), (1, 'pineapple'), (2, 'melon'), (1, 'banana')]
weighted_list = [ item[1] for item in fruits if item[0] == 1]
print(weighted_list)


Compare the two outputs and you will see that they generate the same lists. So, now that you have seen a live demonstration of how a python list comprehension is written, let me explain the syntax of python list comprehensions and generator expressions.

Syntax of python list comprehensions and generator expressions.

The basic syntax for python list comprehension is:

[ expression for item in iterable if condition ]

The basic syntax for generator expression is also:

( expression for item in iterable if condition )

Each consists of an expression followed by a for clause, which can be followed by an optional if clause. Notice that they both have the same syntax. The only difference is that python list comprehensions are surrounded by square brackets while python generator expressions are surrounded by parentheses. Also, a list comprehension returns the results as a list while a generator expression returns an iterator. Take note of this difference in what they return because it is very important.

If you want a refresher on what an iterator is or what iterables are, just click on the links.

So, having that syntax, let us show examples of common operations you can use with list comprehensions and generator expressions.

Common operations of list comprehensions and generator expressions.

Two common operations that these two notations perform are: (1) To perform some operation on every element of an iterable. (2) Selecting a subset of elements of an iterable that meets some condition.

Most of the operations you will perform with list comprehensions or generator expressions will fall under one of these two broad categories.

  1. Performing some operation on every element of an iterable.

    Let’s demonstrate this with python list comprehension examples and generator expression examples.

    Suppose you want to add up all the elements of a range of numbers as they are produced. You could most probably use a generator expression for a compact and optimized code. Here is how:

    
    sum_of_num = sum(x for x in range(1, 21))
    print(sum_of_num)
    

    With the code above I just added all the numbers from 1 to 20, and the output was 210. Note that since the python generator expression produces an iterator, sum function takes each element of the iterator and adds them together. When you have an iterator that needs operations on their elements, just think of generator expressions.

    We could also perform operations on elements of an iterable using list comprehension.

    
    sum_of_num = sum([ x for x in range(1, 21)])
    print(sum_of_num)
    

    Notice that I enclosed the list comprehension inside the sum function. This is because the sum function can take any iterable, and the list that the list comprehension produces is an iterable. Like the generator expression, the list comprehension produced the sum of the numbers as 210. I want you to study both syntaxes very well and make sure you understand what I did. Now, for further clarification, let me show you the lengthier for loop that the above codes replaced.

    
    summed = []
    for x in range(1, 21):
        summed.append(x)
    print(sum(summed))
    

    You can see that the python list comprehension and generator expression look more pythonic. One liners are so beautiful. Just compare them and see for yourself.

  2. Selecting a subset of elements of an iterable that meets a condition.

    Sometimes we want to select elements of an iterable based on a condition being False or True. List comprehensions are usually the handy tool for that job. When I see problems like this, I choose list comprehensions because generator expressions, being iterators, usually need a function to help bring out the elements. But I will show you how to do this action using both.

    For example, suppose we have some numbers and we want to select only the even numbers in the list based on whether the numbers are divisible by 2 without remainder. This is how list comprehension could be written for it.

    Run the code above and see for yourself, and study the syntax of the list comprehension. You should see a printout of the even numbers selected as a list. Yes, that’s what a list comprehension produces: a list. Now we could do this with a generator expression, but because we don’t have a function acting on the elements selected, it doesn’t look too elegant. Remember that the generator expression produces an iterator and that, by definition, an iterator implements the __next__() special method. So we would use the built-in next() function to bring out the items selected from the range one at a time. But since they are ten in number, we would have to call next() ten times.

    You don’t expect me to call the next() method 10 times, do you? So you see, for occasions like this when you just need the numbers without doing any operation on them, a list comprehension would suffice.
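The even-number selection discussed in the second operation can be sketched with both notations as follows. The range of 1 to 20 is an assumption on my part; it was chosen because it contains exactly ten even numbers.

```python
numbers = range(1, 21)

# List comprehension: the selected values come back as a list right away.
even_list = [x for x in numbers if x % 2 == 0]
print(even_list)  # [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

# Generator expression: values are produced one at a time, on request.
even_gen = (x for x in numbers if x % 2 == 0)
first = next(even_gen)
second = next(even_gen)
print(first, second)  # 2 4

# The remaining eight values can be pulled out in one go with list().
rest = list(even_gen)
print(rest)  # [6, 8, 10, 12, 14, 16, 18, 20]
```

Calling list() on the generator expression saves you from typing next() ten times, but at that point you may as well have written the list comprehension.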

So, we have our two common broad operations where list comprehensions and generator expressions are usually used.

So, what are the differences between the list comprehension and generator expression?

Differences between list comprehension and generator expression.

The first difference is that while list comprehension will produce a list, a generator expression will produce an iterator. And remember from the post on iterators, they don’t need to materialize all their values at once unless you need those values.

The next difference is that if you are dealing with iterables or iterators that return a very large amount of data, list comprehensions can eat up your memory, and on an infinite stream a list comprehension would never finish. List comprehensions are not memory friendly in this instance because they have to hold every element of the list in memory at once. So, this is where generator expressions trump list comprehensions: they are more memory friendly, giving you data only when you need it.
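As a rough illustration of the memory point, the sketch below compares the size of a list with the size of a generator object using sys.getsizeof. The exact numbers depend on your python version and platform, so I only compare them and do not quote figures. Also note that sys.getsizeof reports the size of the container itself, not of the numbers stored in it.

```python
import sys

# one hundred thousand squares, built eagerly and lazily
big_list = [x * x for x in range(100_000)]
big_gen = (x * x for x in range(100_000))

list_size = sys.getsizeof(big_list)   # grows with the number of elements
gen_size = sys.getsizeof(big_gen)     # stays small whatever the range is
print(list_size, gen_size)
print(gen_size < list_size)           # True

# both give the same total when a function consumes them
print(sum(big_gen) == sum(big_list))  # True
```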

As I explained above, you surround python list comprehensions with square brackets while python generator expressions are surrounded with parentheses. I just wanted to repeat it. Same syntax but different environment.

As I showed you above, when you just want to select items, it might be better to use a list comprehension but when you want a function to act on the items themselves, a generator expression works better.

Python List comprehensions and generator expressions support nesting

There are times when you have a nested loop. Python list comprehensions and generator expressions also support nesting. Any level of nesting. But be careful not to nest too deeply that the code becomes unreadable.

Let’s show some nesting examples.
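Here is a minimal nested sketch that pairs every name with every fruit. The names and fruits are hypothetical sample data.

```python
names = ['Michael', 'Ada']                  # hypothetical names
fruits = ['mango', 'apple', 'orange']       # hypothetical fruits

# the first for clause is the outer loop, the second is the inner loop
pairs = [(name, fruit) for name in names for fruit in fruits]
print(pairs)
print(len(pairs))  # 6
```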

In the code above, I wanted to create a list of name-to-fruit tuples. If a list comprehension outputs a tuple, you have to surround the tuple with parentheses as I did above; leaving them out is a syntax error in Python 3 because the expression becomes ambiguous. This just shows that nesting is possible in python list comprehensions and generator expressions.

Nesting can also involve the if statement but be careful that they are well arranged.
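As a small sketch of nesting with conditions (the ranges are arbitrary sample data), an if clause can come at the end of the nested loops or after each for clause:

```python
# one condition after both loops: keep pairs where x is smaller than y
ordered = [(x, y) for x in range(1, 4) for y in range(1, 4) if x < y]
print(ordered)  # [(1, 2), (1, 3), (2, 3)]

# a condition after each for clause: odd x values, y values above 1
filtered = [(x, y)
            for x in range(1, 6) if x % 2 == 1
            for y in range(1, 4) if y > 1]
print(filtered)  # [(1, 2), (1, 3), (3, 2), (3, 3), (5, 2), (5, 3)]
```

Splitting the comprehension over several lines, as in the second case, helps keep the loops and their conditions readable.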

So, that is the guide to list comprehensions and generator expressions. Use them with responsibility.

Happy pythoning.

5 Python Directory Handling Techniques

Directories and files are crucial to a programmer who needs resources for his programs. That is why, after discussing python’s file handling methods, it makes sense to also understand python’s directory handling methods or routines. In this post, I will describe 5 routine ways one can handle directories using the methods provided in python, such as the python make directory method and the get working directory method.

python directory methods

 

To be able to run any of the commands in this post, you first need to import the os module into your interpreter. To do that you use the code: import os.

Getting the python current directory

The current directory is the directory from which the python interpreter is operating. It depends on how you launch your interpreter or your editor. To know your current working directory is easy. You just need to call the python get current working directory method, os.getcwd(). Here is an example:


import os

working_dir = os.getcwd()
print(working_dir)

The code above prints the current working directory on your terminal.

You might desire to change your current working directory. Maybe you want to do some experiment on some programs and want to run them on a directory you intend to delete later; I do that all the time. Changing the current working directory is easy with python. You use the python change working directory method, whose syntax is: os.chdir(path). It states that you have to provide a path as an argument for the directory you want to switch to. Path should be based on the path specification of your operating system. It is wise to make path a string in all cases.

An example will suffice.


import os

os.chdir('C:\\Users\\Michael\\Desktop\\')
print(os.getcwd())

Notice above that I escaped each backslash by doubling it. This is because the backslash is a special character in python strings. When I ran the above code, it changed my working directory to ‘C:\Users\Michael\Desktop\’. Also, note that I am working on a Windows 10 computer; if you are using Unix, Linux or Mac, your paths will use forward slashes instead.

Creating New Directories with Python

There are occasions you want to create new directories, or what some call make new directories in python. Python can do this very easily when you use the right methods. There are two methods provided in python: the python make directory method, os.mkdir, and the python make directory recursive method, os.makedirs, which acts recursively by creating more than one directory as long as the directories do not already exist.

The syntax of the os.mkdir method is: os.mkdir(path, mode=0o777, *, dir_fd=None) where path is the name of the directory you want to create. You can leave the other keyword defaults as is because on some systems the mode parameter is just ignored and directory file descriptors, dir_fd, are not implemented.

Supposing we want to create a directory called, new_dir, we could try the following:


import os

try:
    os.mkdir('new_dir')
except FileExistsError:
    print('Directory already exists.')
else:
    print('Directory created successfully.')        

I used a try statement to handle the case where the directory already exists. This is because a FileExistsError is raised if the directory already exists. That gives peace of mind.

The os.makedirs method is also used to create directories but it does this recursively. That means it creates any missing intermediate directories along the path as well as the last (leaf) directory. The syntax of the method is: os.makedirs(name, mode=0o777, exist_ok=False), where name is the path of the directory you want to create. It has one argument that the python make directory method lacks and that is worth mentioning: the exist_ok keyword argument. With the default, exist_ok=False, a FileExistsError is raised if the leaf directory already exists; set it to True if you do not want an error in that case. Let’s use an example:


import os

try:
    os.makedirs('new_dir\\second_dir\\third_dir', exist_ok=True)
except FileExistsError:
    print('Directory already exists.')
else:
    print('Directory created successfully.')        

If you run the above on your machine, it creates the directories second_dir and third_dir (remember new_dir has already been created) and prints: ‘Directory created successfully.’ Intermediate directories that already exist, like new_dir, are never a problem. The exist_ok argument only matters for the leaf directory: because I set it to True, running the code a second time, when third_dir already exists, still prints the success message instead of raising a FileExistsError. The exist_ok argument comes in handy.
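To see exactly when exist_ok makes a difference, here is a small sketch. It assumes a scratch directory from tempfile and builds the path with os.path.join so it runs on any operating system; both are my choices for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, 'new_dir', 'second_dir', 'third_dir')

    # First call: creates new_dir, second_dir and third_dir in one go.
    os.makedirs(target)
    created = os.path.isdir(target)

    # Second call without exist_ok: the leaf already exists, so it fails.
    try:
        os.makedirs(target)
        raised = False
    except FileExistsError:
        raised = True

    # Third call with exist_ok=True: an existing leaf is silently accepted.
    os.makedirs(target, exist_ok=True)

print(created, raised)
```

Both values print as True: the tree is created once, and only the repeat call without exist_ok raises.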

How to remove a directory in python

The methods under this category come in handy when you no longer need a directory. You can programmatically remove directories using python with the python remove directory method, os.rmdir, and the python remove directory recursive method, os.removedirs. The latter removes directories recursively. I didn’t tell you in the earlier post on file handling, but you can also remove files if you want to using the python remove file method, os.remove. I will describe all three here.

To remove a single directory, you use the python remove directory, os.rmdir, method. The syntax of the method is: os.rmdir(path, *, dir_fd=None). Path is the name of the directory you want to remove. With this method, you cannot remove directory trees or directories that are not empty; if you try, it raises an OSError exception. If the directory does not exist, it raises a FileNotFoundError.

Here is what happened when I tried to remove the new_dir created earlier, which still had child directories:


import os

try:
    os.rmdir('new_dir')
except OSError:
    print('Directory not empty.')
else:
    print('Directory successfully removed.')        

It printed out: ‘Directory not empty.’ That means I cannot remove a directory that still has child directories using this method. Not to worry, the second method, the python remove directory recursive method, os.removedirs, can help.

The syntax of the os.removedirs method is: os.removedirs(name) where name is the path of the leaf directory you want to remove. It removes the leaf first and then works upwards, removing each parent directory in the path until it reaches one that is not empty.

In the example below, I wanted to remove all the directories and sub-directories we created when making directories.


import os

try:
    os.removedirs('new_dir\\second_dir\\third_dir')
except OSError:
    print('Directory not empty.')
else:
    print('Directory successfully removed.')        

It ran successfully and printed: ‘Directory successfully removed.’ To ensure it doesn’t raise an OSError exception, you should make sure that the leaf directory, third_dir, is empty i.e it doesn’t contain any files.
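Here is a minimal sketch of how os.removedirs prunes parent directories and where it stops. The scratch directory and the file keep.txt are hypothetical names I made up for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    leaf = os.path.join(base, 'new_dir', 'second_dir', 'third_dir')
    os.makedirs(leaf)

    # A file next to new_dir keeps the base directory non-empty,
    # so the upward pruning stops there.
    with open(os.path.join(base, 'keep.txt'), 'w') as fobj:
        fobj.write('stay')

    os.removedirs(leaf)

    new_dir_gone = not os.path.exists(os.path.join(base, 'new_dir'))
    base_kept = os.path.isdir(base)

print(new_dir_gone, base_kept)
```

The leaf and its two parents disappear, while the base directory survives because it still contains keep.txt.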

Now, let’s show the bonus method on how to remove a file.

The method for removing files is the python remove file, os.remove, method. The syntax is: os.remove(path, *, dir_fd=None). Path is the name of the file, given either as an absolute path or relative to the current working directory. On Windows, if the file is in use, the method raises an error. If path is a directory or does not exist, an OSError (or one of its subclasses) is raised.

In this example here, I want to remove a file that was used when we discussed the file handling methods in an earlier post:


import os

os.remove('eba.txt')
if os.path.isfile('eba.txt'):
    print('File not removed.')
else:
    print('File removed.')    

It ran successfully and printed: ‘File removed.’
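If the file might not exist, os.remove raises a FileNotFoundError, so you may want to wrap the call in a try statement as we did for directories. Here is a sketch that uses a throwaway file from tempfile.mkstemp (my assumption, so that eba.txt is left alone) and tries to remove it twice.

```python
import os
import tempfile

# Create a throwaway file so the example does not touch eba.txt.
fd, temp_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)  # mkstemp returns an open descriptor; close it before removing.

results = []
for attempt in range(2):
    try:
        os.remove(temp_path)
    except FileNotFoundError:
        results.append('File does not exist.')
    else:
        results.append('File removed.')

print(results)
```

The first attempt removes the file and the second one reports that it no longer exists.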

How to rename a directory

We can programmatically rename a file or directory in python. There are two methods for this. The python rename method, os.rename, renames a single file or directory, while the python rename recursive method, os.renames, also creates any intermediate directories needed for the new name and removes the old parent directories that are left empty.

The syntax for the os.rename method is: os.rename(src, dst, *, src_dir_fd=None, dst_dir_fd=None) where src means the source file or directory, and dst means the new name you intend to give the source. The dst or new name should not already exist otherwise the operation will raise an OSError exception or that of one of its subclasses depending on the operating system used.

Here is an example of usage:


import os

try:
    os.mkdir('new_dir')
    print('Directory created successfully.')
    print('Now attempting to rename it.')
    os.rename('new_dir', 'old_lady')
except FileExistsError:
    print('Directory already exists.')
except OSError:
    print('Couldn\'t rename the directory.')
else:
    print('new_dir changed successfully to old_lady.')    

In the example above, I first created a new directory, new_dir, and when it went successfully without raising an error, I then attempted to rename it from new_dir to old_lady. If old_lady already exists, it will raise an OSError exception which I would handle by printing out: ‘Couldn’t rename the directory’ but if it doesn’t exist already, the renaming would run successfully, (which happened) and then print out: ‘new_dir changed successfully to old_lady.’

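Now we can do this recursively. What if we have a directory tree with an empty leaf directory and want new names along the whole path? We would have to use the python rename recursive, os.renames, method.

The syntax of the os.renames method is: os.renames(old, new) where old is the existing path and new is the new path. It works like os.rename, except that any intermediate directories needed for new are created first, and after the rename the rightmost directories of old are removed if they have become empty.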

Let’s take an example from above again. This time, we want to rename all the directories and sub-directories.


import os

try:
    os.makedirs('new_dir\\second_dir\\third_dir', exist_ok=True)
    print('Directories created successfully.')
    print('Attempting renaming of the directories created.')
    os.renames('new_dir\\second_dir\\third_dir', 'my_first\\my_second\\my_third')
except FileExistsError:
    print('Directories already exists.')
except OSError:
    print('Couldn\'t rename the directories.')
else:
    print('Renamed all three directories recursively.')                

From the above, you can see that I first created a directory tree, new_dir\second_dir\third_dir, and when it was created successfully, I attempted to rename the whole path within the same try statement. If you do not have the necessary permissions to rename the directory, then the operation will fail. But if the permissions are available and the directories exist as stated in the old path, then they will be renamed and the code will print: ‘Renamed all three directories recursively.’
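To check what os.renames leaves behind, here is a small sketch that runs inside a scratch directory (an assumption for the illustration) and then looks at both the old and the new paths.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    old = os.path.join(base, 'new_dir', 'second_dir', 'third_dir')
    new = os.path.join(base, 'my_first', 'my_second', 'my_third')
    os.makedirs(old)

    # Creates my_first and my_second, moves third_dir to my_third,
    # then removes second_dir and new_dir because they are now empty.
    os.renames(old, new)

    new_exists = os.path.isdir(new)
    old_top_gone = not os.path.exists(os.path.join(base, 'new_dir'))

print(new_exists, old_top_gone)
```

Strictly speaking only the leaf is moved; the new parents are created and the old empty parents are pruned, which is why the result looks like all three directories were renamed.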

You can be creative and try out your own examples to see how it will run.

How to list all the files and nested directories of a directory

I am using windows, so Linux or Unix users pardon me if my example is Windows based. If on windows you want to list the contents of a directory, you use the command ‘dir’ on the command line and it gives you a listing. You can do the same with python. Python has two methods for doing so: a python list directory, os.listdir, method and an optimized python scan directory, os.scandir, method.

It is recommended that you use the python scan directory, scandir, method for most cases, but let me show a working example of the python list directory method. The syntax of the python list directory method is: os.listdir(path='.') where path is the name of the directory whose contents you want to list. The path parameter is optional and where omitted, it defaults to the current working directory. The method returns a list of all the files and directories that are contained in the directory named path.

Here is an example:


import os

dir_list = os.listdir()
for file in dir_list:
    if os.path.isfile(file):
        print(f'{file} is a file.')
    else:
        print(f'{file} is a directory.')    

The above code first returns a list of all the files and directories in the current working directory as dir_list. Then I iterate through the list in a for loop and print out whether an item is a file or a directory. This gives you a listing that is similar to the Windows ‘dir’ command.

Now, for the optimized python scan directory, scandir, method. The syntax of the optimized scan directory method is os.scandir(path='.') where path is the name of the directory. Scandir returns an iterator which yields objects that correspond to the files or nested directories in the path name. From each object you can get the entry’s name and find out whether it is a file or a directory. (If you want a refresher on iterators or on python generators that yield objects then click on the corresponding links.) Because these objects carry file type and attribute information, code that needs it runs faster, provided the operating system supplies this information while scanning the directory.

Since the iterator produced by the python scan directory method holds a resource, you should close it when you are done, either by calling its close method, scandir.close(), or better, by using a with statement. Otherwise the resource is only released when the iterator is exhausted or garbage collected.

In the example below, we will list the contents of the current working directory again, but this time showing how to do it with the iterator returned by the python scan directory method.


import os

with os.scandir() as my_dir:
    for item in my_dir:
        if item.is_dir(follow_symlinks=False):
            print(f'{item.name} is a directory.')
        else:
            print(f'{item.name} is a file.')      

I used the with statement so that python will automatically close the iterator as soon as the operation ends. You will notice that each object, item, yielded has attributes and methods of its own. In this example, the item object has a name attribute, item.name, and an is_dir method, item.is_dir. This is because the objects are os.DirEntry objects. By default, is_dir follows symbolic links, so a link that points to a directory is reported as a directory. I switched the follow_symlinks parameter to False so that only real directories are reported as directories and symbolic links fall into the other branch.
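The os.DirEntry objects can tell you more than the name and type. As a sketch, the code below also reads each entry’s size through its stat method. The scratch directory and the names sub_dir and notes.txt are made up for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Build a tiny directory to scan: one sub-directory and one file.
    os.mkdir(os.path.join(base, 'sub_dir'))
    with open(os.path.join(base, 'notes.txt'), 'w') as fobj:
        fobj.write('hello')

    listing = {}
    with os.scandir(base) as entries:
        for entry in entries:
            kind = 'directory' if entry.is_dir(follow_symlinks=False) else 'file'
            size = entry.stat(follow_symlinks=False).st_size
            listing[entry.name] = (kind, size)

for name in sorted(listing):
    print(name, listing[name])
```

Collecting the results in a dictionary before printing keeps the with block short, so the iterator is closed as early as possible.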

Now you have been equipped to use python’s directory handling functions. Go experiment with what you can do with them.

Happy pythoning.

7 Important File Handling Functions In Python

Computer files, or resources for recording discrete data, are ubiquitous in python programs. File handling in python treats files as either textual or binary files, and python itself places no limit on the size of the files it can work with. In this post, we will be discussing textual files while in subsequent posts we will discuss binary files and how python handles them. Seven basic functions for handling textual files are discussed.


The Built-in Python Open File function

The built-in python open file function is the first function you will encounter when you want to open any sort of file in python. It is used to open a file and it returns a file object. The syntax for the python open file function is open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None) but for working on textual files, we will focus on the parameters file and mode.

The file parameter to the open file function is the pathname of the file, either absolute or relative to the current working directory. The pathname depends on the file system of the operating system. If the file is not found on the call to open file, the function raises a FileNotFoundError exception.

The mode parameter specifies the mode in which the file is to be opened. The default mode is ‘r’ which means open for reading text. Other values are ‘w’ meaning open for truncating and writing to the file, and ‘a’ meaning open for appending to the end of the file. Two further characters can be combined with these: ‘b’ meaning open as a binary file and ‘+’ meaning open for updating (reading and writing). If you want to write to the file without truncating it, then use the combination mode, ‘r+’, which starts writing at the beginning of the file, overwriting what is already there. If you want to write starting from the end, then use ‘a’.

Now, let’s use some examples.

I will be working with the following text file, eba.txt, that is in my working directory. The contents of the file are:

Nothing beats a plate of eba
as we generally want to eat it
but it often gets stuck in our throats
where it is eaten without a good soup
that is oily and makes the eba
which is a gelly and very hard 
to move smoothly down our throats

You can also download the file here or copy and paste it if you want to use it so you can get the same results as I did.

Now, let’s just open the file without doing any reading or writing. Those functions will come later. After opening the file, we will then close it. It is good practice to always close your files.


try:
    fobj = open('eba.txt', 'r')
except FileNotFoundError:
    print('File doesn\'t exist.')
else:
    print('File opened successfully and file object created.')
    # Close only when the open succeeded; fobj is undefined otherwise.
    fobj.close()

A more pythonic way to do the above, that is, open the file resource and then close it automatically would be writing the following line of code:


with open('eba.txt', 'r') as fobj:
    print('File opened successfully.')

If the file opens successfully, the print will run but if not, you will get a stack trace about a FileNotFoundError.

So you now know how to open textual files. What remains is to do something with them while they are open. The remaining functions deal with that.

The Python Read File Function

The syntax for the python read file function is read(size=-1) where size specifies the number of characters to read from a text file (for a binary file it is the number of bytes). The default is -1, which means read everything up to the end of the file and return it as a single string. If size is given, at most size characters are returned. Be careful when reading everything at once because if the file is very large, it could use up your system memory.

So, for some examples. Remember that we are using the eba.txt file whose contents I posted above.

Suppose we want to read only 20 characters from the file. We will use this code:


with open('eba.txt', 'r') as fobj:
    s = fobj.read(20)
    print(s)

Our output would be:

Nothing beats a plat

Just the first 20 characters in the file. Later, I will show you how to read from any position with random access to the file.
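The size argument is also what lets you work through a large file without loading it all at once. Here is a minimal sketch of that idea. The helper read_in_chunks is a name I made up, and the sample text goes into a temporary file so the example does not depend on eba.txt.

```python
import os
import tempfile

def read_in_chunks(path, chunk_size=20):
    """Yield successive pieces of a text file, chunk_size characters at a time."""
    with open(path, 'r') as fobj:
        while True:
            chunk = fobj.read(chunk_size)
            if not chunk:  # read returns '' at the end of the file
                break
            yield chunk

fd, temp_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
try:
    with open(temp_path, 'w') as fobj:
        fobj.write('Nothing beats a plate of eba')  # 28 characters
    chunks = list(read_in_chunks(temp_path))
finally:
    os.remove(temp_path)

print(chunks)
```

With a chunk size of 20, the 28 characters come back as two pieces, and only one piece is held in memory at a time while reading.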

The Python Readline function

The syntax for the python readline function is readline(size=-1) and unlike the read function, it reads and returns one line from the stream. It starts with the first line. You can customize it by specifying size, and then at most size characters of the line will be read. The end of a line is determined by the newline parameter of the python open file function. The default is to recognise the usual line endings and translate them to ‘\n’. For most uses, the default newline is okay.

Now, to use some examples to illustrate. Imagine we had this code to just read the first line from the text file given.


with open('eba.txt', 'r') as fobj:
    s = fobj.readline()
    print(s)

The output we will get on the terminal is:

Nothing beats a plate of eba

This is the first line of the eba.txt file.

You will notice when you run the above that an extra blank line is printed after the line. This is because the string returned by readline still ends with ‘\n’ and print adds a newline of its own. You could remove that trailing newline by calling the strip function on the string object returned by the python readline function. Compare the output for the code below using strip function and that for the code above without the strip function on your machine.


with open('eba.txt', 'r') as fobj:
    s = fobj.readline()
    print(s.strip())
 

Using the strip function on the string now strips away the added newline and gives a more beautiful rendering.

The Python Readlines Method

The python readlines method is in plural because it reads multiple lines. The syntax for readlines is readlines(hint=-1) which states that the readlines function reads and returns a list of lines from the stream. The hint parameter lets you limit how much is read: once the total size of the lines read so far exceeds hint, no further lines are read. The default is to read all the lines and return them as a list. Please, use this function carefully. In fact, if your file is very large, it could have a detrimental effect on your system memory. This is because to return the lines, it first needs to create a list of all the lines and this takes memory space.

An example to show how the readlines method works.


with open('eba.txt', 'r') as fobj:
    s_list = fobj.readlines()
    print(s_list)

Which gives the following list as output:


['Nothing beats a plate of eba\n', 
'as we generally want to eat it\n', 
'but it often gets stuck in our throats\n', 
'where it is eaten without a good soup\n', 
'that is oily and makes the eba\n', 
'which is a gelly and very hard \n', 
'to move smoothly down our throats']

It is recommended that you avoid using readlines because there are other ways to go about reading all the lines from your files without impacting on memory. One of them is to use a python for loop to iterate through the file object. This is because a python file object is already an iterable.

The above could be achieved with the following for loop code:


with open('eba.txt', 'r') as fobj:
    for line in fobj:
        print(line)

We have been reading and reading from files. Now, we want to write to files. We will now use the python write to file method.

The Python Write to File Method

The syntax for the python write to file method is write(s) which specifies writing the string, s, to the file and returning the number of characters written.

The ability to write to the stream or file depends on whether it supports writing. To make this possible, we need to specify this support when opening the file and creating a file object. This is made possible by specifying the writable mode on the open file function (the open file function was explained above). The writable modes are:

r+ Update the file i.e. read and write to the file. When the write function is called on a freshly opened file, it writes the specified string, s, at the beginning of the file, overwriting the characters already there.
w It truncates the file first and then writes the string s to the file. You lose all your former file contents with this mode.
a Append the string, s, to the end of the file. It writes onto the last line. If you want it to write to a new line at the end, you need to add a newline character at the beginning of the string, s.

Now, compare the following code snippets on your machine and see how they run:


with open('eba.txt', 'a') as fobj:
    s = fobj.write('This line was written.')

with


with open('eba.txt', 'w') as fobj:
    s = fobj.write('This line was written.')

and with:


with open('eba.txt', 'r+') as fobj:
    s = fobj.write('This line was written.')

You will notice that the way the file, eba.txt, is written to differs based on the mode passed to the open function. The python write to file method is one of the methods you will most often use when working with files.
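If you would rather see the three modes side by side without overwriting eba.txt, here is a sketch that uses a fresh temporary file for each mode. The helper write_with_mode and the sample strings are made up for the illustration.

```python
import os
import tempfile

def write_with_mode(mode):
    """Start from a file containing '0123456789', write 'abc' using the
    given mode, and return the resulting file contents."""
    fd, path = tempfile.mkstemp(suffix='.txt')
    os.close(fd)
    try:
        with open(path, 'w') as fobj:
            fobj.write('0123456789')
        with open(path, mode) as fobj:
            fobj.write('abc')
        with open(path, 'r') as fobj:
            return fobj.read()
    finally:
        os.remove(path)

results = {mode: write_with_mode(mode) for mode in ('a', 'w', 'r+')}
print(results)
```

Appending gives ‘0123456789abc’, truncating gives ‘abc’, and updating gives ‘abc3456789’, because ‘r+’ overwrites from the start and leaves the rest in place.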

The Python seek function

With this method, you can change the current stream position so that when you call the python read file or python write to file methods, the operation starts there rather than at the default position. The syntax for the python seek method is seek(offset, whence=SEEK_SET) where offset is the position you want the stream to go to and whence says what the offset is measured from; the default, SEEK_SET, means the start of the file. The seek method returns the new position of the stream.

For example, if you want to read the eba.txt file from the 35th byte or character in the file and then output the next 55 characters or bytes, you could change the current stream position using seek to be 35 and then do a read with size 55. Here will be the code:


with open('eba.txt', 'r+') as fobj:
    num = fobj.seek(35)
    s = fobj.read(55)
    print(s)

The output you would get from the eba.txt file is:

generally want to eat it
but it often gets stuck in ou

Showing just those 55 bytes of characters.
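A companion to seek is the tell method, which reports the current stream position so that you can come back to it later. The sketch below writes one line of the eba.txt text to a temporary file (an assumption, so the real file is untouched) and jumps around in it.

```python
import os
import tempfile

fd, temp_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
try:
    with open(temp_path, 'w') as fobj:
        fobj.write('Nothing beats a plate of eba')

    with open(temp_path, 'r') as fobj:
        first = fobj.read(7)   # read the first word
        mark = fobj.tell()     # remember where we are now
        fobj.read()            # read on to the end of the file
        fobj.seek(mark)        # jump back to the remembered position
        again = fobj.read(6)
        fobj.seek(0)           # offset 0 is the start of the file
        start = fobj.read(7)
finally:
    os.remove(temp_path)

print(repr(first), repr(again), repr(start))
```

For text files, it is safest to seek only to 0 or to a position that tell returned earlier, since other offsets may not line up with character boundaries.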

The last method we will consider is truncate.

The Python truncate file method

With the python truncate file method, you are able to change the size of the file. The syntax for the truncate file method is truncate(size=None) where size is the new size of the file. Where size is not specified, the file is resized to the current position of the stream, so everything after that position is cut off. If size is smaller than the original file size, the file is truncated, but if it is larger than the original, the file is extended and the contents of the new area depend on the platform (on most systems it is filled with zeros). For the python truncate file method to work, the file must support updating or writing, which you ensure by opening the file in a writable mode as described above.

Like the write method, truncate needs a file opened in a writable mode, and it returns a number: the new size of the file.
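Here is a short sketch of truncate, once with an explicit size and once relying on the current stream position. It works on a temporary file with made-up contents rather than eba.txt.

```python
import os
import tempfile

fd, temp_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
try:
    with open(temp_path, 'w') as fobj:
        fobj.write('0123456789')

    # Explicit size: keep only the first 4 characters.
    with open(temp_path, 'r+') as fobj:
        new_size = fobj.truncate(4)
    with open(temp_path, 'r') as fobj:
        after_size = fobj.read()

    # No size: cut the file off at the current stream position.
    with open(temp_path, 'r+') as fobj:
        fobj.seek(2)
        fobj.truncate()
    with open(temp_path, 'r') as fobj:
        after_position = fobj.read()
finally:
    os.remove(temp_path)

print(new_size, after_size, after_position)
```

The first call shrinks the file to ‘0123’ and returns 4; the second, after seek(2), leaves just ‘01’.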

So, I have given you ideas on what you can do with your files and file objects. The next post will be on how to handle python directories. Please, watch out for it. And subscribe to this blog so you can get regular updates when I post new articles.

Happy pythoning.
