Search

Showing posts with label coding. Show all posts
Showing posts with label coding. Show all posts

Python Graph Traversal Algorithms: DFS and BFS algorithms

Graph traversal (or graph search) is an important concept in computer science and programming. It refers to the process of visiting each node or vertex in a graph and checking or updating them. The algorithms that denote how this visitation are done are used to classify the type of traversal. I will be considering a python depth first search algorithm and a python breadth first search traversal of a graph in this post by applying them to finding the shortest path in a graph.

graph traversal algorithms

 

I believe that you are familiar with the graph classes I created in an earlier post. Because I will be using the objects of those classes as the model for the traversals we are going to be doing. If you are not familiar with the classes, I will encourage you to check the digraph and graph classes here. It only takes a minute or two.

Graph traversal has many applications in programming. It could be used to find all the nodes or vertices in a connected graph, find the shorted path between two nodes, test a graph for bipartiteness, or even analyse relationship and networks in a graph.

In this tutorial, we will be looking for the shortest path between two nodes using our algorithms.

The Python Depth First Search (DFS) traversal algorithm for shortest path in a graph

This depth first search algorithm requires that each node is called on exactly once for each connection on the graph. The algorithm is a recursive algorithm that ends when we find the end node in the path.

The search works by starting from a start node, and then it looks for all its adjacent nodes (i.e nodes connected by an edge), and it iterates through each of these nodes, also looking for their adjacent nodes recursively until it gets to a node without an adjacent node in the directed graph, or it gets to the end node. I modeled it as a directed graph because searching is directed in order to avoid cycles.

The graph below illustrates how graph search works in depth first search traversal. 

depth first search example
By Mre - Own work, CC BY-SA 3.0

Now, to implement the algorithm, we will be using the graph below. This is a directed graph with 5 nodes and 6 edges.

directed graph example


From the graph above, we will be taking a path from node A to node E.

You can run the code below to see how it works. The explanation of the relevant parts comes after. You can download the code for use on your machine at the last paragraph of this post.

If you read the code, you can see that the graph is modeled after the Digraph object, digraph, which explanation for the Digraph class is contained in the link I posted some paragraphs before this. The new parts of code that needs to be explained are the print_path function, the DFS function, and the driver code.

The code for the print_path function is found from lines 67 – 74.

    
def print_path(path):
    '''assumes path is a list of nodes'''
    result = ''
    for i in range(len(path)):
        result = result + str(path[i])
        if i != len(path) - 1:
            result = result + '-->'
    return result 

The code prints out each path or set of connected nodes while going through all the nodes in the graph searching for the end node. It uses a loop that iterates through the list of nodes in the path and concatenates each node with an arrow to represent relationship. It stores the name of each node in a variable, result, and prints out result at the end of the function.

The most interesting aspect of the code is the DFS function which does the actual python depth first search traversal. Here is the code from lines 76 to 90.

    
def DFS(graph, start, end, path, shortest, to_print = False):
    '''Assumes graph is a Digraph; start and end are nodes; 
    path and shortest are lists of nodes
    Returns a shortest path from start to end in graph '''
    path = path + [start]   #creation of visited nodes list
    if to_print:
        print('Current DFS Path:', print_path(path))
    if start == end:  # base case and end of the recurisve loop
        return path
    for node in graph.children_of(start):
        if node not in path:  #avoid cycles 
            if shortest == None or len(path) < len(shortest):
                new_path = DFS(graph, node, end, path, shortest, to_print)
                if new_path != None:
                    shortest = new_path

As you can see from the code above, the function, DFS, takes 6 arguments: the graph to search, the start and end nodes, the path of visited nodes, a shortest list of nodes that is the shortest path to the end node from the start node, and a keyword argument, to_print, that states whether you want a print out of the nodes as they are visited. The path argument refers to the visited nodes in the graph. We need to keep a count of visited nodes so that we don’t go through the same node more than once because that would mean going in a cycle. Then the shortest argument is the shortest path found so far starting from the start node. When we first called the DFS function, the shortest path was given the value None to indicate that we have not found a shortest path yet.

We indicated the base case for the recursive loop in an if statement that checks to see when the function is called recursively whether the start and end loops are the same. This makes sure the loop ends when we have arrived at the end loop and we can create the shortest path. This is indicated in lines 83 and 84. Then from lines 85 – 90, using a for loop when the traversal gets to each node we get the children of that node in the loop. First, we check to see whether that child has been visited before and if not we check to see if it will lead to the end node and if it does if the length from the start to that node is shorter than the shortest so far. If it is longer, we stop that search and go to the next child in the loop but if shorter, we recursively call the function again and where it returns a path to the end node, that becomes the shortest so far.

The function is guaranteed to return the shortest path if it exists.

Now, let’s go on to the second traversal algorithm, the python breadth first search (BFS) algorithm.

The Python Breadth First Search (BFS) traversal for shortest path in a graph.

Unlike in the DFS algorithm where we looked for the shortest path by first analyzing the child nodes, in the python BFS algorithm, we first explore the sibling nodes while looking for the shortest path. Sibling nodes are nodes with the same parent. We use a queue to understand which nodes have been visited and which has not been visited. The BFS algorithm guarantees that whenever it gets to the end node, that path discovered is the shortest path and other nodes do not need to be analyzed after that (unlike in DFS where you analyze every other node for length of path). Many programmers prefer to use the BFS algorithm for finding shortest path in a graph.

Now this is the code for the python BFS algorithm. Run it to see how it would work before I provide the explanation for new code. You can download the code for the python breadth first search (BFS) algorithm at the last paragraph of this post.

Other codes and functions are familiar except for the BFS function, so I will explain it in depth. The lines reference are from the embedded code above.

    
def BFS(graph, start, end, to_print = False):
    '''Assumes graph is a digraph; 
    start and end are nodes. 
    Returns a shortest path from start to end
    in graph '''
    init_path = [start]
    path_queue = [init_path] #store all the nodes currently being explored
    while len(path_queue) != 0:
        # get and remove oldest element in path_queue
        tmp_path = path_queue.pop(0)
        print('Current BFS path:', print_path(tmp_path))
        last_node = tmp_path[-1]
        if last_node == end:
            return tmp_path
        for next_node in graph.children_of(last_node):
            if next_node not in tmp_path:
                new_path = tmp_path + [next_node]
                path_queue.append(new_path)
        
    return None                     

The BFS function takes 4 arguments: the graph to be searched, the start and end nodes, and a keyword argument about whether to print out the visited nodes. On line 82 we create a queue of all the sibling nodes for each node that is currently being visited. This queue has the start node as its first node. On lines 88 and 89, we check if the last node in the current path being explored is the end node, if it is we have ended our search, but if it is not, then we move on to the next lines in the code.

On lines 90 – 93 we explore the child paths of the last node and add all the sibling nodes together to the path queue which will be explored in each iteration until we get to the end node.

Note that the BFS function returns the path from the start to end node if they are connected or if not, it returns None.

The DFS and BFS functions return the shortest path where they exist even while working differently. But the BFS is guaranteed to be faster in its implementation and does not go through needless iterations (unlike DFS) when the end node has been discovered because any path it finds to the end node promises to be the shortest path.

That’s it pythonistas. I hope you enjoyed my code for today. I had a swell time writing the code.

Drop a comment if you enjoyed it and say what you think needs to be changed. I would surely love to read your comments. Also, subscribe to my blog so you can receive regular updates when I post new articles. You can download the DFS code, depthfirstsearch.py and BFS codes, breadthfirstsearch.py, here.

Happy pythoning.

A Guide To Python Functions

When we write code we often have groups of related statements clustered together. It would be bad practice to continue repeating those statements. Good programmers group them together in a function. So, we will define our function as a group of related statements that perform a specific task.

python functions

 

Before we delve right into showing how functions are written in python, let’s explore the advantages of using functions in python.

Advantages of using python functions

Using functions rather than just clumping our code statements from line to line has several advantages. I will highlight three:

1.They make for code reuse – reusability.

Take for example we want to multiply a sequence of numbers 5,3, 4 etc. Writing code this way is foolhardy:

    
x = 5
y = 3
z = x * y
m = 4
n = z * m

It stresses the programmer’s resources. With a python function, we could do the multiplication any number of times with any number of objects. Like this:

    
def mult(x,y):
    return x * y

z = mult(5,3)
n = mult(z, 4)    

Code reuse becomes very important when we are dealing with a program that contains several lines of code. It would help for readability and understandability using a function.

2. They make the program manageable - modularity

Taking the example above. You could see that by breaking the statements into smaller chunks, we can manage the code. We could prevent the code from running off into several lines that would make a reader lost about the direction of the code.

When you have several functions working together, they exemplify the concept of decomposition in computer science. For example, for a laptop to function there are different parts. Take the screen, the keyboard, the charger, and the mouse as different parts exemplifying the functions of the laptop. When they work seamlessly, they achieve the idea of decomposition that makes functions a great idea for good programming practice.

3. It helps avoid errors in the code.

Using functions in coding can prevent errors, especially when you have several names that have to be defined. Python functions define their own namespace so that variables do not conflict with one another.

Now, how do you define a python function.

Python function definition.

The syntax for a python function is:

    
def function_name(parameters):
    ‘‘‘docstring’’’
    statement(s)
    return statement

The first line from the syntax above is called the function signature. The signature starts with the keyword ‘def’. The keyword ‘def’ tells the python interpreter that this is a function. Then next is the function name. The function name is unique to the function. You cannot start the function name with a number. It follows the rule of python as for writing identifiers. Then following it are the parameters of the function enclosed in parenthesis. Function parameters are the arguments the function accepts. We have discussed function parameters and arguments in another post. Following the function signature is the block of code that defines the body of the function. A colon ‘:’ is used to separate the signature from the body of the function.

The body of the function consists of docstrings which represent the documentation of the function followed by the statements that the function has to execute. Note that the documentation or docstrings are optional but it is good programming practice to include them in your functions. The documentation is enclosed by three quotation marks. Then the statements to be executed comes after the documentation. The last statement in your function should be an optional return statement. This ends execution of the function. Note that the return statement can occur anywhere in the body of the function but is usually found at the end and when they occur the function stops execution and control transfers to the caller of the function.

The function’s docstrings are a great idea for abstraction. That means, using the docstrings alone, you do not need to know how a function works to use it. The documentation or docstrings tells you what it does and it guarantees you that it works as the author states it.

When a function does not have a return statement, then that function returns None.

Let’s look at a typical function that includes all the syntax for a function. The docstrings or documentation in the function below explains what the function does.

    
def count(data, target):
    ''' Counts the number of occurrences of target
    value within data, an iterable
    Returns the number of occurrences as an int '''
    n = 0
    for item in data:
        if item == target: # found a match
            n += 1
    return n

The code above illustrates the typical features that can be found in a function.

Let’s look at another example that shows that a return statement can be placed anywhere in the body of a function but when it is encountered, it automatically returns control to the caller of the function.

In this contrived example, you can see that if the user enters Michael, the function ends at the return statement on line 7 but if some other name is entered, the return statement at line 9 ends the function and returns control to the caller of the function which is the print statement on line 12.

Now we have been talking about the caller of a function. Let’s us define how to call a function.

How to call a function.

The three ways you can call a function is either from the python prompt, from another function, or from within the program. To call a function, you write the name of the function followed by the parameters that you are passing to the function.

I will show examples where you can call a function from another function and within the program. They also apply to the python prompt.

a. From within the program.

In the function, mult, above we called the function on line 4 when the name of the function was used and then the parameters we want to pass to it were included in parentheses. Note that the parameters must correspond to the arguments of the function in the function signature. For a refresher on function arguments, you can see this post on positional and keyword arguments in python functions.

b. From within another function.

Functions are first-class citizens in python so you can call them from within another function. They can also be parameters to another function.

Take this example of a function calling other functions. Run it to see how it works.

The third function calls on the first and second functions to print their statements.

Also, see how the enter_name function below uses another function, print_name, as its argument or parameter, and then in its body it calls the function that was passed as an argument. The f formal argument is bound to the print_name actual argument that was passed when the enter_name function was called.

The function, enter_name, accepts another function as argument which was passed to it on line 8. It then calls the function in its body and that function when called prints out its statement.

These shows that functions are first class citizens and you can call them from even another function, and not only from within the program.

Variable scope and lifetime in functions.

The scope of a variable can be defined as the portion of a program where the variable is recognized. When a function is entered, the python interpreter creates a new scope or environment which contains the variables that the function recognizes. When a scope is created, names are being mapped to their values. Some of the variables that are scoped on function call are the parameters that are passed to the function. When the function is defined the parameters are called its formal parameters and when the function is called, the parameters are called actual parameters. When the function is called, a new scope is produced where the formal parameters are bound to the values of the actual parameters within the function namespace.

Also, other variables that are scoped in the function are any variables that are declared inside the function.

Take the following code for example:

    
def mult(x,y):
    ''' Returns the multiplication
    of its arguments, x and y, 
    where x and y are ints '''
    z = x * y
    return z

n = mult(2,3)    

When the function, mult, is called at the last line, python creates a new environment or scope for the variables that are recognized within the python function. First, it recognizes the formal parameters, x and y. It then binds x to its corresponding actual parameter, 2, and binds y to its corresponding actual parameter, 3. It then recognizes z that is declared within the python function. It multiplies the numbers and binds z to the value of the multiplication. So, we have 3 variables scoped within the function: x, y and z namely.

The variable n is outside the scope of the function and is said to be in a global scope.

Note that all the variables that are scoped in a function exists as long as the function is running. When the function stops running or returns, python garbage collects all the variables scoped within that function.

To see how this garbage collection of variables is done, let’s take an example.

You can see that the x in the function scope is now 30 when you run it, but the x variable in the global scope is still unchanged at 5. This is because when the function was exited the x in the function scope was destroyed and ceased to exist except for the x variable in the global scope. That shows you that the lifetime of variables in a function’s scope are for as long as the function is still running. Take note of that although there is an exception - functions defined as python generators which yield values. The following blog post explains python generator functions where the variables within its scope are not garbage collected.

One other point I want you to note about scopes is that even if a variable is not scoped within a function, that variable can still be accessed from inside the function but it cannot be modified from within the function.

Example: Global variables, x and y, accessed from within the function, mult.

Next example: We would have an UnboundLocalError if I try to modify either x or y without scoping them to mult.

I hope you now have all it takes to start writing your own functions in python and experiment with its powerful capabilities.

Happy pythoning.

Understanding and Converting File Sizes (Bytes, KB, MB, GB, PB)

Most times we want to understand what file sizes our data has. For example, when the file sizes are stated in bytes, we would want to understand what are the stated unit is in kilobytes, or gigabytes etc. Well, this can easily be coded in python.

python file sizes

 

But first, what are file sizes?

A Files size is a measure of how much data a computer file contains, or alternatively, the measure of the storage in memory it consumes. All file size units have the byte as the unit of measurement and all other units are derived from the byte.

A byte is a sequence of 8 bits processed as a single unit of information. A byte is enough to represent one unit of information. A bit is either an ‘on’ or ‘off’ position in the computer memory where “on” is represented by a ‘1’ and an “off” as ‘0’. So a byte contains 8 of such 1s and 0s. The computer processes our information as bytes.

In python, characters in strings are stored as one byte although extra bytes are allocated for storing any character so if you call sys.getsizeof(char) you may get more than one byte.

Larger bytes can be subdivided further into kilobytes, megabytes, gigabytes, terabytes, petabytes etc. The table below shows the different file units and their comparisons.

1024 bytes 1 KB
1024 KB 1 MB
1024 MB 1 GB
1024 GB 1 TB
1024 TB 1 PB

Where KB means kilobytes, MB means megabytes, GB means gigabytes, TB means terabytes and PB means petabytes.

Using the metrics above, given any number of bytes we can convert to other metric units.

I wrote a little program below that does just that. Given any number of bytes, it can convert the byte measure into larger sizes provided it is possible.

I hope you do enjoy it. I wrote it based on a request from a friend.

Happy pythoning.

How To Split A String In Python: Python Split() Method

Very often we have a long string and we want to split it. As programmers we encounter this situation often. Python has a method built in to the string data type, the python string split method, that can be used to split strings conveniently. In this post, I will describe how to use the python string split method to split strings and also the different ways you can split a string. I also describe how you can create a dictionary from a split string.

python string split

 

What is the python string split method?

The python string split method is built into the string class and has the syntax str.split(separator=None, maxsplit=-1). What the method does is take a string, represented by the name, str, and then split it based on the separator specified in the arguments. If no separator is specified, it defaults to white space. It then returns a list of the elements of the string as items of the list. The maxsplit argument specifies how many splitting has to be done.

We will cover examples for all the scenarios above shortly.

Here is how it works in practice without any argument specified.

In the example above, I did not specify any argument so it defaulted to splitting the string, string, using whitespace as the delimiter and then splitting all the available white spaces.

Do you know? You can split the string and then join them back again to get back your string? Here is an example where I joined them using the dash character.

Now, let’s discuss each of the argumernts to the python string split method.

The separator argument.

As I said above, if separator is not given, the string is split based on whitespace characters. But if separator arguments are provided, the string is split based on the separator provided in the argument.

In the code below, the comma character is the separator.

Notice that the comma character delimits each of the strings and stores them into a new list using the python string split method.

In the example below the ‘#’ character is the separator used.

If we have a string specifying an email address, we could use the ‘@’ character as the separator.

This gives you an idea of how the string split method works with the separator. Note that when you split an empty string with this method, it will result in a list with the empty string as item.

The maxsplit argument.

The maxsplit argument specifies the maximum number of splits that has to be done on the string.

The default is -1 which means there is no limit to the number of splits. When the maxsplit argument is specified, the result list has maxsplit plus one items i.e if the maxsplit is 2, the items in the resultant list are 3.

Now let’s use examples to show how the maxsplit argument works.

The code above specifies a maxsplit of 2 i.e split the string according to whitespace character twice. Notice that the number of items in the resultant list is 3. It only splits using two white space characters and leaves the remaining white space untouched.

If maxsplit is not specified, the default of -1 is used by the method which signifies split the maximum number of times. Now let’s use the same example but not specifying it.

You will notice that the string is now split the maximum number of times i.e by all the whitespace.

Now that you know how the arguments work, go experiment with them and play with python.

There is a tweak I want to show you. How to build a dictionary using the python string split method.

How to make a dictionary using the python string split method.

Very often we want to be able to use the items in a string to make a dictionary. This is simply done by casting the returned key and value pairs to a dictionary. Consider the example below:

If you read the code you will notice that I first split by the semi-colon character, ;, which separates each of the key-value pair. Then for each key-value pair in the resulting list, I split by the equal sign, =, and cast the result into a dictionary to get my dictionary data structure. Just beautiful.

Now, you have the tricks and treats on how to use the python string split method. Go use it with pleasure.

Happy pythoning.

Python Regex For Mobile Phone Number Validity Check

Because there was a huge response from readers towards the email validity check and Roman numerals validity check algorithms I wrote using python regex, I decided to write another common place validity check – mobile phone numbers validity check. Checking mobile phone numbers for validity promises to be easier than other validity checks.

python regex mobile phone validity check

 

To understand the code I will be using, I recommend you read the following earlier blog posts that discusses python regex syntax and methods: “How to find a match when you are dating floats" which is an introduction to python regex coached in story form, and “The Big Advantage of Understanding Python Regex Methods” which discusses three methods you will always use when looking for matches in python regex. At least, you will use one of the three.

Now that we have the fundamentals out of the way, let’s start coding.

Now the rule I will be using for valid mobile numbers are that: A valid mobile number is a ten digit number starting with a 7, 8, or 9. Simply that. I know you can do it on your own after reading the two earlier posts above. I know you can.

But would you like to see my implementation? Here it is below with explanations. The embedded python interpreter would ask you to input a mobile number that would be used for validation.

The really interesting part I think I should explain is the pattern. Other parts of the code should be clear to you but if they are not clear, check the links above. Now for the pattern (see line 3) I first stated that it starts with either a 7, 8, or 9 using a set notation for python regex, ^[789]. Then after the starting digits, there should be 9 digits after that and only 9 digits after, nothing more nothing less. This notation nails it: \d{d}$ with the $ signifying that the last digit is the end of the pattern. That’s that.

Note: In mobile numbers there are other rules like adding a + before the number or an 0. For the sake of simplicity in this post and helping you to get a hold on the fundamentals, I restricted my pattern to only the rule that it is a ten digit number starting with 7, 8, or 9. If you want to validate for additional rules, then experiment with python regex and send me your code about what you did. I would be happy to take a look at it. Remember, programming is all about being creative in problem solving.

Happy pythoning.

Python Sort and Sorted Functions Explained

Sorting in programming is very important. It can improve the efficiency of your code. For example, if you have to search for an item in a collection, the ability to sort the items beforehand reduces the computation that has to be done by the search algorithm, thereby increasing efficiency. That is why understanding how to sort is very important. Python provides two functions for sorting: the python sort and sorted functions. I will describe both functions in this post. Also, I will discuss their similarities and differences, and using examples show you how to use them effectively.

python sort and sorted functions

 

So, let’s start with the first method, the python sort method.

What is the python sort method?

This method is made only for the list data type. It sorts items in place and uses lexicographic order. The items in the list must be able to compare equal and if they do not the sorting fails and raises an exception. The list might be left unstable if this happens. The syntax for the python sort method is: sort(*, key=None, reverse=False). The keyword argument, key, refers to the comparison function that could be used for sorting and the default is None. The reverse argument could be switched between True and False to state whether the sorting should be done in ascending or descending order. The default is False i.e ascending order. The method returns None which means the list is sorted in place.

See this post for a refresher on how sorting using lexicographic order is done.

Let’s give examples of sorting that is done without specifying the key and reverse arguments. We will deal extensively with those two after the sorted function is explained.

The list above, items, has both strings and int values. Since strings and ints cannot compare equal for ‘<’ comparison, the code returns a TypeError. Note this please when sorting.

In the code above the list, items, is a list of numbers and on calling python list sort they are correctly sorted in ascending order, the default. Notice that the list, items, is sorted in place and returns None.

Now, let’s illustrate the second sorting function, the python sorted function.

What is the python sorted function?

The sorted function is a built in function that sorts any iterable in python. The syntax of the sorted function is: sorted(iterable, *, key=None, reverse=False). Unlike the python sort method which acts only on lists, the sorted function can accept lists, dictionaries, strings, tuples, sets etc. It accepts anything that is an iterable. The key and reverse arguments are the same as for sort method and they will be explained below. The python sorted function returns a sorted iterable.

Now for some examples using the same list, items, I used for the sort method.

Now let’s explore the differences and similarities between the two functions.

First, the difference between python sort and sorted functions.

The fundamental difference between both of them is that sort modifies the list in place, while sorted returns a new sorted iterable. So, if you want something that is optimized for lists, just use the python sort method and you are good to go. But if you want to sort an object that is not a list, then you have python sorted function at your convenience.

The similarities between python sort and sorted functions

The similarities between both functions are based on their keyword arguments: key and reverse. The key and reverse arguments for both functions work similarly and can be interchanged. These two keyword arguments give both functions their power so I will take time to explain each of them in turn.

The key argument in sort and sorted.

The key argument, when present, specifies how the comparison is to be done. The key argument is supposed to be a function that takes a single argument and returns a key for the python sort or sorted functions.

Most times when people have items that are lists of lists, they would want to sort based on one of the indices in the list. This is where the key function really comes in. Let’s take a list of tuples for example, of names and ages, and sort based on the ages. This will demonstrate how the key argument can be used. Please I used a list and the sorted function for the examples that come next. You can use any iterable of your choice and either sort or sorted; you will get the same results.

Notice that the youngest person now comes first, followed by the second youngest and then the next etc. So, we specified the key using a lambda function and that the key to use should be the index 1 for the items in the list and index 1 specifies the age. So, we’re sorting the python list based on its indices. Notice though that the list is not efficiently sorted. It was sorted by ages but the names for the same ages are out of order. We will come to that later.

See the following post if you want a refresher on lambda expressions as used in the code above.

Now a list is a built-in data type. Can we do the same sorting on custom objects we created? Yes, we can. Let’s take an example.

In the code above we created a Person class and all instances of Person have a name and age. Then in the driver code, from line 16, we created a list of Person instances and then sorted the list using a lambda expression with the age as the key. I want you to study this code very well and see that we could sort based on specific attributes of objects just as we did for native data types. It shows you the powerful capabilities of python as a language. We can even sort python objects of any type.

But the sorting is not yet efficient. The names are not in alphabetical order; just the same efficiency problem for the first sorted list. So let’s make the sorting efficient.

Please compare the output of the code below with that of the code above.

You will notice in the code above that it is now optimized. Initially, we were able to sort correctly for ages but when two Persons have the same age their names were not sorted. So, I added a little tweak to the lambda function so as to sort first for ages and then for names. I modified the statement in the lambda function to: key=lambda x : str(x.get_age())+x.get_name(). What the code says is to tell sorted to first sort by age with the key cast to a string to make it compare equal to name, and then after sorting by age, then sort by name.

It’s now elegant and more efficient, not so? It’s fun. That’s python programming.

Now we have been dealing with an iterable that has some order to it. What if we have a dictionary, an iterable, that has no order to it. How can we sort a dictionary by value or sort a dictionary by key in python?

First note that to sort a dictionary you only use the python sorted function. And by default it sorts the dictionaries by keys. For example, taking the key, value pairs of fruits below when we call the sorted function on it, it sorts the dictionary by the alphabetic order of the names of the fruits.

This is a well done sort of dictionary by keys. Notice that when I called sorted the iterable I used is fruits.items() instead of fruits. This is because I wanted to get a view into both the keys and values on the output. If I had used fruits only, then it would have given me a list of only the keys.

Compare this code and the code before it and see for yourself how the output using fruits as iterable is different from that using fruist.items().

So, what if I want to sort the dictionary by values in python. That is where using the key argument comes in. From the ordering of the view given by fruits.items(), which are tuples of (key,value) pairs, what I do is modify the lambda function to catch only the value which is the index 1 in each tuple. So, just study the below code.

What I modified in the code is to insert the expression for the key using a lambda expression and then make it refer to the value index in the tuple.

So, you now know how to sort a dictionary by key and how to sort a dictionary by value in python.

Now for the second keyword argument, the reverse argument.

The reverse argument for sort and sorted functions

The reverse argument has Boolean values. When the value is True, you are asking the sort or sorted function to arrange the outcomes in descending order. When it is False, the default, you are asking it to arrange them in ascending order. It’s as simple as that.

Now, for everything we use examples. So, let’s take examples. We’ll use our initial list of names and ages and sort by ages in ascending order and then descending order.

First, ascending order, the default, and later in descending order.

Notice that to change from ascending to descending order I only changed the reverse argument value from False to True. That’s it.

So, I believe you have all you need to do sorting in python. Experiment to your heart’s delight.

Happy pythoning.

Classes For Graphs and Directed Graphs In Python: Graph Theory

In computer science and mathematics, graphs are ubiquitous. They are just everywhere. We use graphs to solve a lot of problems that involve relationships. Since 1735 when the Swiss Mathematician, Leonhard Euler, used what we now know as graph theory to solve the Seven Bridges of Königsberg problem, graphs have become a brand name of sorts. That is why I decided to write a post on graphs and explore graphs in subsequent posts.

python graphs and directed graphs

 

What are graphs?

In simple terms, graphs are structures used to represent the relationship between objects (called vertices or nodes) where two objects (or nodes) have an edge connecting them if they are related. Diagrammatically, they are depicted with a set of dots or circles for the objects or nodes, and related objects are joined by lines called edges.

The graph below has 6 nodes or vertices, and 7 edges.


 

The edges of a graph may be directed or undirected.

I will be writing code for both directed and undirected graphs. What made me attracted to writing code on graphs was because they are used in every area of life. From scientists to businesses, graphs are used to model solutions to problems.

So, let’s start with writing classes for graphs and we will implement them.

First, we’ll create a class for a Node and an Edge.

A node is just an object in a graph. One attribute every object has is a Name. So, we’ll give our node a name attribute to start with. Here is the code for the Node class.

    
class Node(object):

    def __init__(self, name):
        ''' assumes name a string '''
        self.name = name

    def get_name(self):
        return self.name

    def __str__(self):
        return self.name

Every instance of a Node, as you can see from the code, has a name and each instance has a method, get_name, which you can use to retrieve the name.

An edge is a relation connector between objects. If two objects are connected to each other by a relationship, they will have an edge between them. Edges can be directed or non-directed. Let’s model the Edge class to start with.

    
class Edge(object):

    def __init__(self, src, dest):
        '''assume src and dest are nodes '''
        self.src = src
        self.dest = dest

    def get_source(self):
        return self.src

    def get_destination(self):
        return self.dest

    def __str__(self):
        return self.src.get_name() + '-->' + \
            self.dest.get_name()

From the code you can see that each instance of an Edge has a source node, self.src, and a destination node, self.dest. On creation of a node the source and destination nodes have to be passed as arguments to the constructor, __init__. Then I added a special method for representing an Edge as a string of source and destination nodes, __str__(). This would make for easy printing.

Now that we have the Node and Edge classes, let us go on to model the directed graphs and undirected graphs.

Directed graphs in Python code

A directed graph is a graph in which edges have orientations. The relationship in directed graphs goes one-sided and never both way. The edges are represented by arrows.

A simple class for a directed graph might be written in the following way:

    
class Digraph(object):
    # nodes is a list of the nodes in the graph
    # edges is a dict mapping each node to 
    # a list of its children 
    def __init__(self):
        self.nodes = []
        self.edges = {}

    def add_node(self, node):
        if node in self.nodes:
            raise ValueError('Duplicate Node')
        else:
            self.nodes.append(node)
            self.edges[node] = []

    def add_edge(self, edge):
        src = edge.get_source()
        dest = edge.get_destination()
        if not (src in self.nodes and dest in self.nodes):
            raise ValueError('Node not in graph')
        self.edges[src].append(dest)

    def children_of(self, node):
        return self.edges[node]

    def has_node(self, node):
        return node in self.nodes

    def __str__(self):
        result = ''
        for src in self.nodes:
            for dest in self.edges[src]:
                result = result + src.get_name() + \
                    '-->' + dest.get_name() + '\n'
        return result[:-1] # remove last newline

The class, Digraph, represents a class for objects of a directed graph. If you look at the constructor, we are representing all the nodes in the graph as a list, while the edges are represented by a mapping of nodes to child nodes which mapping goes only one way. Therefore, a dictionary data structure was used for this mapping. To have a graph, it needs nodes. To add a node, we use the method add_node that takes a node as its sole argument. To add a node, we need first to check if the node already exists in the list of nodes and if it does, the method raises a ValueError. If not, it then appends the node to the list of nodes in the graph and then creates a mapping for that node with its values as an empty list that would later be populated when edges are added. To add an edge to the graph, we use the add_edge method. We first initialize the source and destination nodes of the edge, and before adding the edge we check that the source and destination nodes are already in the list of nodes for the graph. If they are not in the list, we raise a ValueError exception. If no exception is raised, a directed edge is created with the source, src, mapped to its value, the destination node, dest. The class also has complementary methods, children_of, that would bring out a list of the nodes connected to that node as source, and also another method, has_node, that returns a Boolean value after evaluating whether the node in question is in the list of nodes for the graph.

That sums up our directed graph, Digraph, class. Now, let’s show that the code works. Let’s implement it by creating an instance of a directed graph, or Digraph. The Digraph instance I will be creating will be based on the directed graph below with 5 nodes and 6 edges. 

python directed graph

 

The only new code is the driver code that creates a Digraph instance. For the sake of brevity, I would recommend that you read the driver code starting from line 67 down to line 106. It is really an exciting code. I hope you appreciate it.

Now the question I asked myself is: why create a class for a graph and not just write code direct? This is because I would be reusing the code in the future. So, we will be coming back to this code for solving problems involving graphs in the future. Maybe you could bookmark this code for the Digraph class. You can download the file here, directed_graph.py.

Undirected graphs or simple graphs in Python code.

An undirected graph or graph for short is a connection between a pair of nodes using their edges. The edges can go both ways which distinguishes it from directed graphs that have orientations.

Now while writing the code for undirected graphs, I ran into a dilemma about inheritance. I was stuck between which graph should inherit from which. Should a directed graph inherit from a undirected graph or should it be vice versa? I decided that it was best for a graph to inherit from a directed graph. This was because instances of graphs can substitute for instances of digraphs and still add one more behavior by making the relationship go the other way. But instances of digraphs cannot stand as substitutes for instances of graphs; digraphs relationship goes only one way. Therefore, I decided to make digraph the superclass and graph the subclass.

Now, this is the code for the class graph.

    
class Graph(Digraph):

    def add_edge(self, edge):
        Digraph.add_edge(self, edge)
        rev = Edge(edge.get_destination(), edge.get_source())
        Digraph.add_edge(self, rev)                                            

Notice that the class, Graph, is inheriting from Digraph so it shares the same attributes with Digraph instances but it only overrides the add_edge method of Digraph. In the add_edge method of Graph I made it that the relationship can go both ways i.e every node in a relationship or edge is both a source and destination node for that edge.

So, for a little implementation that creates an instance of a Graph, I will be modeling the Graph pictured below:

python graph example

 

Run the code and notice the differences between this instance of a Graph and instances of a Digraph. The driver code that creates the Graph instances starts at line 73. You can alternatively download the code and run it on your own machine, graph.py script, so you can bookmark it.

I hope to see you in the future when we begin solving problems with graphs like the traveling salesman problem. To receive updates when I post new articles, just subscribe to my blog.

Happy pythoning.

How To Reverse String In Python: 3 Techniques

After writing the post on reversing lists in python, I got a good response that I decided it would also be fine if I approach a similar concept in python: how to reverse a string in python.

python reverse string

 

One thing that makes people find this subject difficult is because python does not have a string method that is specifically built for reversing strings just like we have for lists. If there was one, that would have made our task easier because it would be optimized. So you have to be creative in looking for ways to reverse python strings. In this post, I outline three methods you can reverse strings in python as well as their timing characteristics.

First, using the python slice technique

The python slice technique, as I explained in the post about reversing lists through slicing, involves taking a slice of a sequence (and strings are also sequences) through definite start, stop, and step parameters. The slice notation in python is [start: stop : step] where start is where you want to start the slice, stop is at the index you want to stop the slice and step is the steps you take through the index while iterating through the items in the string. To reverse a string using the python slice technique, you just need to use this code on the string: string_name[ : : -1].

What the code above does is to copy the string, string_name, from the beginning to the end backwards.

Here is an example so you can see it for yourself.

Nice and easy not so? This method does not change the original string. In fact, as you must know, strings are immutable.

This is the method I prefer for reversing python strings.

Second, using the built-in reversed function.

I explained the reversed function on the post for reversing lists. But one more explanation would be succinct here. The syntax for the reversed function is: reversed(seq). As you can see, the function takes any sequence and reverses it. A string is also a sequence as I explained on the iterables post. When the string is reversed the function returns an iterator. What you do here is cast the iterator to a string. I think that is what gives this function an overhead, the process of casting the iterator to a string, otherwise it is very good and fast. You use the str.join(iterable) method of the string class to cast the iterator to a string.

Here is code that shows how to use the function and also cast it to a string.

This method is very succinct and readable across all levels of python skill.

Third, using a for loop and storing the characters backwards.

With this method all you need to do is use a for loop to iterate through the characters in the string, then you store the characters backwards in a new string. Very convenient and easy to understand.

Here is the code:

I hope you found the code to be fun.

Now, the question most persons ask is: Which is faster or which uses less system memory?

Let’s answer that question by timing the execution of each of the string reverse techniques in python and decide on which is faster or more optimized.

Timing all three techniques.

Try running the following code:

When you run it you will notice that the python slice technique took considerably lesser time than all the others while the for loop took more time. That is why I use the slice technique to reverse my strings.

There was a time when someone told me that during an interview question the interviewer told him that the slice technique is not optimized for reversing strings. I laughed. I have not yet seen anything better than the python slice technique for reversing strings. For now it is the fastest of all the methods I know of and I have outlined the three useful I use here. I use the for loop when I am lazy and just want to be unpythonic.

So, thanks for reading. I believe that now you have techniques you can use to reverse your strings. Leave a comment below if you have any. Also, do not fail to subscribe to my blog so you can be receiving useful updates the moment I post new articles.

Happy pythoning.

How To Reverse List In Python: 4 Techniques

Very often I get people asking me to write a post on how to reverse a list in python. This is because this question often comes up in interviews. So, to oblige them, I have decided to write on four good and tested ways you can reverse a list in python. I also show their timing so that you can choose the best for your needs.

 

python reverse list

The built-in python reversed function

The syntax of the built-in python reversed function is reversed(seq). You can see that it takes in any sequence as argument. Sequences are lists, strings, tuples etc. For a refresher on sequences, see my post on iterables. The function returns an iterator. Remember that an iterator is an object that has elements but in order to extract the elements you need to cast it to list or call the next method. Most times, you cast it to a list to get out the elements. But casting afterwards to a list for this method of reversing could be an overhead cost for the method although it is easy, and uses substantially less memory unless you are casting. This method is ideal for times when you are dealing with very large lists and just want to use the elements of the reversed list when needed rather than using all at once. In this instance, you would need to use a python for loop.

Let’s take an example:

You can see from the above that I had to cast the iterator from the python reversed function to a list to get out the results. That could be an overhead as we’ll see later.

The python slice technique

Slicing a sequence is one of the ubiquitous techniques you can find with python lists and sequences. The syntax of slicing is [start:stop:step] where start is the index you want to start the slice, stop is where you want to stop the slice and step is the steps you want to take when iterating through the list to create the slice. To reverse a list using the python slice technique, you just need to use this statement: list_name[::-1], which is a shorthand for saying copy the list and then walk through it backwards.

Here is an example:

The advantage of this technique is that it is adaptable for any sequence and not just for lists. Some people claim that it is not readable but I find that argument obscure. Slicing is common in python even for the beginner. The only disadvantage I see with the python slice technique is that it uses up memory if you have a large list. This is because to create the reversed list, it needs to copy the original list and then reverse it. This sequence takes up a large chunk of memory. But when you want the original list to remain unchanged, this technique is good and recommended.

The python list reverse method

The python list reverse method is a method of the list class. The syntax is list.reverse(). In my opinion, it seems to be the easiest since it is built for lists and seems to be the fastest so far. But we will consider that in the timing section below. Unlike in the built-in python reversed function, it does not create any overhead and unlike the slicing technique, it does not require large chunk of memory even when working with large lists. It is more optimized for reversing python lists.

The advantageous fact about it is that it reverses in place. But if you want to make use of the original list after reversing, then this technique is not for you.

Here is an example:

I usually use this technique whenever I am reversing lists but if I need the original, I use the slice technique. Just to make you know.

Reverse list programmatically by swapping items

Now the last method you can use is to reverse the list programmatically by swapping items in place. You can write your own code that iterates through the elements of the list and swaps the elements in place. Here is a sample code:

This code can run fast but is not optimized for large lists. It swaps the elements in place and modifies the original list. It is worse than all the python built-in methods.

Timing the methods.

Most times when we are dealing with large lists, we want something that works very fast and doesn’t use much memory. Although with the built-in timing functions in python we cannot calculate for memory usage, but we can find out how long it takes each of the techniques above to run. We will need to import the timeit module to do this.

Here is a sample code for all three built-in methods except the programmed reverse list swapping method. The swapping technique takes a longer time for large lists that is why it is excluded.

When you run the code above, you will see that the list reverse method takes the shortest time of all three methods. Overall, its running time is 12 times lesser than the reversed method. The reversed function took longer time because of the list casting overhead. If we had a for loop, it would have taken less time. The slicing technique comes second place. So, that is why I use the python list reverse method often when reversing lists.

The list reverse method works better because it has been optimized by the developers of python for lists. I believe they look to these things that is why it was made to be a method of the list class.

So, now you have the options you can choose from while reversing lists. Use any to your heart’s desire.

Happy pythoning.

The 0/1 Knapsack Problem In Python Simplified

When I wrote a post about an idea I had at a bank about making change from $5 and $8, a reader wrote me that this was the knapsack problem. I decided to research it. I discovered that although my post was not a knapsack problem, it was somewhat similar because it involved resource optimization. So, intrigued by the knapsack problem, I decided to write code on the 0/1 knapsack problem.

0/1 knapsack problem in python

 

What is the 0/1 knapsack problem?

The knapsack problem was stated using a burglar and how difficult it is being a burglar especially when you are greedy. Imagine a burglar breaks into a home carrying a knapsack with a fixed capacity. What he can steal is limited by the capacity of the knapsack. But the problem is that the items in the home have more value than what his knapsack can carry. So, he has a decision problem. What would he decide to take that would fit into his knapsack? In this problem, he cannot take pieces of items but he either chooses to take the item completely or leaves it (that is why it is called the 0/1 knapsack problem). How does the burglar decide on what to choose?

Let’s take an example of the items the burglar might find in the home. For example, the items he might need to choose from might be a clock, painting, radio, vase, book and computer with values and weight like in the table below:

Name Value Weight Value/Weight ratio
Clock 175 10 17.5
Painting 90 9 10
Radio 20 4 5
Vase 50 2 25
Book 10 1 10
Computer 200 20 10

Which items would he decide to take? We are going to analyze what options the burglar can make.

Why being greedy might not be optimal here.

The burglar has three dimensions of greed in making a choice. He would choose the best item first, then the next best, and so on until he reaches the limit of his knapsack. But what would best mean for him? Could it mean the most valuable, the least weight, or the highest value/weight ratio? Let’s give a value to the capacity of his knapsack and say he can only take items up to 20 kg. So, given a knapsack of 20 kgs, what combination of items can he take?

A stupid burglar or one in a hurry would just pick the computer and say that would give him the best value. But that is not the case. An intelligent burglar would go through his value system and decide what is best for him. But he doesn’t have all the time in the world. So, on the spur he chooses what is best based on his greed and picks accordingly. He has three dimensions of greed to choose from: based on value, based on least weight, and based on the density or value to weight ratio of the items.

I have written here a little code based on the greedy algorithm. It is named my_greed.py.To make the decision, the burglar would have to sort the items ranking them in order and picking each item until he exhausts his knapsack. You can download the code and run it on your machine and test it to see the solutions he would get for his greed.

If you run it you would get the following results. Based on values, if he picks the clock, radio and book, he would steal items that are worth 275. If he decides based on weight, he would pick the book, vase, radio and painting which are all worth 170. If he decides based on density or value to weight ratio, he would pick the vase, clock, book and radio which are all worth 255. So, the best solution for the burglar on the spur is to pick based on value where he gets a total value of 275. Unfortunately, because of greed, some burglar would just pick the computer which is only worth 200.

But choosing based on greed is not always the optimal solution. The burglar wouldn’t be able to consider all the possibilities of his choice. So, that is why I am not explaining the code for greed on this post. You need to study it to understand it yourself. I want to dwell on using the optimal solution.

The optimized algorithm for the knapsack problem.

I thought carefully about this problem and decided that the best and optimal solution for the burglar is to do the following (but he wouldn’t have all the time in the world):

1. Enumerate all the possible combination of items.

2. Remove all the combinations whose weight exceeds the capacity of the knapsack.

3. From the remaining combinations, choose any whose value is the highest.

So that is the focus of this blog. Using an optimal algorithm in getting a solution to the 0/1 knapsack problem. We will go through all the steps above one after the other using code. I hope you ride along.

First, we need to create the class for the items.


# create the items class
class Items(object):

    def __init__(self, name, value, weight):
        self.name = name
        self.value = value
        self.weight = weight

    def get_name(self):
        return self.name

    def get_value(self):
        return self.value

    def get_weight(self):
        return self.weight

    def __str__(self):
        result = f'{self.name}: Value = {self.value}. 
                          Weight = {self.weight}.'
        return result    

In the constructor, each item instance has a name, value and weight. Then there are the getters for the attributes. Then the last method is the __str__() method that specifies how to represent each object on the output.

Then the next thing to do is to build the Items into a list. With the list of items, we can create a superset later.


# build the items in a list
def build_items():
    names_list = ['Clock', 'Painting', 'Radio', 'Vase', 
                                   'Book', 'Computer']
    values_list = [175, 90, 20, 50, 10, 20]
    weight_list = [10, 9, 4, 2, 1, 20]
    collection = []
    for i in range(len(names_list)):
        collection.append(Items(names_list[i], values_list[i], 
                                              weight_list[i]))
    return collection

I created a separate list for each attribute and then in the for loop I created each item instance with their data. Then returned the list, collection.

Now that we have all data items in a list, we now need to enumerate all possible combinations of the items.


# create the powerset
from itertools import combinations, chain

def powerset(iterable):
    s = list(iterable)
    # powerset removes the empty set
    return list(chain.from_iterable(combinations(s, r) 
                         for r in range(1, len(s)+1)))

Notice that I imported the combinations and chain methods from itertools to create the powerset. It’s very easy. If you want an explanation of what was involved in creating the powerset, just read this blog post on combinations.

Then the last two activities to do is to remove all combinations whose total weight exceeds the capacity of the knapsack. When we have done that, from the remaining combinations, find out which of them gives us the maximum value because now all the remaining combinations can fit into the knapsack. This code does both activities.


# the main code
def choose_best(pset, max_weight):
    best_val = 0.0
    best_set = None
    for items in pset:
        items_val = 0.0
        items_weight = 0.0
        for item in items:
            items_val += item.get_value()
            items_weight += item.get_weight()
        if items_weight <= max_weight and items_val > best_val:
            best_val = items_val
            best_set = items
    return (best_set, best_val)

Now let me explain the code a little. The arguments to the choose_best method are powerset and max_weight, which is the weight of the sack. I then initialized best_val to capture the highest value among the combinations and best_set, to capture the best combination. Then in the outer for loop, I initialized the variables items_val and items_weight which are used as total values and total weights for each combination. Then in the inner for loop I iterate through each of the items in any chosen combination, getting their total value and total weight. Then, what I did next is check for whether the total weight, items_weight, is less than or equal to sack weight, the max_weight, and also if the total value, items_val, is greater than any other total value in previous combinations, if both cases are true, then it sets the total value, items_val, as the best value, best_val and tags this combination of items, best_set as the chosen combination. It then returns the best value, best_val, and the chosen combination, best_set, as a tuple.

Now the code to test the main code.


# code to test
def test_best(max_weight=20):
    items = build_items()
    pset = powerset(items)
    taken, val = choose_best(pset, max_weight)
    print('Total value of items taken is', val)
    for item in taken:
        print(item)

Since the weight of the knapsack is 20 kg, I set the max_weight as default of 20. You can specify something else when you are running the code. Then in the body of the function I built the items or collect the items into a list, then create the powerset of all the items. After that, I called the choose_best method to make the optimal choice or solution. Then in the final part of the code I use a for loop to print out each of the items that were in the chosen combination.

Here now is the complete program so you can run it here.

If you desire to study it in-depth and also run it on your own terminal, here is the download file, my_items.py.

That’s it. It’s fun coding the 0/1 knapsack problem. I hope you enjoyed it. I did.

You can leave comments below. Also, subscribe to my blog so that you can receive latest updates.

Happy pythoning.

Python Combinations Function – The Power To Choose

Let’s imagine this scenario. You are a fund manager who is in charge of several stocks. Your company has given you about 20 stocks to evaluate and asks you to find out what 5 stocks from the 20 you can include in your portfolio this year. You have a choice of selecting the 5 stocks which have equal probability of success. How many different selections can you make?

Ever seen a problem like this in college mathematics? Yes, it is an example of a combination problem. We see it all the time in life. In choosing what clothes to wear for the week, what combination of food to choose from a menu, or what combination of channels to watch for the week. We cannot do without combinations.

python combinations function

 

In simple terms, combinations can be defined as the number of possible arrangements you can make from a collection of items where the order of the selection does not matter. Combination is different from permutations because in permutations the order of selection matters.

Let me not bore you with the mathematical details. Let’s go straight to how python allows you to use the power of combinations.

How Python combinations work

To carry out combinations in python you need to import one function, the combinations function from the python itertools module. You can use the code: from itertools import combinations. Very simple. With that you are good to go.

The syntax for the python combinations function is: itertools.combinations(iterable, r) where iterable is the collection you want to select from and r is the number of possible arrangements you want to make from the collection. Note that r should not be greater than the length of the iterable otherwise python combinations function will return an empty object. When you call the combinations function, it returns a combinations object which is an iterator. You can cast the iterator to a list or set to extract the elements of the combination or the arrangements.

Now that the syntax is done, let’s solve the fund manager’s problem we started with.

The problem the fund manager is faced with is that out of 20 stocks he has to select 5 without order since they are all equally probable of success. How many selections or arrangements can he make?

I have included comments in the code above so you can follow along on the logic behind how it was applied. On line 1, we imported combinations from itertools module. That means we are good to go. On line 3, using range function, we created a collection or sequence of 20 items. So easy. On line 4 we called the combinations function and passed the collection or sequence as its first argument and then 5, the arrangements we are making, as its second positional argument. The python combinations function returned a combinations object which is an iterator, and in the next line we cast the iterator to a list so we can extract the items in it. But not to worry, we are not investing in any of the stocks yet, we just want to know how many selections the fund manager can make. So, on the last line we called length function on the list and it gave us the answer: 15504 possible arrangements of the stocks. I bet, the fund manager needs more than a voodoo priest to decide on what arrangement of stocks to choose.

I believe that right now you understand how the python combinations function works. But am not going to leave you without one more example. I so love this one because I use it often on a weekly basis.

For example, Michael loves eating 5 types of foods but he can only choose three of them every day. If the order he chooses each meal is not important, how does he choose. Also, how many choices can he make? This is just easy, right? Let’s do it.

It’s so easy, not so? I believe you can read and follow along with the code above. It’s one of the easiest codes I’ve written this week. If you look at the food choices, you would notice that rice stands out prominently. Well, because order doesn’t matter it makes no difference if rice is at the beginning of a choice or the end for each day. Why you get the print out is because combinations prints out the arrangements based on the order it finds the items in the sequence or collection. If you want a different order, you can sort the food list. Try it out on your machine and see.

Now, let me give you a bonus tip. The combinations function in python makes it possible for you to do a calculation that before now took a very lot of processing to carry out. That is calculating the powerset of a set. Before I discovered the combinations function, I used to calculate powerset of a set based on an algorithm that was of exponential complexity. You get what I mean? It took a lot of time but when I discovered combinations function, all that stress was put to rest.

How to calculate powerset of a set using python combinations function.

The powerset of a set, S, can be defined as the set consisting of all subsets of S, including the empty set and S itself. So, that’s the mathematical definition and that is the result we expect to have in our code.

To get the powerset of any iterable from the combinations function, we will use the following code:


from itertools import combinations, chain

def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) 
                      for r in range(len(s)+1))

Notice that this time we are not only importing combinations but also the chain function. The meat of the code lies in the last line of the powerset function. What is happening there is that using a generator expression we are creating combinations with the arrangements, r, going from 0 , 1, 2… to the length of the iterable. This makes sure we are creating arrangements for every combination of the powerset. The generator expression outputs a combination object which is an iterator. To extract the elements we have to cast it to a chain object, which is also an iterator and then cast the result of the function to a list or any other iterable. The casting to a list was done in the example below. Note that the elements will be arranged in tuples since they are combinations of sometimes more than one object. It’s so elegant. No more lengthy and time consuming code.

Let’s try it with a working example.

The code above will print out the powerset of the list, num. Cool, right?

Experiment with these functions to your heart’s delight. They demonstrate the power of python.

Happy pythoning.

Matched content