
Python Combinations Function – The Power To Choose

Let’s imagine this scenario. You are a fund manager who is in charge of several stocks. Your company has given you 20 stocks to evaluate and asks you to find out which 5 of the 20 you can include in your portfolio this year. All the stocks have an equal probability of success, so any 5 will do. How many different selections can you make?

Ever seen a problem like this in college mathematics? Yes, it is an example of a combination problem. We see it all the time in life: in choosing what clothes to wear for the week, what combination of food to choose from a menu, or what combination of channels to watch for the week. We cannot do without combinations.


In simple terms, a combination is a selection of items from a collection where the order of selection does not matter, and the number of combinations is the number of distinct selections you can make. Combinations differ from permutations because in permutations the order of selection matters.

Let me not bore you with the mathematical details. Let’s go straight to how python allows you to use the power of combinations.

How Python combinations work

To carry out combinations in python you need to import one function, the combinations function from the python itertools module. You can use the code: from itertools import combinations. Very simple. With that you are good to go.

The syntax for the python combinations function is: itertools.combinations(iterable, r) where iterable is the collection you want to select from and r is the number of items in each selection. Note that r should not be greater than the length of the iterable, otherwise the python combinations function produces no selections at all. When you call the combinations function, it returns a combinations object which is an iterator. You can cast the iterator to a list or set to extract the individual combinations, each of which is a tuple.
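To make the syntax concrete, here is a minimal sketch. The list letters is just hypothetical sample data, not something from the problem we are solving. It shows what comes out for a valid r and what happens when r is larger than the collection.

```python
from itertools import combinations

letters = ['a', 'b', 'c']  # hypothetical sample data

# Each combination is a tuple of r items, in the order found in letters.
pairs = list(combinations(letters, 2))
print(pairs)  # [('a', 'b'), ('a', 'c'), ('b', 'c')]

# r is larger than len(letters), so nothing is produced.
too_many = list(combinations(letters, 4))
print(too_many)  # []
```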

Now that the syntax is done, let’s solve the fund manager’s problem we started with.

The problem the fund manager is faced with is that out of 20 stocks he has to select 5, without regard to order, since they all have an equal probability of success. How many different selections can he make?
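A minimal sketch of that calculation could look like this. I am assuming the 20 stocks are simply represented by the numbers 0 to 19, and the variable names are only illustrative.

```python
from itertools import combinations

stocks = range(20)  # 20 stocks, represented by the numbers 0 to 19
selections = combinations(stocks, 5)  # an iterator over every 5-stock selection
selection_list = list(selections)  # cast the iterator to a list
print(len(selection_list))  # 15504
```

If all you need is the count, math.comb(20, 5) in Python 3.8 and later gives the same number without building the list.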

I have included comments in the code above so you can follow the logic. On line 1, we imported combinations from the itertools module. That means we are good to go. On line 3, using the range function, we created a sequence of 20 items. So easy. On line 4 we called the combinations function and passed the sequence as its first argument and then 5, the size of each selection, as its second positional argument. The python combinations function returned a combinations object which is an iterator, and in the next line we cast the iterator to a list so we can extract the items in it. But not to worry, we are not investing in any of the stocks yet; we just want to know how many selections the fund manager can make. So, on the last line we called the len function on the list and it gave us the answer: 15504 possible selections of the stocks. I bet the fund manager needs more than a voodoo priest to decide which selection of stocks to choose.

I believe that right now you understand how the python combinations function works. But I am not going to leave you without one more example. I so love this one because I use it every week.

For example, Michael loves eating 5 types of foods but he can only choose three of them every day. If the order in which he chooses each meal is not important, how does he choose? Also, how many choices can he make? This is just easy, right? Let’s do it.
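Here is a small sketch of Michael’s problem. The food names are my own assumption for illustration; any five items would do.

```python
from itertools import combinations

# Hypothetical food list; the names are assumptions for illustration.
foods = ['rice', 'beans', 'yam', 'plantain', 'spaghetti']

# Every way of choosing 3 foods out of the 5, order ignored.
choices = list(combinations(foods, 3))
for choice in choices:
    print(choice)

print(len(choices))  # 10
```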

It’s so easy, not so? I believe you can read and follow along with the code above. It’s one of the easiest pieces of code I’ve written this week. If you look at the food choices, you would notice that rice stands out prominently. Well, because order doesn’t matter it makes no difference if rice is at the beginning of a choice or the end for each day. The output comes out in that order because combinations emits the selections following the order in which it finds the items in the sequence or collection. If you want a different order, you can sort the food list. Try it out on your machine and see.

Now, let me give you a bonus tip. The combinations function in python makes it easy to do a calculation that used to take me a lot of code to carry out: calculating the powerset of a set. Before I discovered the combinations function, I used to calculate the powerset of a set with a lengthy hand-written algorithm. A powerset of n items always has 2^n subsets, so the work remains exponential in n whichever way you do it, but with combinations the code becomes short and the heavy lifting happens inside itertools. When I discovered the combinations function, all that stress was put to rest.

How to calculate powerset of a set using python combinations function.

The powerset of a set, S, can be defined as the set consisting of all subsets of S, including the empty set and S itself. So, that’s the mathematical definition and that is the result we expect to have in our code.

To get the powerset of any iterable from the combinations function, we will use the following code:


from itertools import combinations, chain

def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) 
                      for r in range(len(s)+1))

Notice that this time we are not only importing combinations but also chain. The meat of the code lies in the last line of the powerset function. Using a generator expression, we create the combinations for every size r, going from 0, 1, 2… up to the length of the iterable. This makes sure we produce every subset in the powerset. Each item the generator expression yields is a combinations object, which is itself an iterator. chain.from_iterable then joins all of these iterators into a single chain object, which is also an iterator, so to see the elements you cast the result of the function to a list or any other container. The casting to a list was done in the example below. Note that each subset comes out as a tuple, and the empty subset as an empty tuple. It’s so elegant. No more lengthy and time consuming code.

Let’s try it with a working example.
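A minimal working sketch, repeating the powerset function so that the snippet runs on its own. The contents of the list num are an assumption for illustration.

```python
from itertools import chain, combinations

def powerset(iterable):
    s = list(iterable)
    # One combinations iterator per subset size, joined end to end.
    return chain.from_iterable(combinations(s, r)
                               for r in range(len(s) + 1))

num = [1, 2, 3]  # hypothetical sample list
result = list(powerset(num))
print(result)
# [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
```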

The code above will print out the powerset of the list, num. Cool, right?

Experiment with these functions to your heart’s delight. They demonstrate the power of python.

Happy pythoning.

First Walking Microscopic Robots (Nanobots) To Change The World

It has been said several times that the future of nanoscale technology with nanobots is immense, and each day researchers continue to expand it. Recently, in a first of its kind, a Cornell University-led collaboration has manufactured the first microscopic robots that can walk. The details read like a plot from a science fiction story.


The collaboration is led by Itai Cohen, professor of physics, Paul McEuen, the John A. Newman Professor of Physical Science – both in the College of Arts and Sciences – and their former postdoctoral researcher Marc Miskin, who is now an assistant professor at the University of Pennsylvania. The engineers are not new to producing nanoscale creations. To their name they already have a microscopic nanoscale sensor along with graphene-based origami machines.

The microscopic robots are made with semiconductor components that allow them to be controlled and made to walk with electronic signals. Each robot has a brain, a torso, and legs. They are 5 microns thick, 40 microns wide, and 40-70 microns in length. A micron is 1 millionth of a metre. The torso and the brain were the easy part. They are made of simple circuits manufactured from silicon photovoltaics. But the legs were completely innovative and they consist of four electrochemical actuators.

According to McEuen, the technology for the brains and the torso already existed, so they had no problem with it except for the legs. “But the legs did not exist before,” McEuen said. “There were no small, electrically activatable actuators that you could use. So we had to invent those and then combine them with the electronics.”

The legs were made of strips of platinum. They were deposited by atomic layer deposition and lithography, with the strips being just a few dozen atoms thick. Each strip is then capped on one side by a thin layer of titanium. So, how did they make these legs walk? By applying a positive charge to the platinum. When this is done, negative ions from the surrounding solution are adsorbed onto the exposed surface of the platinum and neutralize the charge. These ions force the exposed platinum to expand, which makes the strip bend. Because the strips are ultrathin, they can bend sharply without breaking. To enable three dimensional motion control, rigid polymer panels were patterned on top of the strips. The panels were made to have gaps, and these gaps let the legs function like knees or ankles, so that the legs bend in a controlled manner and generate motion.

A paper describing this technology titled: “Electronically integrated, mass-manufactured, microscopic robots,” has been published in the August 26 edition of Nature.

The future applications of this technology are immense. Since the electronically controlled microscopic robots are about the size of a paramecium, one day when they are more sophisticated, they could be inserted into the human body to carry out functions like cleaning up clogged veins and arteries, or even analyzing the human brain. Also, this first production will become a template for even more complex versions in the future. This initial microscopic robot is just a simple machine, but imagine how sophisticated and computationally complex it will be when it is fitted with complicated electronics and onboard computers. Furthermore, producing the robots does not take much in terms of time and resources because they are silicon-based and the technology already exists. So we could see mass-produced robots like this being used in technology and medicine to the benefit of the human race. In fact, the benefits are immense when one calculates the economics involved.

“Controlling a tiny robot is maybe as close as you can come to shrinking yourself down. I think machines like these are going to take us into all kinds of amazing worlds that are too small to see,” said Miskin, the study’s lead author.

The frontiers of nanobot technology are expanding by the day. With mass-produced robots like these, I see a solution in the offing for various medical and technological challenges. This is an innovative nanobot.

Material for this post was taken from the Cornell University Website.

Python Map Function And Its Components

Very often, we want to apply a function to an iterable without using a for loop. We saw an example in the python reduce function but the python reduce function doesn’t really fit what we want to do because the reduce function successively accumulates the results into a single value. We want the function to be applied to each element of the iterable and each result to be stored separately. To do that, we would use a python map. It is just the right function for the job. In this blog post, I will show you how to use a python map and also its close relative, the python starmap, along with usage.


What is a python map.

A python map is just a function that takes a function and one or more iterables and applies the function to the elements of the iterables. If there is more than one iterable, it applies the function to the corresponding elements of the iterables in succession. If the iterables are of different lengths, it stops at the end of the shortest iterable. The result of a python map function is an iterator object. That means to get out the results you have to apply another function to the object. Most times, you would cast it to a list or set.

The syntax of the python map function is map(function, iterable, ...) where function is the function you want to apply to the items of the iterable and iterable is the object which contains the items.

Visit this link if you want to refresh yourself on iterators, and this other one on python iterables. They are important concepts in python. I explained them in depth.

Now, let’s take some examples.

We’ll show examples using a single python iterable and then when more than one python iterable is used. First using a single python iterable.

Supposing we have a list of numbers and we have a function that raises a given number to the power of 3. We could use python map to apply the function to each of the items in the list of numbers.
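A minimal sketch of that idea follows. The names power_func, num_list and items_raised come from the discussion, while the sample numbers are my own assumption.

```python
def power_func(x):
    return x ** 3  # raise the number to the power of 3

num_list = [1, 2, 3, 4, 5]  # hypothetical sample numbers
items_raised = map(power_func, num_list)
print(items_raised)  # a map object, something like <map object at 0x...>

print(list(items_raised))  # [1, 8, 27, 64, 125]
```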

You will notice from the code above that I used the python map function to apply power_func to each of the items of the num_list in line 5. The first time we printed out the object, items_raised, what we get is a map object. The map object is an iterator. So, to get out the elements in the iterator we cast to a list in line 8 and it then extracted each of the items raised.

Now let’s show an example with more than one iterable. This time, we will use two iterables and add their items together.
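Here is a short sketch with two lists of equal length. The function name adder and the list names follow the discussion, but the values in the lists are assumed for illustration.

```python
def adder(x, y):
    return x + y

num_list1 = [1, 2, 3, 4, 5]       # hypothetical sample values
num_list2 = [10, 20, 30, 40, 50]  # hypothetical sample values
summed = map(adder, num_list1, num_list2)
print(list(summed))  # [11, 22, 33, 44, 55]
```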

What we did above is to provide two iterables, num_list1 and num_list2, to the python map function, and then use the function, adder. What map does is take the items at corresponding indices and pass them to adder which adds them together and then provides the result to the map iterator. Then using list function, we extract each of the elements in the map iterator which is then printed out in line 7.

There is also another case I want you to consider if using more than one iterable and the iterables are not of equal length. What a python map does is that it applies the function to the items of the iterable successively until it comes to the end of the shorter length iterable and then it stops. Let’s take an example, this time letting num_list2 be longer than num_list1.
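A sketch of the unequal-length case, again with assumed sample values. Here num_list2 has two more items than num_list1.

```python
def adder(x, y):
    return x + y

num_list1 = [1, 2, 3, 4, 5]               # 5 items
num_list2 = [10, 20, 30, 40, 50, 60, 70]  # 7 items

# map stops as soon as the shorter list runs out, so 60 and 70 are never used.
summed = list(map(adder, num_list1, num_list2))
print(summed)  # [11, 22, 33, 44, 55]
```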

You can see this time that it stops short at the items at index 4 in both lists and ignores the rest of the items in num_list2 because num_list2 is longer.

One thing you need to know is that the python map function falls short when the items of the iterable are tuples whose members you want spread across several parameters. This is because map passes each item to the function as a single argument; it does not unpack tuples. But not to worry, we have a close relative of map, the python starmap function, that helps us to deal with tuples as elements.

What is the python starmap function?

Just like the python map function, the python starmap function returns an iterator based on the operation of a provided function to the elements of an iterable but this time it works when the elements are tuples. Since you know the basic concepts behind python maps, it also applies to python starmaps. The python starmap function is included with the itertools module. So to use it, you first need to import it from the itertools module.

The syntax for starmap function is itertools.starmap(function, iterable) where function is the function carrying out the operation on each of the elements of the iterable. The elements of the iterable are arranged in tuples.

As an illustration, let’s take an iterable of tuples for example.
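A minimal sketch of starmap with two-item tuples. The function name power follows the discussion, and the list of pairs is an assumption for illustration.

```python
from itertools import starmap

def power(x, y):
    return x ** y

pairs = [(2, 3), (3, 2), (4, 2)]  # hypothetical (base, exponent) tuples
raised = starmap(power, pairs)    # each tuple is unpacked into x and y
results = list(raised)
print(results)  # [8, 9, 16]
```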

As you can see from above, in line 1 we first imported the python starmap function from itertools. Then we defined a function, power, that takes in two arguments and raises the first argument to the power of the second argument at lines 3 and 4. Then at line 7 we used the starmap function to apply the power function to each of the elements of the iterable, this time a list, which are tuples. Python starmap unpacks the tuple when sending them to the power function such that the first item in the tuple becomes bound to x and the second item of the tuple becomes bound to y, and then the function is applied to them such that x is raised to power y and the result is added to the starmap object. Then in line 8 we call list function on the starmap object which is an iterator to extract each of the items in the iterator. Then finally, we print them out.

We can use an iterable with a tuple that has any number of items as long as the function to which they would be applied can accept that number of arguments when the tuple is unpacked by the python starmap function. Let’s take another example of a starmap being used with an iterable that has a tuple of three items.
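Here is one way the three-item case might be sketched. The names is_pythagoras and triangles follow the discussion, while the sample triangles are my own assumption, each with the longest side last.

```python
from itertools import starmap


def is_pythagoras(a, b, c):
    # True when the squares of the two shorter sides sum to the square of c.
    return a ** 2 + b ** 2 == c ** 2


triangles = [(3, 4, 5), (5, 12, 13), (2, 3, 4), (8, 15, 17)]
results = list(starmap(is_pythagoras, triangles))
print(results)  # [True, True, False, True]

for triangle, obeys in zip(triangles, results):
    if obeys:
        print(triangle, 'obeys the Pythagoras rule')
```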

In the code above, the function, is_pythagoras, is based on the Pythagoras rule that in a right-angled triangle the sum of the squares of the two shorter sides is equal to the square of the longest side. What the is_pythagoras function does is take three positional arguments and check whether they satisfy the Pythagoras rule. If they obey the Pythagoras rule, it returns True but if not it returns False. Then in line 9 we created a list of tuples, each with 3 items representing the sides of a triangle, with the third item being the longest side. Then we passed the is_pythagoras function and the triangles list together to the starmap function to check which triangles obey the Pythagoras rule. You will notice that line 10 produces a list having either True or False as entries. Then in line 11 to 14, we checked which of the entries in the list has the True value and then printed the corresponding entry from the triangles list of tuples as the tuple obeying the Pythagoras rule.

I hope you enjoyed yourself as I did. These functions show you the powerful abilities of python. Use it to your enjoyment. To keep receiving updates from me on how to use python, you can subscribe to my blog. Thanks for reading.

Happy pythoning.

Python List Comprehension and Generator Expression Toolbox

Python list comprehensions (sometimes called listcomps) and generator expressions (genexps) are notations inspired by the programming language Haskell. Their aim is to make code more compact and often faster, provided the author does not make the code unreadable. Many a programmer has found these two notations extremely useful when they want to write pythonic code.


To give you an idea behind the inspiration of python generator expressions and list comprehensions along with the syntax for forming them, let’s take a python for loop that iterates through a list of items and appends those items to another list based on a Boolean expression.
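A minimal sketch of such a for loop, using the same fruits data that the list comprehension version works on:

```python
fruits = [(1, 'mango'), (2, 'apple'), (1, 'orange'),
          (1, 'pineapple'), (2, 'melon'), (1, 'banana')]

weighted_list = []
for item in fruits:
    if item[0] == 1:                   # keep only fruits with a weight of 1
        weighted_list.append(item[1])  # store the fruit name
print(weighted_list)  # ['mango', 'orange', 'pineapple', 'banana']
```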

The for loop above iterates through a list where fruits are weighted 1 or 2 and placed in a tuple. It then filters all fruits that have a weight of 1 and appends them to the weighted list. To introduce you to the syntax of list comprehensions and generator expressions, we will rewrite the code using list comprehension:


fruits = [(1, 'mango'), (2, 'apple'), (1, 'orange'), (1, 'pineapple'), (2, 'melon'), (1, 'banana')]
weighted_list = [ item[1] for item in fruits if item[0] == 1]
print(weighted_list)

Compare the two outputs and you will see that they generate the same lists. So, now that you have seen a live demonstration of how a python list comprehension is written, let me explain the syntax of python list comprehensions and generator expressions.

Syntax of python list comprehensions and generator expressions.

The basic syntax for python list comprehension is:

[ expression for item in iterable if condition ]

The basic syntax for generator expression is also:

( expression for item in iterable if condition )

Each consists of an expression followed by a for clause, which can in turn be followed by an optional if clause; the expression is evaluated for every item that passes the condition. Notice that they both have the same syntax. The only difference is that python list comprehensions are surrounded by square brackets while python generator expressions are surrounded by parentheses. Also, a list comprehension returns the results as a list while a generator expression returns an iterator. Take note of this difference in what they return because it is very important.

If you want a refresher on what an iterator is or what iterables are, just click on the links.

So, having that syntax, let us show examples of common operations you can use with list comprehensions and generator expressions.

Common operations of list comprehensions and generator expressions.

Two common operations that these two notations perform are: (1) Performing some operation on every element of an iterable. (2) Selecting a subset of elements of an iterable that meets some condition.

Most of the operations you will perform with list comprehensions or generator expressions will fall under one of these two broad categories.

  1. Performing some operation on every element of an iterable.

    Let’s demonstrate this with python list comprehension examples and generator expression examples.

    Suppose you want to add up all the elements of a range of numbers as they are produced. You could most probably use a generator expression for a compact and optimized code. Here is how:

    
    sum_of_num = sum(x for x in range(1, 21))
    print(sum_of_num)
    

    With the code above I just added all the numbers from 1 to 20, and the output was 210. Note that since the python generator expression produces an iterator, the sum function takes each element of the iterator and adds them together. When you have an iterator whose elements need some operation performed on them, just think of generator expressions.

    We could also perform operations on elements of an iterable using list comprehension.

    
    sum_of_num = sum([ x for x in range(1, 21)])
    print(sum_of_num)
    

    Notice that I enclosed the list comprehension inside the sum function. This is because the sum function can take any iterable, and the list that the list comprehension produces is an iterable. Like the generator expression, the list comprehension produced the sum of num as 210. I want you to study both syntaxes very well and make sure you understand what I did. Now, for further clarification, let me show you the lengthier for loop that the above codes replaced.

    
    summed = []
    for x in range(1, 21):
        summed.append(x)
    print(sum(summed))
    

    You can see that the python list comprehension and generator expression look more pythonic. One liners are so beautiful. Just compare them and see for yourself.

  2. Selecting a subset of elements of an iterable that meets a condition.

    Sometimes we want to select elements of an iterable based on a condition being False or True. List comprehensions are usually the handy tool for that job. When I see problems like this, I choose list comprehensions because generator expressions, being iterators, usually need a function to help bring out the elements. But I will show you how to do this action using both.

    For example, suppose we have some numbers and we want to select only the even numbers in the list based on whether the numbers are divisible by 2 without remainder. This is how list comprehension could be written for it.

    Run the code above and see for yourself and study the syntax of the list comprehension. You should see a print out of the even numbers selected as a list. Yes, that’s what a list comprehension produces – a list. Now we could do this with a generator expression but because we don’t have a function acting on the elements selected, it doesn’t look as elegant. Remember that the generator expression produces an iterator, and that iterators implement the __next__() special method, so we would use the built-in next() function to bring out the selected items one at a time. But since they are ten in number, we would have to call next() ten times.

    You don’t expect me to call next() 10 times, do you? So you see, for occasions like this when you just need the numbers without doing any operation on them, a list comprehension would suffice.
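The selection of even numbers discussed above can be sketched as follows. I am assuming the numbers run from 1 to 20, which gives the ten even numbers mentioned, and the variable names are only illustrative.

```python
numbers = range(1, 21)  # assumed range: 1 to 20

# List comprehension: the selected items are available immediately.
evens_list = [x for x in numbers if x % 2 == 0]
print(evens_list)  # [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

# Generator expression: items come out one at a time with next() ...
evens_gen = (x for x in numbers if x % 2 == 0)
first = next(evens_gen)
second = next(evens_gen)
print(first, second)  # 2 4

# ... or all the remaining items at once by casting to a list.
rest = list(evens_gen)
print(rest)  # [6, 8, 10, 12, 14, 16, 18, 20]
```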

So, we have our two common broad operations where list comprehensions and generator expressions are usually used.

So, what are the differences between the list comprehension and generator expression?

Differences between list comprehension and generator expression.

The first difference is that while list comprehension will produce a list, a generator expression will produce an iterator. And remember from the post on iterators, they don’t need to materialize all their values at once unless you need those values.

The next difference is that if you are dealing with iterables or iterators that return an infinite stream or a very large amount of data, list comprehensions can exhaust your memory. List comprehensions are not memory friendly in this instance because they have to build the whole list in memory, and on an infinite stream they never finish. So, this is where generator expressions trump list comprehensions because they are more memory friendly, giving you data only when you need them.
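To get a rough feel for the memory difference, here is a small sketch. Note that sys.getsizeof only measures the container object itself and not the numbers inside it, so treat the comparison as an indication rather than an exact measurement. The second half uses itertools.count as an assumed example of an endless stream.

```python
import sys
from itertools import count, islice

big_list = [x * 2 for x in range(100_000)]  # all values built up front
big_gen = (x * 2 for x in range(100_000))   # values produced on demand

list_size = sys.getsizeof(big_list)
gen_size = sys.getsizeof(big_gen)
print(list_size > gen_size)  # True

# count(1) never ends, but a generator expression over it is still fine
# because only the items we ask for are ever computed.
endless_evens = (x for x in count(1) if x % 2 == 0)
first_five = list(islice(endless_evens, 5))
print(first_five)  # [2, 4, 6, 8, 10]
```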

As I explained above, you surround python list comprehensions with square brackets while python generator expressions are surrounded with parentheses. It is worth repeating. Same syntax but different environment.

As I showed you above, when you just want to select items, it might be better to use a list comprehension but when you want a function to act on the items themselves, a generator expression works better.

Python List comprehensions and generator expressions support nesting

There are times when you have a nested loop. Python list comprehensions and generator expressions also support nesting, to any level. But be careful not to nest so deeply that the code becomes unreadable.

Let’s show some nesting examples.
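A minimal nesting sketch that pairs every name with every fruit. The names and fruits here are hypothetical sample data.

```python
names = ['Ada', 'Ben']                 # hypothetical sample names
fruits = ['mango', 'apple', 'orange']  # hypothetical sample fruits

# The first for clause is the outer loop, the second is the inner loop.
pairs = [(name, fruit) for name in names for fruit in fruits]
print(pairs)
# [('Ada', 'mango'), ('Ada', 'apple'), ('Ada', 'orange'),
#  ('Ben', 'mango'), ('Ben', 'apple'), ('Ben', 'orange')]
```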

In the code above, I wanted to create a list of tuples pairing names with fruits. If a list comprehension outputs a tuple, you must surround the tuple with parentheses as I did above; without them python raises a SyntaxError. This just shows that nesting is possible in python list comprehensions and generator expressions.

Nesting can also involve the if statement but be careful that they are well arranged.
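As a sketch of nesting with if, each for clause can carry its own condition, and the clauses are read from left to right just like nested loops. The ranges and conditions below are assumptions chosen to keep the output short.

```python
# Keep only even x, and for each such x only the y values larger than it.
filtered_pairs = [(x, y)
                  for x in range(1, 5) if x % 2 == 0
                  for y in range(1, 5) if y > x]
print(filtered_pairs)  # [(2, 3), (2, 4)]

# The same logic written as ordinary nested loops, for comparison.
expected = []
for x in range(1, 5):
    if x % 2 == 0:
        for y in range(1, 5):
            if y > x:
                expected.append((x, y))
print(expected == filtered_pairs)  # True
```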

So, that is the guide to list comprehensions and generator expressions. Use them with responsibility.

Happy pythoning.

5 Python Directory Handling Techniques

Directories and files are crucial to a programmer who wants a resource for his programs. That is why, after discussing python’s file handling methods, it is necessary to also understand python’s directory handling methods or routines. In this post, I will describe 5 routine ways one can handle directories using the methods provided in python, such as the python make directory method and the get working directory method.


To be able to run any of the commands in this post, you first need to import the os module into your interpreter. To do that you use the code: import os.

Getting the python current directory

The current directory is the directory from which the python interpreter is operating. It depends on how you launch your interpreter or your editor. To know your current working directory is easy. You just need to call the python get current working directory method, os.getcwd(). Here is an example:


import os

working_dir = os.getcwd()
print(working_dir)

The code above prints the current working directory on your terminal.

You might desire to change your current working directory. Maybe you want to do some experiment on some programs and want to run them on a directory you intend to delete later; I do that all the time. Changing the current working directory is easy with python. You use the python change working directory method, whose syntax is: os.chdir(path). It states that you have to provide a path as an argument for the directory you want to switch to. Path should be based on the path specification of your operating system. It is wise to make path a string in all cases.

An example will suffice.


import os

os.chdir('C:\\Users\\Michael\\Desktop\\')
print(os.getcwd())

Notice above that I escaped each backslash with a second backslash. This is because the backslash is the escape character in python string literals. When I ran the above code, it changed my working directory to ‘C:\Users\Michael\Desktop\’. Note that I am working on a Windows 10 computer; if you are using Unix, Linux or Mac, your paths will use forward slashes instead.
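If you want to try changing directories without depending on my Windows path, here is a small sketch that should work on any operating system. It uses a throwaway directory from the tempfile module, which is my own choice for the illustration, and switches back when done.

```python
import os
import tempfile

original_dir = os.getcwd()  # remember where we started

with tempfile.TemporaryDirectory() as scratch_dir:
    os.chdir(scratch_dir)   # switch into the throwaway directory
    inside = os.getcwd()
    print(inside)
    # Switch back before the temporary directory is deleted.
    os.chdir(original_dir)

print(os.getcwd() == original_dir)  # True
```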

Creating New Directories with Python

There are occasions you want to create new directories, or what some call make new directories in python. Python can do this very easily when you use the right methods. There are two methods provided in python: the python make directory method, os.mkdir, and the python make directory recursive method, os.makedirs, which acts recursively by creating more than one directory as long as the directories do not already exist.

The syntax of the os.mkdir method is: os.mkdir(path, mode=0o777, *, dir_fd=None) where path is the name of the directory you want to create. You can leave the other keyword defaults as is because on some systems the mode parameter is just ignored and directory file descriptors, dir_fd, are not implemented.

Supposing we want to create a directory called, new_dir, we could try the following:


import os

try:
    os.mkdir('new_dir')
except FileExistsError:
    print('Directory already exists.')
else:
    print('Directory created successfully.')        

I used a try statement to handle the case where the directory already exists. This is because a FileExistsError is raised if the directory already exists. That gives peace of mind.

The os.makedirs method is also used to create directories but it does this recursively. That means it creates any missing intermediate directories along the path as well as the final one. The syntax of the method is: os.makedirs(name, mode=0o777, exist_ok=False). name is the path of the directory you want to create. It has a different argument though from the python make directory method that is worth mentioning. It has an exist_ok keyword argument which you can set to True if you do not want a FileExistsError to be raised when the target directory already exists. Let’s use an example:


import os

try:
    os.makedirs('new_dir\\second_dir\\third_dir', exist_ok=True)
except FileExistsError:
    print('Directory already exists.')
else:
    print('Directory created successfully.')        

If you run the above on your machine, it creates the directories second_dir and third_dir (remember new_dir has already been created) and prints: ‘Directory created successfully.’ os.makedirs creates the missing subdirectories whether or not the parent directories already exist. Because I set the exist_ok argument to True, running the code a second time, when third_dir itself already exists, still succeeds instead of raising a FileExistsError. The exist_ok argument comes in handy.
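Here is a portable sketch of the same idea. It builds the path with os.path.join instead of hard-coded backslashes and works inside a temporary directory, both of which are my own choices for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    nested = os.path.join(base, 'new_dir', 'second_dir', 'third_dir')

    os.makedirs(nested)                 # creates all three levels at once
    created = os.path.isdir(nested)

    os.makedirs(nested, exist_ok=True)  # already there, but no error

    try:
        os.makedirs(nested)             # exist_ok defaults to False
    except FileExistsError:
        raised = True
    else:
        raised = False

print(created, raised)  # True True
```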

How to remove a directory in python

The methods under this category come in handy when you no longer need a directory. You can programmatically remove directories using python with the python remove directory method, os.rmdir, and the python remove directory recursive method, os.removedirs. The latter removes directories recursively. I didn’t tell you in the earlier post on file handling, but you can also remove files if you want to using the python remove file method, os.remove. I will describe all three here.

To remove a single directory, you use the python remove directory, os.rmdir, method. The syntax of the method is: os.rmdir(path, *, dir_fd=None). Path is the name of the directory you want to remove. With this method, you cannot remove directory trees or directories that are not empty, otherwise it will raise an OSError exception. If the directory does not exist, it will raise a FileNotFoundError, which is a subclass of OSError.

When I wanted to remove the new_dir created earlier with child directories like this:


import os

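try:
    os.rmdir('new_dir')
except FileNotFoundError:
    # Must come before OSError, which is its parent class.
    print('Directory does not exist.')
except OSError:
    print('Directory not empty.')
else:
    print('Directory successfully removed.')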

It printed out: ‘Directory not empty.’ That means I cannot remove a directory with child directories using this method. Not to worry, the second method, the python remove directory recursive method, os.removedirs, can do that.

The syntax of the os.removedirs method is: os.removedirs(name), where name is the path of the leaf directory you want to remove. After removing the leaf, it works back up the path and removes each parent directory in turn, stopping quietly at the first parent that is not empty.

In the example below, I wanted to remove all the directories and sub-directories we created when making directories.


import os

try:
    os.removedirs('new_dir\\second_dir\\third_dir')
except OSError:
    print('Directory not empty.')
else:
    print('Directory successfully removed.')        

It ran successfully and printed: ‘Directory successfully removed.’ To ensure it doesn’t raise an OSError exception, you should make sure that the leaf directory, third_dir, is empty i.e it doesn’t contain any files.

Now, let’s show the bonus method on how to remove a file.

The method for removing files is the python remove file method, os.remove. The syntax is: os.remove(path, *, dir_fd=None), where path is the name of the file. The path can be absolute or relative to the current working directory. If the file does not exist, the method raises a FileNotFoundError, and on Windows it also raises an error if the file is in use. If path is a directory, an OSError is raised; use os.rmdir for directories.

In this example here, I want to remove a file that was used when we discussed the file handling methods in an earlier post:


import os

os.remove('eba.txt')
if os.path.isfile('eba.txt'):
    print('File not removed.')
else:
    print('File removed.')    

It ran successfully and printed: ‘File removed.’

How to rename a directory

We can programmatically rename a file or directory in python. There is a method for a single file or directory and a method that works on a whole path. The python rename method, os.rename, works on a single file or directory, while the python rename recursive method, os.renames, works recursively.

The syntax for the os.rename method is: os.rename(src, dst, *, src_dir_fd=None, dst_dir_fd=None) where src means the source file or directory, and dst means the new name you intend to give the source. The dst or new name should not already exist. On Windows, the operation then raises a FileExistsError; on Unix, an existing file may be replaced silently, while other cases raise an OSError or one of its subclasses.

Here is an example of usage:


import os

try:
    os.mkdir('new_dir')
    print('Directory created successfully.')
    print('Now attempting to rename it.')
    os.rename('new_dir', 'old_lady')
except FileExistsError:
    print('Directory already exists.')
except OSError:
    print('Couldn\'t rename the directory.')
else:
    print('new_dir changed successfully to old_lady.')    

In the example above, I first created a new directory, new_dir, and when it went successfully without raising an error, I then attempted to rename it from new_dir to old_lady. If old_lady already exists, it will raise an OSError exception which I would handle by printing out: ‘Couldn’t rename the directory’ but if it doesn’t exist already, the renaming would run successfully, (which happened) and then print out: ‘new_dir changed successfully to old_lady.’

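Now we can do this recursively. What if we create a directory tree with an empty leaf directory? We would use the python rename recursive method, os.renames.

The syntax of the os.renames method is: os.renames(old, new), where old is the existing path and new is the new path. It works like os.rename, except that it first creates any intermediate directories needed for the new path and, after the rename, removes the directories at the right-hand end of the old path that have become empty.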

Let’s take an example from above again. This time, we want to rename all the directories and sub-directories.


import os

try:
    os.makedirs('new_dir\\second_dir\\third_dir', exist_ok=True)
    print('Directories created successfully.')
    print('Attempting renaming of the directories created.')
    os.renames('new_dir\\second_dir\\third_dir', 'my_first\\my_second\\my_third')
except FileExistsError:
    print('Directories already exists.')
except OSError:
    print('Couldn\'t rename the directories.')
else:
    print('Renamed all three directories recursively.')                

From the above, you can see that I first created a directory tree, new_dir\second_dir\third_dir, and when it was created successfully, I attempted the recursive rename in the same try statement. os.renames creates my_first\my_second, moves third_dir there under the name my_third, and then removes the old second_dir and new_dir once they are empty. If you do not have the necessary permissions to rename the directory, the operation fails with an OSError. But if the permissions are available and the old path exists as stated, the rename succeeds and the code prints: ‘Renamed all three directories recursively.’
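If you want to check what os.renames leaves behind, the following sketch inspects the tree after the call. It runs in a temporary directory and builds paths with os.path.join; both are assumptions made here so that the sketch works on any operating system.

```python
import os
import tempfile

base = tempfile.mkdtemp()
old = os.path.join(base, 'new_dir', 'second_dir', 'third_dir')
new = os.path.join(base, 'my_first', 'my_second', 'my_third')

os.makedirs(old)
os.renames(old, new)

# The new path exists, and the emptied parents of the old path were pruned.
new_exists = os.path.isdir(new)
old_root_exists = os.path.exists(os.path.join(base, 'new_dir'))
print(new_exists, old_root_exists)
```

The old new_dir disappears because it became empty after the move. Had it contained anything else, it would have been left in place.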

You can be creative and try out your own examples to see how it will run.

How to list all the files and nested directories of a directory

I am using windows, so Linux or Unix users pardon me if my example is Windows based. If on windows you want to list the contents of a directory, you use the command ‘dir’ on the command line and it gives you a listing. You can do the same with python. Python has two methods for doing so: a python list directory, os.listdir, method and an optimized python scan directory, os.scandir, method.

It is recommended that you use the python scan directory, scandir, method for most cases, but let me show a working example of the python list directory method. The syntax of the python list directory method is: os.listdir(path='.') where path is the name of the directory whose contents you want to list. The path parameter is optional and where omitted, it defaults to the current working directory. The method returns a list of all the files and directories that are contained in the directory named path.

Here is an example:


import os

dir_list = os.listdir()
for file in dir_list:
    if os.path.isfile(file):
        print(f'{file} is a file.')
    else:
        print(f'{file} is a directory.')    

The above code first returns a list of all the files and directories in the current working directory as dir_list. Then I iterate through the list in a for loop and print out whether an item is a file or a directory. This gives you a listing that is similar to the Windows ‘dir’ command.

Now, for the optimized python scan directory method, scandir. The syntax of the optimized scan directory method is os.scandir(path='.') where path is the name of the directory. Scandir returns an iterator which yields objects that correspond to the files or nested directories in the path name. From each object yielded you can get the entry’s name and find out whether it is a file or a directory. (If you want a refresher on iterators or on python generators that yield objects, see the earlier posts on those topics.) Because these objects carry file type and attribute information, code that needs it runs faster, provided the operating system supplies this information when scanning the directory.

Since the iterator produced by the python scan directory method holds a system resource, you should release it when you are done, either by calling its close method explicitly or, better, by using a with statement, which closes it for you.

In the example below, we will list the contents of the current working directory again, but this time showing how to do it with the python scan directory method working as a generator.


import os

with os.scandir() as my_dir:
    for item in my_dir:
        if item.is_dir(follow_symlinks=False):
            print(f'{item.name} is a directory.')
        else:
            print(f'{item.name} is a file.')      

I used the with statement so that python automatically closes the iterator as soon as the block ends. You will notice that each object, item, yielded has attributes of its own. In this example, the item object has a name attribute, item.name, and an is_dir method, item.is_dir. This is because the objects are os.DirEntry objects. In the call to item.is_dir, I set the follow_symlinks parameter to False so that symbolic links are not followed: a symbolic link that points to a directory is then reported as a file rather than as a directory. Only real directories are listed as directories.

Now you have been equipped to use python’s directory handling functions. Go experiment with what you can do with them.

Happy pythoning.

7 Important File Handling Functions In Python

Computer files, or resources for recording discrete data, are ubiquitous in computing, and python programs work with them all the time. File handling in python treats files as either textual or binary files, and python itself places no limit on the size of files it can work with. In this post, we will be discussing textual files while in subsequent posts we will discuss binary files and how python handles them. Seven basic functions for handling textual files are discussed.

python file handling functions

 

The Built-in Python Open File function

The built-in python open file function is the first function you will encounter when you want to open any sort of file in python. It is used to open a file and it returns a file object. The syntax for the python open file function is open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None) but for working on textual files, we will focus on the parameters file and mode.

The file parameter to the open file function represents the pathname of the file, either absolute or relative to the current working directory. The pathname depends on the file system of the operating system. If the file is not found on the call to open file, the function raises a FileNotFoundError exception.

The mode parameter specifies the mode in which the file is to be opened. The default mode is ‘r’ which means open for reading text. Other values are ‘w’ meaning open for truncating and writing to the file, and ‘a’ meaning open for appending to the end of the file. Other modes are ‘b’ meaning open binary file and ‘+’ meaning open for updating (reading and writing). If you want to write to the file without truncating it, then use the combination mode, ‘r+’, which starts writing at the beginning of the file and overwrites what is there. If you want to write starting from the end, then use ‘a’.

Now, let’s use some examples.

I will be working with the following text file, eba.txt, that is in my working directory. The contents of the file are:

Nothing beats a plate of eba
as we generally want to eat it
but it often gets stuck in our throats
where it is eaten without a good soup
that is oily and makes the eba
which is a gelly and very hard 
to move smoothly down our throats

You can also download the file here or copy and paste it if you want to use it so you can get the same results as I did.

Now, let’s just open the file without doing any reading or writing. Those functions will come later. After opening the file, we will then close it. It is good practice to always close your files.


try:
    fobj = open('eba.txt', 'r')
except FileNotFoundError:
    print('File doesn\'t exist.')
else:
    print('File opened successfully and file object created.')
    # Close here: fobj only exists if open() succeeded.
    fobj.close()

A more pythonic way to do the above, that is, open the file resource and then close it automatically would be writing the following line of code:


with open('eba.txt', 'r') as fobj:
    print('File opened successfully.')

If the file opened successfully, the print will run but if not, you will get a stack trace about a FileNotFoundError.

So you now know how to open textual files. What remains is to do something with them while they are open. The remaining functions deal with that.

The Python Read File Function

The syntax for the python read file function is read(size=-1) where size specifies the maximum number of characters to read from the file (we are dealing with textual files here; for binary files it is the number of bytes). The default is -1, which means read all the contents of the file and return them as a single string. If size is specified, at most size characters are returned. Be careful when reading the whole file because if the file is very large, it could use up your system memory.

So, for some examples. Remember we are using the eba.txt file whose contents I posted above.

Suppose we want to read only 20 characters from the file. We will use this code:


with open('eba.txt', 'r') as fobj:
    s = fobj.read(20)
    print(s)

Our output would be:

Nothing beats a plat

Just the first 20 characters in the file. Later, I will show you how to read from any position with random access to the file.

The Python Readline function

The syntax for the python readline function is readline(size=-1) and unlike the read function, it reads and returns one line from the stream. It starts with the first line. If you specify size, at most size characters of the line are read. The end of a line is determined by the newline parameter of the python open file function. By default, the common line endings are recognized and translated to '\n'. For most uses, the default is okay.

Now, to use some examples to illustrate. Imagine we had this code to just read the first line from the text file given.


with open('eba.txt', 'r') as fobj:
    s = fobj.readline()
    print(s)

The output we will get on the terminal is:

Nothing beats a plate of eba

This is the first line of the eba.txt file.

You will notice when you run the above that an empty line is printed after the text. This is because the string returned by readline still ends with the newline character, ‘\n’, and print adds a newline of its own. You could remove that trailing newline by calling the strip function on the string object returned by the python readline function. Compare the output for the code below using strip function and that for the code above without the strip function on your machine.


with open('eba.txt', 'r') as fobj:
    s = fobj.readline()
    print(s.strip())
 

Using the strip function on the string now strips away the added newline and gives a more beautiful rendering.

The Python Readlines Method

The python readlines method is in plural because it reads multiple lines. The syntax for readlines is readlines(hint=-1) which states that the readlines function reads and returns a list of lines from the stream. The hint parameter lets you limit how much is read: no more lines are read once the total size of the lines read so far exceeds hint. The default is to read all the lines and return them as a list. Please, use this function carefully. If your file is very large, it could have a detrimental effect on your system memory. This is because to return the lines, it first needs to create a list of all the lines and this takes memory space.

An example to show how the readlines method works.


with open('eba.txt', 'r') as fobj:
    s_list = fobj.readlines()
    print(s_list)

Which gives the following list as output:


['Nothing beats a plate of eba\n', 
'as we generally want to eat it\n', 
'but it often gets stuck in our throats\n', 
'where it is eaten without a good soup\n', 
'that is oily and makes the eba\n', 
'which is a gelly and very hard \n', 
'to move smoothly down our throats']
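The hint parameter of readlines is easier to understand with a small experiment. The sketch below writes a throwaway three-line file (the file name and contents are made up for illustration) and then reads it back with a hint of 5.

```python
import os
import tempfile

# Create a small demo file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(path, 'w') as fobj:
    fobj.write('one\ntwo\nthree\n')

with open(path, 'r') as fobj:
    # Reading stops once the lines read so far exceed 5 characters in total.
    some_lines = fobj.readlines(5)

print(some_lines)  # ['one\n', 'two\n']
```

The first line has 4 characters, which does not exceed the hint, so a second line is read. After that the total is 8, so reading stops and the third line is left unread.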

It is recommended that you avoid using readlines because there are other ways to go about reading all the lines from your files without impacting on memory. One of them is to use a python for loop to iterate through the file object. This is because a python file object is already an iterable.

The above could be achieved with the following for loop code:


with open('eba.txt', 'r') as fobj:
    for line in fobj:
        print(line)

We have been reading and reading from files. Now, we want to write to files. We will now use the python write to file method.

The Python Write to File Method

The syntax for the python write to file method is write(s) which specifies writing the string, s, to the file and returning the number of characters written.

The ability to write to the stream or file depends on whether it supports writing. To make this possible, we need to specify this support when opening the file and creating a file object. This is made possible by specifying the writable mode on the open file function (the open file function was explained above). The writable modes are:

r+ Update the file i.e read and write to the file. When the write function is called, it writes the specified string, s, to the beginning of the file, overwriting the characters already there.
w It truncates the file first and then writes the string s to the file. You lose all your former file contents with this mode.
a Append the string, s, to the end of the file. It writes onto the last line. If you want it to write to a new line at the end, you need to add a newline character at the beginning of the string, s.

Now, compare the following codes on your machine and see how they run:


with open('eba.txt', 'a') as fobj:
    s = fobj.write('This line was written.')

with


with open('eba.txt', 'w') as fobj:
    s = fobj.write('This line was written.')

and with:


with open('eba.txt', 'r+') as fobj:
    s = fobj.write('This line was written.')

You will notice that the way the contents of the file, eba.txt, are written differs based on the specified mode of the open function. The python write to file method is one of the methods you will most often use when working with files.
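If you would rather not overwrite eba.txt while experimenting, here is a minimal sketch that shows the three modes side by side on a throwaway file. The helper name write_with_mode and the sample contents are made up for illustration.

```python
import os
import tempfile


def write_with_mode(mode):
    """Create a fresh file holding '0123456789', write 'abc' using the
    given mode, and return what the file contains afterwards."""
    path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
    with open(path, 'w') as fobj:
        fobj.write('0123456789')
    with open(path, mode) as fobj:
        fobj.write('abc')
    with open(path, 'r') as fobj:
        return fobj.read()


for mode in ('a', 'w', 'r+'):
    print(mode, write_with_mode(mode))
# a 0123456789abc
# w abc
# r+ abc3456789
```

Appending adds to the end, ‘w’ discards the old contents first, and ‘r+’ overwrites from the start while leaving the rest untouched.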

The Python seek function

With this method, you can change the current stream position so that the python read file or python write to file methods operate from that position instead of from the default position (the start of the file for the ‘r’, ‘r+’ and ‘w’ modes). The syntax for the python seek method is seek(offset, whence=SEEK_SET) where offset is the position you want the stream to go to, measured from the reference point whence. The default, SEEK_SET, means the start of the file. The seek method returns the new position of the stream.

For example, if you want to read the eba.txt file from the 35th byte or character in the file and then output the next 55 characters or bytes, you could change the current stream position using seek to be 35 and then do a read with size 55. Here will be the code:


with open('eba.txt', 'r+') as fobj:
    num = fobj.seek(35)
    s = fobj.read(55)
    print(s)

The output you would get from the eba.txt file is:

generally want to eat it
but it often gets stuck in ou

Showing just those 55 characters.

The last method we will consider is truncate.

The Python truncate file method

With the python truncate file method, you are able to change the size of the file. The syntax for the truncate file method is truncate(size=None) where size is the new size of the file in bytes. Where size is not specified, the current stream position is used, so everything after the current position is discarded. If size is smaller than the original file size, the file is cut down to size; if it is larger, the file is extended, and on most platforms the added area is filled with zero bytes. For the python truncate file method to be operational, the file must support updating or writing, which you ensure by opening the file in a writable mode as described above.

Like the write method, truncate changes the file on disk. It returns the new size of the file.
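Here is a minimal sketch of truncate in action. It uses a throwaway file in a temporary directory rather than eba.txt (an assumption, so that your copy of the file is left alone).

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(path, 'w') as fobj:
    fobj.write('Nothing beats a plate of eba')

# 'r+' opens the file for updating without emptying it first.
with open(path, 'r+') as fobj:
    new_size = fobj.truncate(7)   # keep only the first 7 bytes

with open(path, 'r') as fobj:
    remaining = fobj.read()

print(new_size, remaining)  # 7 Nothing
```

Opening with ‘w’ would not work for this purpose, because that mode already empties the file before truncate is called.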

So, I have given you ideas on what you can do with your files and file objects. The next post will be on how to handle python directories. Please, watch out for it. And subscribe to this blog so you can get regular updates when I post new articles.

Happy pythoning.

Utilizing Python reduce and accumulate functions as accumulators

Accumulators have a notable reputation in computing history. The earliest machines by Gottfried Leibniz and Blaise Pascal were based on the concept of accumulators. If you are familiar with your python functions, you would know that the python sum function acts as an accumulator when it comes to addition. But I would like to explain two functions in this post that you can use as accumulators for any operation. These functions are the python reduce function from the functools module and the python accumulate function from the itertools module.

python reduce and accumulate functions
 

The basic operation of these two functions is that they take a function and an iterable as arguments, and sometimes an initializer. They apply the function to the first two items of the iterable, store the result, apply the function to that result and the next item, store the new result, and so on until the final item is reached. They differ in what they return, which I will explain.

First, I will start with the python reduce function.

The python reduce function.

The syntax of the python reduce function is functools.reduce(function, iterable[, initializer]) and what it does is to apply the function to the items of the iterable from left to right, and it eventually reduces the iterable to a single value. The function returns the accumulated value of the result returned by the operation of the function that serves as its argument. The function must take only two arguments. The initializer parameter is optional. I will explain it below.

Let’s take the simplest accumulator, the sum function using a lambda function, and see how we can use it to illustrate how the reduce function works.
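The listing for this example is not reproduced here, so the following is a minimal sketch of it. It assumes the list [1, 2, 3], which is what the walkthrough describes; the variable names are illustrative.

```python
from functools import reduce

numbers = [1, 2, 3]

# The lambda receives the running total as x and the next item as y.
total = reduce(lambda x, y: x + y, numbers)

print(total)  # 6
```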

What the reduce function does here is use the lambda function to sum up the items of the iterable, in this case the list [1, 2, 3]. First, it takes 1, the first item, and binds it to x, then 2 and binds it to y, then it adds x + y, i.e 1 + 2, and binds the result, 3, to x again. Next it takes 3, the last item, and binds it to y, and adds x + y which this time is 3 + 3 which equals 6, and then it gives you the total result. So, you can now visualize how the successive addition is carried out.

You can click the following links if you want a refresher on python lambda functions or on python iterables.

As you can see from the syntax above, you can optionally supply an initializer to the reduce function. The initializer is placed before the items of the iterable in the calculation, so it is the first value bound to x. And if the iterable is empty, the initializer is returned as the default.

Let’s use an example with an initializer and see how it runs. This time we want our initializer to be 4.
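Again the listing is not reproduced here, so this is a minimal sketch of the same call with an initializer of 4. The list [1, 2, 3] and the variable names are taken from, or assumed to fit, the description that follows.

```python
from functools import reduce

numbers = [1, 2, 3]

# The third argument is the initializer; it is used before the first item.
total = reduce(lambda x, y: x + y, numbers, 4)
print(total)  # 10

# With an empty iterable, the initializer is returned as is.
empty_total = reduce(lambda x, y: x + y, [], 4)
print(empty_total)  # 4
```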

You can see from the example above that the result of the summation of the list becomes 10. This is because we used an initializer of 4. What is happening here is that when reduce runs, it first binds the initializer to x, therefore x becomes 4. Then it binds y to 1 and sums them to give 5 and binds this result to x. It then binds y to 2 in the list, the next item, sums x and y to give 7 and binds this result to x. It then binds 3, the next item in the list, to y and then sums x and y to give 10, the final result. It then returns 10.

Simple, isn’t it? Very easy and fascinating. But don’t be in a hurry. It gets more fascinating when you realize that you can carry out operations on just anything you want. I used the sum function to get you acquainted with this. Any program that needs to accumulate successive results can use the reduce function.

Let’s take an interesting amortization example. If I owe $1000 and I pay off $100 annually at an interest rate of 5%, how much would I be owing at the end of four years? Reduce can help you get the result quickly. Let’s see how.
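The original listing is not available here, so what follows is only a sketch under stated assumptions: interest of 5% is added to the balance at the end of each year and the $100 payment is then subtracted. The function name year_end_balance is made up for illustration.

```python
from functools import reduce


def year_end_balance(balance, payment):
    # Assumption: add 5% interest first, then subtract the payment.
    return balance * 1.05 - payment


payments = [100, 100, 100, 100]   # one payment per year for four years

# 1000 is the initializer, i.e. the amount owed at the start.
amount_owing = reduce(year_end_balance, payments, 1000)

print(f'{amount_owing:.2f}')  # 784.49
```

A different convention, such as paying before interest is charged, would give a different figure, so adjust the function to match the terms of your own loan.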

From the code above, you can see that I used an initializer of 1000 and the iterable was a list with the regular payments as the items.

Now, the question comes: since accumulators store successive results before giving the total result, can we get those results before the total? Yes, we can. That is where the python accumulate function from the itertools module comes in.

The python accumulate function.

In fact, you can say that the python reduce and accumulate functions are cousins except for one difference: python accumulate gives you the ability to get the result of successive operations instead of having to wait for the final result. It acts like a generator in this instance.

The syntax of the python accumulate function is: itertools.accumulate(iterable[, func, *, initial=None]). As you can see from the syntax, the python accumulate function takes an iterable and returns an iterator that yields the running results of applying the function, one result per element. That is what gives it the behavior of a generator. To get refreshers on these two concepts, you can check out this post on iterators, and also this post on generators. Just like for the python reduce function, the function used in the python accumulate method should be a function that accepts only two items and operates on these two items. The python accumulate method also takes an initializer, the initial argument, which is optional.

So, using the accumulate function, let’s do our amortization again but this time returning the results of successive accumulation instead of waiting for the final or total accumulation.
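The original listing is not available here, so this is a sketch that follows the description below as closely as possible. It assumes the same interest rule as before (5% added, then the payment subtracted) and a cashflow list that starts with the $1000 owed. The function name year_end_balance is illustrative.

```python
from itertools import accumulate


def year_end_balance(balance, payment):
    # Assumption: add 5% interest first, then subtract the payment.
    return balance * 1.05 - payment


cashflow = [1000, 100, 100, 100, 100]   # opening balance, then four payments

# Cast the iterator to a list so every intermediate balance can be indexed.
amount_owing_list = list(accumulate(cashflow, year_end_balance))

# Index 0 is just the opening balance, so start the report at index 1.
for i in range(1, len(amount_owing_list)):
    print(f'Balance at end of year {i}: {amount_owing_list[i]:.2f}')
```

The last value in the list is the same figure that reduce returns for the same function and payments.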

If you read the code above, you will notice that I cast the iterator returned by the python accumulate function to a list so that I can print out each of the results. Also, one feature of the accumulate function is that it yields the first item in the cashflow list unchanged as its first result, so during the iteration of the amount owing list, I ignored this first item. Apart from those two points, the results are similar to those from the python reduce function. This time, we can see the balance due at the end of each year rather than wait until the end of the fourth year.

If you notice when the yearly balance printed, each of the amounts was to two decimal places. I did that with a nice python string formatting syntax, {amount_owing_list[i]:.2f}. To learn how, you can read an earlier post on python string formatting Part 1 and python string formatting Part 2 and you would be sure to be able to do it yourself.

So, that’s it. You can see that python as a language has powerful capabilities. Go experiment with it. Have fun with python.

See you at the next post. If you want to receive new post updates, just subscribe with your email. Happy pythoning.

Using Python Regex To Validate Roman Numerals

Python regex, sometimes called python regular expressions, are patterns written to match specific text in a string. They are a widely used feature in the world of UNIX and are provided by many programming languages. Python is not left out. One advantage of using python regex is that with just one pattern you can validate a whole class of inputs, something we will be doing in this post. It also keeps your code cleaner because it usually involves fewer lines of code, and it saves you the stress of writing numerous lines of if else statements.

python regex with roman numerals
 

If you want a guide to regular expressions in python and some functions that come with the use of python regex, I will encourage you to read it up in this post, that describes the basic syntax, and then this other post on the methods we will be using, the python re match method.

In today’s post, we are going to show how to use python regex to validate Roman numerals based on their rules.

Roman Numerals and Their Rules

Roman numerals are a numeral system that originated in ancient Rome. They were popular and remained the usual way of writing numbers even down to the late middle ages in Europe. The system uses letters of the Latin alphabet to represent numbers and these letters are combined according to set rules. In the modern usage of Roman numerals, seven letters are used to designate numbers and they are:

Symbol Value
I 1
V 5
X 10
L 50
C 100
D 500
M 1000

Some of the rules for writing valid Roman numerals which we will be using for validation are:

  1. The Roman numerals I, X and C can be repeated up to 3 times in succession to form the numbers but repetition of V, L, or D is invalid.
  2. To form numbers a digit of lower value can be placed before or after the digit of higher value and digits of lower value that can be used for this are I, X, and C.
  3. You should add up all the digits in a group when a digit of lower value is placed after or to the right of a digit of higher value. Digits of similar values placed together are also added.
  4. Subtract the value of lower digit from the value of higher value when a digit of lower value is placed to the left or before a digit of higher value. Note that V is never written to the left of X.

So, now that we have the rules we need to form the python regular expressions, let’s do the Roman numerals validation which is the juicy part.

Validating any Roman numeral

When you run the code below, you need to input a string as a Roman numeral when you are prompted. You will get a result indicating whether the string is a valid Roman numeral or not. If it is an invalid Roman numeral, you will get a message that says: “Invalid Roman Numeral” but if it is valid, you will get a message that says: “Your roman numeral was valid. Welcome.”

Now, let’s run it and have fun. After you have tried running it, I will give a brief explanation of the lines of code. Note that this code takes only 8 lines. If I had needed to use a python if else statement, that would have taken more than that which would not be clean.
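The interactive listing is not reproduced here, so below is a minimal sketch built around the same pattern and messages. Instead of prompting for input, it wraps the check in a function (the name validate is made up for illustration) and runs it on a few sample strings.

```python
import re

regex_pattern = r"^(?=[MDCLXVI])M*(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})$"


def validate(numeral):
    """Return the message for a valid or an invalid Roman numeral."""
    if re.match(regex_pattern, numeral):
        return 'Your roman numeral was valid. Welcome.'
    return 'Invalid Roman Numeral'


for candidate in ('MCMXCIX', 'XLII', 'IIII', 'VX', ''):
    print(repr(candidate), '->', validate(candidate))
```

To make it interactive, you could pass the result of input() to validate instead of the sample strings.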

Now that you have taken some time running the above code and seeing how it works, let me explain some of the parts. I think I don’t need to explain the python re match method because you have read it from the link I gave above. So, I will just explain the pattern.

The key to the pattern matching above is the python regex pattern which is denoted as:

regex_pattern = r"^(?=[MDCLXVI])M*(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})$"

The ^ symbol that starts the pattern states that we should start from the beginning of the string while the corresponding $ symbol at the end says that we end at the end of the string. So, each string passed to the code must consist of a single Roman numeral and nothing else, otherwise it is reported as invalid. After the ^ symbol comes a lookahead assertion, (?=[MDCLXVI]). Read up this blog post on python lookahead assertions if you want a refresher.

What the python lookahead assertion does is look ahead from the beginning of the string, without consuming anything, and require that the next character is an M, D, C, L, X, V or I. Yes, the only symbols allowed to start the string are the seven symbols of the Roman numerals and nothing else. Every other part of the pattern can match nothing at all, so without this assertion an empty string would be accepted. Note that the characters in a python lookahead assertion are not consumed. So, right now, we have not matched any characters.

The next part matches the thousands place. I denote this with the pattern: M*. It states that for the thousands place in the number, we match M zero or more times. If the number is less than a thousand there is no M, otherwise there are one or more. Standard Roman numerals only go up to 3999 (MMMCMXCIX), because from 4000 upward a special thousands notation is needed which the pattern does not cover. Note that M* places no limit on the number of Ms, so a string such as MMMM is still accepted. You can try 1999 (MCMXCIX) and see that it matches. Because of this limitation, we could replace M* with M{0,3} to state that we cannot go beyond 3999.
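The difference between M* and M{0,3} can be checked directly. In this sketch, the names loose_pattern and strict_pattern are made up for illustration; the first is the pattern from this post and the second is the same pattern with M{0,3} swapped in.

```python
import re

loose_pattern = r"^(?=[MDCLXVI])M*(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})$"
strict_pattern = r"^(?=[MDCLXVI])M{0,3}(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})$"

for numeral in ('MMMCMXCIX', 'MMMM'):
    loose = bool(re.match(loose_pattern, numeral))
    strict = bool(re.match(strict_pattern, numeral))
    print(numeral, loose, strict)
# MMMCMXCIX True True
# MMMM True False
```

Both patterns accept 3999, but only the stricter one rejects four Ms in a row.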

The next part to match is the hundreds place, from 100 to 900. I denote the hundreds place with the (C[MD]|D?C{0,3}) pattern. What this pattern says is that, for a hundreds place match, either a C (100) stands to the left of an M (1000) or a D (500), or an optional D (500) is followed by not more than three consecutive Cs.

The next is the tens place, which runs from 10 to 90. The pattern for it is: (X[CL]|L?X{0,3}). This states that the tens place is either an X (10) before a C (100) or an L (50), or an optional L (50) followed by not more than three consecutive Xs.

The next is the units place, which is between 1 and 9. Remember there is no 0 in Roman numerals. The pattern for it is: (I[XV]|V?I{0,3}). What the pattern states is that the units place is either an I (1) to the left of an X (10) or a V (5), or an optional V (5) followed by not more than three consecutive Is.

Well, that is it. Enjoy validating your Roman numerals with this simple tool.

I hope you do leave a comment about your results.

Happy pythoning.

Using The Python String Format Method: Format Specifications Part 2

In an earlier post, I showed how to use field names and conversion fields to format values in replacement fields. Today, I will continue that discussion by showing how to use format specifications, the optional and last feature of replacement fields, in the python string format method.

The format specification for python string format method
 

The format specifications represent how the value in the replacement field should be presented. They include details such as the width of the field, its alignment, padding, conversion etc. Each value type is given its own specification. Also note that each format specification can include nested replacement fields but the level of nesting should not be deep.

You use a colon, :, to denote the start of a format specification. The format specification is made up of several flags, and I will describe each of them in the order in which they appear in the specification. Note that each of the flags is optional.

  1. The fill flag
  2. This flag is used as the first flag. Use it to denote the character that fills the unused space when the value is presented. Any character can be used as the fill character and if it is omitted, it defaults to a space. Note that a curly brace cannot be a fill character unless the curly brace comes from a nested replacement field. The fill character only kicks in when the value of the object is narrower than the specified width of the replacement field; otherwise it doesn’t apply. So, you use it together with other flags such as the align flag and the field width.

  3. The align flag
  4. This represents the alignment of the value of the object. You could left align, right align, center, or force padding between the sign and the digits. The different options are presented below:

    < Used for left alignment of the value in the available space. The default for most objects.
    > Used for right alignment of the value in the available space. The default for numbers.
    = Forces a padding to be placed after the sign but before the digits. Only valid for numeric types. If you precede the field width (explained below) with 0, then this becomes the default.
    ^ Forces the value to be centered within the available space.

    To make the alignment option meaningful, you must specify a minimum field width. Here are some examples. They all come with minimum field width of 20.
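The fill and align options can be sketched as follows, all with a minimum field width of 20. The sample values, word and number, are my own assumptions and not the original examples.

```python
word = "python"
number = -42

left = "{:<20}".format(word)      # value first, then padding
right = "{:>20}".format(word)     # padding first, then value
centered = "{:^20}".format(word)  # padding split on both sides
filled = "{:*^20}".format(word)   # centered, with * as the fill character
padded = "{:=20}".format(number)  # padding between the sign and the digits
zeroed = "{:020}".format(number)  # a 0 before the width implies = with 0 fill

for result in (left, right, centered, filled, padded, zeroed):
    print("[" + result + "]")  # the brackets make the padding visible
```

Every result is exactly 20 characters wide. For instance, filled is *******python******* and zeroed is -00000000000000000042.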

  5. The sign flag
  6. This is only used for numeric values. The various options are:

    + Use a sign for both positive and negative values.
    - Use a sign only for negative values (this is the default behavior)
    Space Show a leading space for positive numbers and a minus sign on negative numbers.

    Here are some examples.
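A minimal sketch of the three sign options, using 7 and -7 as assumed sample values:

```python
plus = "{:+d} {:+d}".format(7, -7)   # sign shown on both numbers
minus = "{:-d} {:-d}".format(7, -7)  # sign shown only on the negative number
space = "{: d} {: d}".format(7, -7)  # leading space on the positive number

print(plus)   # +7 -7
print(minus)  # 7 -7
print(space)  #  7 -7
```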

  7. The alternate flag, #.
  8. Use this flag when you are doing value conversion and you want the alternate option to be specified. It is valid for integers, floats, decimal and complex types. We will come back to this when we get to the conversion flag and show how the alternate forms can be specified.

  9. The grouping flag.
  10. The grouping flag specifies the character to be used as a thousands separator. It has two options:

    _ Use this as a thousands separator for the integer types when ‘d’ is specified as the type flag (to be explained later) and floating point types. When the type flag for integer types is either ‘b’, ‘o’, ‘x’, or ‘X’, the separator is inserted after every four digits.
    , Use a comma as the thousands separator. You could use the ‘n’ type flag instead if you want a locale aware separator.

    Now some examples. I included the third example with ‘b’ as a type flag. ‘b’ as type flag means convert value to base 2. This will be explained below under type flags.
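The grouping options can be sketched like this. The numbers are assumed sample values; the third line uses the ‘b’ type flag so that the underscore is inserted after every four digits.

```python
population = 1234567

with_comma = "{:,}".format(population)       # comma as thousands separator
with_underscore = "{:_}".format(population)  # underscore as thousands separator
binary_groups = "{:_b}".format(255)          # binary digits in groups of four

print(with_comma)       # 1,234,567
print(with_underscore)  # 1_234_567
print(binary_groups)    # 1111_1111
```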

  11. The precision flag
  12. The precision flag is a decimal number that indicates how many digits should be displayed after the decimal point for a floating point value that has the type flag ‘f’ or ‘F’, or before and after the decimal point for a floating point value that has the type flag ‘g’ or ‘G’. Note that precision is not allowed for integer types. If the value is a non-numeric type, then this indicates the maximum field size of the replacement field, that is, how many characters of the value are used.

    Now for some examples. Notice how it truncates the string type, s, when the precision is smaller than the number of characters.
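A minimal sketch of the precision flag, with assumed sample values:

```python
pi = 3.14159

fixed = "{:.2f}".format(pi)            # two digits after the decimal point
general = "{:.3g}".format(pi)          # three significant digits in total
big_general = "{:.3g}".format(1234.5678)  # too large for 3 digits, so exponent form
truncated = "{:.3s}".format("python")  # at most three characters of the string

print(fixed)        # 3.14
print(general)      # 3.14
print(big_general)  # 1.23e+03
print(truncated)    # pyt
```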

  13. The type flag
  14. The type flag determines how the data should be presented. The type flag is specified for string types, integer types, and floating point types.

    For string types: The available options are...

    s The default type for strings and may be omitted
    None The same as ‘s’

    For integer presentation types: The options are...

    b Outputs the number in binary format
    c Converts the value to the corresponding Unicode character before printing.
    d Output the number in base 10 before printing.
    o Octal format. Output the number in base 8 and print.
    x Hex format. Output the number in base 16 using lower case letters for digits above 9
    X Hex format. Output the number in base 16 using Upper case letters for digit above 9.
    n Decimal format. The same as ‘d’ but it uses locale aware setting to insert appropriate thousands separator for the locale.
    None Same as ‘d’

    Note that integers can also be formatted with the floating point presentation types listed below, except for ‘n’ and None. When you do that, python converts the integer to a floating point number before formatting it.

    Now, let’s use some examples.
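Here is a minimal sketch of the integer presentation types. The values 65 and 255 are assumptions chosen for illustration.

```python
binary = "{:b}".format(65)        # base 2
character = "{:c}".format(65)     # Unicode character with code point 65
decimal = "{:d}".format(65)       # base 10
octal = "{:o}".format(65)         # base 8
hex_lower = "{:x}".format(255)    # base 16, lower case digits
hex_upper = "{:X}".format(255)    # base 16, upper case digits
as_float = "{:.2f}".format(65)    # an integer with a floating point type flag

print(binary, character, decimal, octal, hex_lower, hex_upper, as_float)
# 1000001 A 65 101 ff FF 65.00
```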

    When discussing the alternate flag, #, I stated that there are times when you want alternate conversion forms to be specified. For example, for binary, octal and hexadecimal outputs the alternate flag, #, will result in an output of ‘0b’, ‘0o’, and ‘0x’. Let’s show this with examples.
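A minimal sketch of the alternate flag with the binary, octal and hexadecimal type flags, using assumed sample values:

```python
alt_binary = "{:#b}".format(10)      # prefixed with 0b
alt_octal = "{:#o}".format(10)       # prefixed with 0o
alt_hex = "{:#x}".format(255)        # prefixed with 0x
alt_hex_upper = "{:#X}".format(255)  # prefix and digits in upper case

print(alt_binary)     # 0b1010
print(alt_octal)      # 0o12
print(alt_hex)        # 0xff
print(alt_hex_upper)  # 0XFF
```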

    The alternate flag can also be applied to floats and complex numbers.
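For floats, the alternate form always keeps the decimal point, and with ‘g’ it also keeps trailing zeros. A minimal sketch with assumed sample values (I only show floats here, not complex numbers):

```python
plain_fixed = "{:.0f}".format(3.0)   # no digits after the point, so no point
alt_fixed = "{:#.0f}".format(3.0)    # the decimal point is kept
plain_general = "{:g}".format(2.0)   # trailing zeros are removed
alt_general = "{:#g}".format(2.0)    # trailing zeros are kept

print(plain_fixed)    # 3
print(alt_fixed)      # 3.
print(plain_general)  # 2
print(alt_general)    # 2.00000
```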

    Now finally, the options for floating point presentation types are:

    e Exponent notation. Print the number in scientific notation using the exponent, e, to denote it. The default precision is 6.
    E Exponent notation. Print the number in scientific notation using the exponent, E, to denote it.
    f Displays the number in fixed point notation. The default precision is 6.
    F Fixed point notation, just like ‘f’ but converts nan to NAN and inf to INF.
    g This is the general format. Uses fixed point or scientific format depending on the magnitude of the number.
    G General format, but in uppercase.
    n Same as ‘g’ but is locale aware in inserting appropriate thousands separator.
    % Percentage. Multiplies the number by 100 and displays it in fixed format, ‘f’, with a percent sign (%) following it.
    None Similar to ‘g’ except that fixed point notation when used has at least one digit past the decimal point.

    The following examples use precision 2 then the default 6.
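A minimal sketch of the floating point presentation types, first with precision 2 and then with the default precision of 6. The values 1234.56789 and 0.256 are assumptions for illustration.

```python
value = 1234.56789
ratio = 0.256

results = {
    "e, precision 2": "{:.2e}".format(value),  # 1.23e+03
    "e, default": "{:e}".format(value),        # 1.234568e+03
    "f, precision 2": "{:.2f}".format(value),  # 1234.57
    "f, default": "{:f}".format(value),        # 1234.567890
    "g, precision 2": "{:.2g}".format(value),  # 1.2e+03
    "g, default": "{:g}".format(value),        # 1234.57
    "%, precision 2": "{:.2%}".format(ratio),  # 25.60%
    "%, default": "{:%}".format(ratio),        # 25.600000%
    "F with nan": "{:F}".format(float("nan")),  # NAN
}

for label, text in results.items():
    print(label, "->", text)
```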

I hope you get creative in using these format specifications. They are very helpful when representing values. Note that python’s literal string formatting method, f-strings, is similar to the python string format method described here. The same format specifications work in both.

Using The Python String Format Method Like A Pro Part 1

How you format your text is important in text processing and python is not left out, giving you several options to make your output appear presentable. I decided to delve into the issue of python formatting in today’s post while reading some code. I appreciated the way the author applied python string formatting. So, I decided to devote two posts to string formatting because I believe my readers would be interested in it.

python string format method makes output presentable
 

In python you format your output using the format method of the string class. It is called the python str.format method (or python string format method) to differentiate it from python’s literal f-strings. A format string contains two kinds of content that are sent to the output: literal text and replacement fields. Replacement fields are surrounded by curly braces, {}, and refer to objects that have to be formatted, while literal text refers to whatever you want to leave unchanged in the output. So, what we are interested in are replacement fields.

To give you an idea of what replacement fields are, read and run the following code:
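A minimal sketch of such code, with two replacement fields filled from the parameters name and age. The sample values are assumptions that match the examples used later in this post.

```python
name = 'Michael'
age = 29

# Two replacement fields, {} and {}, filled in order by name and age.
message = 'Hello, your name is {} and your age is {}'.format(name, age)
print(message)  # Hello, your name is Michael and your age is 29
```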

You will see that in the string part of the python format method in the code above, there are two curly braces and they serve as replacement fields whose values are provided by the parameters, name and age, of the python format method. We are going to be discussing how you can format your output based on the replacement fields and parameters.

The syntax of the python string format method

The syntax of the python string format method is: template.format(p0, p1, k0=v0, k1=v1) where template refers to the string you want to format. As I said before, the template consists of both literal text and replacement fields. Replacement fields are denoted by whatever is in curly brackets, {}. The arguments p0 and p1 refer to the positional arguments while k0 and k1 refer to the keyword arguments. Positional and keyword arguments are used to insert values into the replacement fields in the template. We will cover all these and give you ideas on how to use them.
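As a quick sketch of that syntax, here is a template that takes two positional arguments and two keyword arguments. The template text and the values are hypothetical.

```python
template = '{0} is {1} years old and lives in {city}, {country}.'

# 'Michael' and 29 are p0 and p1; city and country are the keyword arguments.
sentence = template.format('Michael', 29, city='Lagos', country='Nigeria')
print(sentence)  # Michael is 29 years old and lives in Lagos, Nigeria.
```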

The replacement fields have three optional features: field names, conversion fields that are preceded by an exclamation point, !, and format specifications. Today’s post will cover how to specify the field names and conversion fields while the next post will be on format specifications.

The field names in the string replacement fields.

The replacement field starts with an optional field name. The field name refers to the object whose value is to be inserted. The object is specified in the parameter of the format method. The field name is either a number or a keyword.

  1. Where the field name is a number:
  2. An example to illustrate this is below:

    
    name = 'Michael'
    age = 29
    print('Hello, your name is {0} and your age is {1}'.format(name, age))
    

    You can see that in the template above, there are two curly braces or replacement fields. The first has the number 0 and the second has the number 1. The curly brace with 0 refers to the first positional argument which is found as a parameter to the format method and here this is the variable, name, while the curly brace with 1 refers to the second positional argument which is the variable, age.

    If you so desire, you can choose to leave out the numbering of the curly braces and python will insert them on your behalf. Like this:

    
    name = 'Michael'
    age = 29
    print('Hello, your name is {} and your age is {}'.format(name, age))
    
  3. Where the field name is a keyword.
  4. The python string format method also lets you specify keyword arguments as parameters, in which case the replacement fields require you to specify the keywords. An example is below:

    print('Hello, your name is {name} and your age is {age}'.format(name='Michael', age=29))

    You can see now that I have inserted the keywords into the curly braces because the parameters are keyword arguments.

    Using keywords as arguments is super powerful. It gives you the ability to change the ordering of the parameters in the replacement fields. For example, instead of following the ordering of the positional arguments, I could order the replacement fields as it suits my fancy:
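A minimal sketch of that reordering, reusing the name and age keywords from the previous example. The wording of the template is my own assumption.

```python
# The keyword arguments are passed as name then age,
# but the template uses age first and name second.
message = 'Your age is {age} and your name is {name}'.format(name='Michael', age=29)
print(message)  # Your age is 29 and your name is Michael
```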

    Check out the code above and the one before it. See how I interchanged the ordering of the keyword arguments in the replacement fields. We could try another example to show you how powerful this is.

    print('In {country}, there are {number} million people speaking {language}.'.format(language='English', number=300, country='USA'))

    When you run it, it prints: In USA, there are 300 million people speaking English.

    With keyword arguments you are not constrained to any sort of ordering. You choose how you want it to be. You can check out this post if you want a refresher on positional and keyword arguments.

    Note: What if you want to have the brace as a literal text in the template? Simple, just double brace it.

    print('This is doubling the braces {{{name}}} for {name}'.format(name='Michael'))

    I doubled the braces around the first replacement field. When you run it, you will notice that the braces now appear literally in the output: This is doubling the braces {Michael} for Michael

    Now, what if your parameters are lists or an object with attributes whose value you want to show on output? The next two sections below will show you how.

  5. Where the parameter to format is a list.
  6. To make the output appear as you want it to, you can pass the list as a keyword argument or as a positional argument. Look at the code below and see how. First, I specify it as a keyword argument. That means you need to name the list explicitly in the parameter and index it in the replacement field. But if you want it as a positional argument, you refer to the list by its position number in the replacement field and index that.

    What python does when you specify it either way is to call the __getitem__() method of the list. I discussed this method in an earlier post on sequences.
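A minimal sketch of both ways of indexing a list from the replacement field. The list fruits and its contents are hypothetical.

```python
fruits = ['mango', 'orange', 'banana']

# Keyword argument: name the list, then index it in the replacement field.
by_keyword = 'I like {fruits[0]} and {fruits[2]}'.format(fruits=fruits)

# Positional argument: 0 is the first positional argument, then index it.
by_position = 'I like {0[0]} and {0[2]}'.format(fruits)

print(by_keyword)   # I like mango and banana
print(by_position)  # I like mango and banana
```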

  7. When the object has attributes with values.
  8. When the object in the parameter has an attribute whose value you want to format, you can directly reference the attribute in the replacement field. The code below shows how in the method get_fruit. What 0.index and 0.fruit do is look up the attributes with getattr() on the object, self, in order to get the required values. In the code below I created a Fruit class with a class attribute, index, so that whenever a fruit is created it is tagged with an index (instead of creating a list) and then the index is incremented to tag the next fruit.
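A minimal sketch of such a class, reconstructed from the description above. The exact attribute layout and the wording of the output are my assumptions.

```python
class Fruit:
    index = 0  # class attribute: the index the next fruit will receive

    def __init__(self, fruit):
        self.fruit = fruit
        self.index = Fruit.index  # tag this fruit with the current index
        Fruit.index += 1          # increment so the next fruit gets a new tag

    def get_fruit(self):
        # 0 refers to the first positional argument, which is self here;
        # .index and .fruit are looked up as attributes of that object.
        return 'Fruit {0.index} is {0.fruit}'.format(self)


mango = Fruit('mango')
orange = Fruit('orange')
print(mango.get_fruit())   # Fruit 0 is mango
print(orange.get_fruit())  # Fruit 1 is orange
```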

Be creative. Play with your own objects to test how format calls attributes from the replacement field.

I think that’s all for field names. After the field name comes an optional conversion field.

Syntax of the conversion field

The conversion field is optional, but if specified, it is preceded by an exclamation point, !, to differentiate it from the field name. It causes type conversion before any formatting of the replacement fields takes place. But one may ask – doesn’t every object have a default __format__() method? Yes, it does. But the creators of python realized that sometimes you want to force a specific string representation of an object.

There are three types of specifiers for the conversion field: !s, !r, and !a specifiers.

  1. The !s specifier:
  2. The !s conversion specifier gives you a string representation of the object in the replacement field. What it does is call str() on the object in the replacement field, converting it to a string. This is the default string formatting.

  3. The !r specifier
  4. You can use this when you want the true string representation of an object, and not just its output as a string. For many objects this representation contains information such as the type and the address of the object. This specifier calls repr() on the object.

  5. The !a specifier
  6. This specifier also outputs the true string representation of an object but it replaces all non-ASCII characters with \x, \u or \U escapes. This specifier calls ascii() on the object. It works like the !r specifier if you have no non-ASCII characters in the object.

Here is an example illustrating all three types. Notice how the object type appeared in the output for !r and !a.
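A minimal sketch of all three specifiers. The City class is hypothetical; I gave it its own __str__ and __repr__ so the output is predictable, and a name with a non-ASCII character so that !a differs from !r.

```python
class City:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    def __repr__(self):
        return "City('{}')".format(self.name)


city = City('Z\u00fcrich')  # the name is Zürich, written with an escape

as_str = '{!s}'.format(city)    # calls str(city)
as_repr = '{!r}'.format(city)   # calls repr(city)
as_ascii = '{!a}'.format(city)  # calls ascii(city)

print(as_str)    # Zürich
print(as_repr)   # City('Zürich')
print(as_ascii)  # City('Z\xfcrich')
```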

As another illustration, you can compare the output of the !s and !r in a string with quotes showing or not showing.
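A minimal sketch of that comparison, using an assumed sample string:

```python
text = 'python'

without_quotes = '{!s}'.format(text)  # str() leaves the quotes out
with_quotes = '{!r}'.format(text)     # repr() shows the quotes

print(without_quotes)  # python
print(with_quotes)     # 'python'
```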

In my use of the conversion fields, I have found that making them optional has served me well. So, they just come in for special cases of formatting.

Now, the third and last feature of the replacement field is the format specification, which is explained in this post. This is where the real juice of replacement fields is stored.
