
Python Modules And Libraries: The Essentials

The pillar of modular programming in Python is the module. A module, in its simplest form, is any file that has the .py extension. Each of these files contains classes, functions and other definitions which can be used by anyone who imports the module. Modules are often loosely called libraries, although a library is usually a collection of related modules. You are probably familiar with several Python modules and libraries like os, math, numpy, matplotlib and random, because I have written articles on some of them. In this post, I will dwell on the essentials of Python modules and libraries along with their benefits.


The benefits of python modules and libraries.

Imagine having to write all the code yourself for every task that has no built-in function. That would be time consuming, would it not? Modules increase our productivity by allowing us to concentrate on the specific parts of our code which no one has written before. Since a lot of programmers are writing code all the time, we can boost our own productivity by using their code rather than reinventing the wheel.

Also, Python modules make bugs easier to isolate. When classes, functions and definitions are defined within their own namespaces in a module, you can easily trace a bug to the module it lives in, which is much harder when everything is bundled into one namespace in one file.

Modules also increase the readability of code. It is easier to read code that has fewer lines and imports from other modules than to read code that has so many lines that your eyes get tired. This is what modularity is about.

Getting the list of Python modules on your computer.

To get a listing of all the available Python modules on your machine, issue the command help('modules') at the Python interpreter shell, and it will give you a list of all the modules it can find.

You would get a list such as the one from my machine whose graphic is below:

[Image: python modules and libraries]
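If you would rather collect the module names inside a program than read them off the help screen, the standard library's pkgutil module can walk the import path for you. The sketch below is just one way to do it and is not necessarily how help('modules') builds its list; the variable names available and builtin are my own.

```python
import pkgutil
import sys

# Names of the top-level modules and packages found on the import path.
available = sorted({module.name for module in pkgutil.iter_modules()})

# Modules compiled into the interpreter itself are listed separately.
builtin = sorted(sys.builtin_module_names)

print(len(available), 'modules found on the import path')
print(len(builtin), 'modules built into the interpreter')
print(available[:10])  # show only the first few names
```

The exact names and counts depend on your Python installation and on what you have installed with it.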


How to use modules in your own programs

Suppose there is a module or Python library whose functions or classes you want to use. Imagine you are doing some math calculations and need the definition of pi and the sine function, and you know that a module named math has those definitions and functions. How do you bring the module into your program?

To do so, we use the import keyword. All you would have to do is to import the math module this way:

    
import math

Here is the syntax to import any module into your programs:

    
import module_name

Where module_name is the module you are importing into your program. When you import a module, Python loads the module and adds its name to your program’s namespace. In doing so, it gives you the ability to use the definitions from the module in your program by qualifying them with the module name. For example, after importing the math module, I would want to use the pi definition as well as the square root function. This is what I would do.
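A minimal sketch of such a program, using the same values as the other examples in this post, could look like this:

```python
import math

# Each name is qualified with the module name: math.pi, math.sqrt()
print('Pi is:', math.pi)
print('The square root of 36 is:', math.sqrt(36))
```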

Notice in the program above that I used the definitions for pi and the square root function, sqrt(), that are included in the math module without having to write them myself. That is the power that modularization gives you.

Notice that in the code above, I first qualified the definitions in the module with the module name. This is to prevent name conflicts if I needed to use the same name in my program. But where you are sure there would be no name conflict, you could make it possible to use only pi or sqrt() without qualifying them by specifically importing them from the math module this way:

    
from math import pi, sqrt 

print('Pi is:', pi)

print('The square root of 36 is:', sqrt(36))

Notice that when importing from the math module I used the keyword 'from' and then specifically imported only the pi and sqrt definitions. That enabled me to use them without qualifying them.

Most times, in order to avoid ambiguity, I prefer qualifying definitions to leaving them unqualified.

There are times when the name of the module is very long and we don’t want to keep typing it. Then, we can assign an alias to the module name which will serve as a shorthand. For example, in my XML parsing program from another post, the module name was very long, so I aliased it using a shorthand of the form:

    
import xml.etree.ElementTree as etree 

tree = etree.ElementTree(etree.fromstring(xml))

You can see that each time I wanted to use that module as qualifier, I used the shorthand rather than the complete long name. Let’s use a short hand for our math module:
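For instance, assuming we pick m as the alias (the name is my choice; any valid name would do), the pi and square root example could be written as:

```python
import math as m

# The alias m now stands in for the module name math.
print('Pi is:', m.pi)
print('The square root of 36 is:', m.sqrt(36))
```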

Installing modules that are not on your machine

What if a module you want to use is not already installed on your machine but you know the name of the module? Not to worry, because Python comes with a convenient program for installing modules onto your machine. The program is called pip.

To install a module named module_name onto your machine, just use the following command on your terminal:

    
python -m pip install module_name

And pip will get the process running. It will even report its progress as it downloads and installs the said module.

How to create your own modules

You can easily create your own modules with their definitions and functions. All you need to do is write the definitions and functions in a file, give the file whatever name you wish, and let the extension be .py. That file can now be used as a module. If you want another program to use the definitions in the module, save the file in the same directory as the program that will use it, or in one of the directories listed in the PYTHONPATH environment variable. That’s all.

Let me walk you through the process.

Suppose you wrote this cool factorial function, my_factorial(), and you want to be able to reuse it in subsequent programs. You could make it part of a module and name the file fact_func.py. Then, save it in a directory where the program that needs it can find it. After that, all you have to do is have the program import the fact_func module, and you can use the my_factorial function.
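As a rough sketch, the contents of fact_func.py could look like the following. Only the file name and the function name come from the description above; the body of the function is my own assumption of how such a function might be written.

```python
# fact_func.py (a sketch of the module; the function body is an assumption)

def my_factorial(n):
    """Return n! for a non-negative integer n."""
    if n < 0:
        raise ValueError('n must be a non-negative integer')
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
```

A program saved in the same directory could then write import fact_func and call fact_func.my_factorial(5), which returns 120.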

When commands in a module should not be executed

There are times when you have written a module and you don’t want some of its statements to run when another program imports it, only when the module is run directly. For example, this might apply to some commands for testing the module. Python has a handy way of keeping such commands from executing on import. What you do is put those statements in a separate section of the file. This section is delimited by the statement:

    
if __name__ == '__main__':
    '''statements in this section
    are not executed when the module
    is imported'''

Notice how I used the if statement above to delimit the section. Python sets __name__ to '__main__' only when the file is run directly; when the file is imported, __name__ is the module’s name, so nothing in that section is executed.

Let’s do this with an example. Take our factorial function. After writing it, I would want to write a function that tests it to make sure it is working properly. I don’t want the test to run every time another program imports this module, but I want to keep it for record purposes in case I need to make changes to the factorial function in the future. Then I could do the following:
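Here is a minimal sketch of such a module. The function test_my_factorial and the values it checks are my own illustration, and the factorial body is the same assumed implementation as before.

```python
# fact_func.py with a test section (a sketch; the test values are my own)

def my_factorial(n):
    """Return n! for a non-negative integer n."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def test_my_factorial():
    """Check a few known values of the factorial function."""
    assert my_factorial(0) == 1
    assert my_factorial(1) == 1
    assert my_factorial(5) == 120
    print('All factorial tests passed.')


if __name__ == '__main__':
    # Runs only when the file is executed directly,
    # for example with: python fact_func.py
    test_my_factorial()
```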

Notice that I created a separate section for testing, beginning with the if… statement above, and ran the test in that section. When anyone imports this module, the test does not run; it runs only when the module is used directly, for instance when I am debugging or rewriting the function.

What are the best python modules

I often read online about the top ten Python modules and libraries everyone should know. That is just a fallacy and click bait. There is no top ten. Modules are created to fill a void where one exists. You choose a module because its definitions and functions can help you in your program. If a module has definitions and functions that you need, then it is a module you would use; otherwise you have no business with it. So, the top module for any programmer is the one that helps get the task at hand done better.

But when it comes to data science, there are some modules you should be aware of because they do the work well, like matplotlib for data visualization and plotting, and numpy for multi-dimensional arrays and numerical programming.

So, I hope I helped increase your python knowledge with this post.

Happy pythoning.

Python Abstraction: A Guide

The notion of abstraction comes from the ability to distil a complicated system into its fundamental component parts. It involves hiding the complexity and showing only the fundamentals. For example, take a car. A car presents you with several functionalities when you get inside. It has a steering wheel, a brake, a seat, lights etc. When abstracting a car, you break it down into these component parts and give them names that describe them and the functionality each contributes to the whole, the car.


In Python, when you specify a data type as an abstract data type, you are specifying the type of data that is stored, the operations supported on it, and the types of the parameters of those operations. This is called the public interface of the abstract data type. Python gives you a great deal of latitude in how you specify an interface. In Python, interface specification uses duck typing. That means data types are not checked at compile time and there is no formal requirement that abstract classes be declared. Instead, a programmer assumes that an object supports a known set of behaviors, and if an object does not, the interpreter raises an error at run time. Duck typing is attributed to a poet, James Whitcomb Riley, who stated that “when I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck.”
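To make duck typing concrete, here is a small sketch. The classes Duck, Robot and Stone and the function make_it_quack are made-up names for illustration only.

```python
class Duck:
    def quack(self):
        return 'Quack!'


class Robot:
    def quack(self):
        return 'Beep-quack!'


class Stone:
    pass


def make_it_quack(thing):
    # No type check: we simply assume the object has a quack() method.
    return thing.quack()


print(make_it_quack(Duck()))
print(make_it_quack(Robot()))

try:
    make_it_quack(Stone())
except AttributeError as error:
    # The interpreter complains only when the missing behavior is used.
    print('Not a duck:', error)
```

Neither Duck nor Robot declares that it implements any interface; they are accepted simply because they behave as expected.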

Mechanism for supporting abstract data types.

Abstraction in Python is generally implemented by using abstract classes and interfaces. Formally, Python provides the abstract base class (ABC). An abstract base class that declares abstract methods cannot be instantiated, but it defines one or more common methods that all subclasses of the class must have. You realize an ABC by declaring one or more concrete classes that inherit from the abstract base class while providing implementations of the methods declared by the ABC. To declare a class as an abstract base class, that class needs to inherit from the ABC class in the abc module, a module that provides formal support for ABCs.

When an ABC needs to act as an interface, it just provides the method names without any real method body. Each such method is decorated with @abstractmethod, which tells Python that concrete classes inheriting from the ABC must provide their own implementation of it.

Let’s give an example of data abstraction in Python. Let’s say there is a bank providing three methods of payment - payment by credit card, payment by mobile phone, and payment through social media like WhatsApp. The banking app could provide a Payment class, declared as an ABC, with an abstract method, payment, which needs only be implemented by the concrete classes inheriting from Payment.

Let’s code the Payment class ABC as a python abstract class.

    
from abc import ABC, abstractmethod

class Payment(ABC):

    def print_receipt(self):
        print('Receipt printed for payment', 
                  self.__class__.__name__)

    @abstractmethod
    def payment(self):
        pass 

You can see from the code above that I first imported ABC from the abc module. Then the Payment class inherits ABC class in that module, making it a python abstract class. The Payment class has two methods: a concrete method, print_receipt, that applies to all instances of the class including subclasses and an abstract method, payment, which will have to be implemented by concrete classes inheriting from it. I decorated the abstract method with the @abstractmethod decorator to tell python what to expect. Notice that in the print_receipt method, I made a reference to the class name of the object calling the method so that we can be able to find out which subclass we are printing the receipt for.

Now, let’s write concrete classes that inherit from the Payment class and see how they implement the payment method, abstracting its functionality so they can define their own way of payment.
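The sketch below repeats the Payment class so that it runs on its own, and adds three concrete classes. The class names CreditCardPayment, MobilePayment and WhatsAppPayment and the messages they print are my own choices for illustration.

```python
from abc import ABC, abstractmethod


class Payment(ABC):
    def print_receipt(self):
        print('Receipt printed for payment', self.__class__.__name__)

    @abstractmethod
    def payment(self):
        pass


class CreditCardPayment(Payment):
    def payment(self):
        print('Payment made by credit card')


class MobilePayment(Payment):
    def payment(self):
        print('Payment made by mobile phone')


class WhatsAppPayment(Payment):
    def payment(self):
        print('Payment made through WhatsApp')


# Each object runs its own version of payment().
for method in (CreditCardPayment(), MobilePayment(), WhatsAppPayment()):
    method.payment()
    method.print_receipt()

# The abstract class itself cannot be instantiated.
try:
    Payment()
except TypeError as error:
    print('Error:', error)
```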

You can see now that each subclass only needs to implement the method declared as an abstract method. If a subclass doesn’t, Python raises a TypeError when you try to create an instance of it. With the method implemented, the subclasses behave differently while still being recognized as members of the Payment class.

Whenever an object which implements the abstract method invokes it, the object’s own implementation is called and executed. That’s how Python abstraction works. So, you don’t have to peer under the hood, because all you need to know is that mobile payment, credit card payment and WhatsApp payment work.

I believe you now have a good working knowledge of how abstraction in Python works and can implement your own abstract classes and interfaces. Data abstraction in Python allows objects of different subclasses to do the same thing, each in its own distinct way. You need not know how it works, just that it works. Isn’t data abstraction in Python beautiful? It’s a cool feature of the language.

In fact, Python abstraction is one of the OOP concepts every programmer needs to know. I have covered other OOP concepts like inheritance and also polymorphism in other posts.

Happy pythoning.

Constructing An XML Parser In Python

XML, or Extensible Markup Language, is a markup language defining a set of rules for encoding documents such that they can be both human-readable and machine-readable. The World Wide Web Consortium has published a set of standards that define XML. You can reference the specifications here. Although XML was initially designed for documents, its use has grown to include several other types of media and files.


A well-formed XML document would include, among other things, elements and attributes. An element, like in HTML, is a logical document component that begins with a start-tag and ends with an end-tag. A start-tag is denoted as <tag_name> while the end-tag is denoted as </tag_name>. An empty-element tag is a combination of both and is denoted as <tag_name />. An element could also have attributes within its start-tag or empty-element tag. Attributes are name-value pairs, and each name can appear only once in a tag, with a single value. An example of an element with an attribute is <subtitle lang='en'>, where the subtitle element has the lang attribute with the value 'en'. At the top of the XML document is a root element which is the entry into the document.

Now, in our code that parses XML we will only be dealing with elements and attributes.

To parse XML, python has an API for doing that. The module that implements the API is the xml.etree.ElementTree module, so you would have to import this module into your python file to use the API.

What the xml.etree.ElementTree module contains

This module is a simple API for parsing and creating XML in python. Although it is robust, it is not secure against maliciously constructed data. So, take note. Among several classes of interest, for our parsing activity we will be concentrating on two classes in this module – ElementTree which represents the whole XML document as a tree, and Element which represents a single node in the tree.

To import an XML document you could import it from a file or pass it as a string.

To import it from a file use the following code:

    
import xml.etree.ElementTree as etree
tree = etree.parse('data.xml')
root = tree.getroot()

while to get it directly from a variable as a string use the following code:

    
import xml.etree.ElementTree as etree
# any well-formed XML text will do here
xml = '<feed><title>SolvingIt?</title></feed>'
root = etree.fromstring(xml)

The root variable above refers to the root element in the XML document.

The ElementTree constructor

We will be using the ElementTree constructor to get to the root of our XML document, so it is worth mentioning here. The syntax for the constructor is xml.etree.ElementTree.ElementTree(element=None, file=None). The constructor can accept an element which serves as the root element as argument or you could pass it a file that contains the XML document. What it returns is the XML document as a tree that could be interacted with.

One interesting method of this class is the getroot() method. When you call this method on an ElementTree object, it returns the root element of the XML document. We will use the root element as our doorway into the XML document. So, take note of this method because we will be using it in our parsing code below.
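As a quick illustration of the constructor and getroot() working together, here is a small sketch. The one-line XML document is made up just for this example.

```python
import xml.etree.ElementTree as etree

# A tiny document used only for this illustration.
root = etree.fromstring('<feed><title>SolvingIt?</title></feed>')

# Wrap the root element in an ElementTree to get the document as a tree.
tree = etree.ElementTree(root)

# getroot() hands back the very same root element.
print(tree.getroot().tag)
print(tree.getroot() is root)
```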

That’s all we need from ElementTree class. The next class we will need is the Element class.

Objects of the Element class.

This class defines the Element interface. Its constructor is xml.etree.ElementTree.Element(tag, attrib={}, **extra). You would use the constructor to create an element, but we will not be creating any elements here, just using the attributes and methods. Still, you can see from the constructor definition that an XML document element has two things: a tag and a dictionary of attributes. Objects of this class represent every element in the XML document.

Some interesting attributes and methods we will be using from this class are:

a. Element.attrib: This is a dictionary that holds the attributes of the said element. What is included in the dictionary are the name-value pairs of the attributes in the Element, or what some call a Node, in an XML document.

b. Element.iter(tag=None): This is the iterator for an element. It starts with the element itself and then recursively goes through its children, giving you all of them, even the children of its children. You could filter the results by providing a tag argument, in which case only elements with that tag are returned. It iterates over the elements in document (depth-first) order. But if you do not want to get the children in a recursive fashion and only want the first-level children of an element, then you can use the next method below.

c. list(element): This is casting an element to a list. This casting returns a list of all the children, first level only, of the element. It replaces the former Element.getchildren() method, which was deprecated and then removed in Python 3.9.
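The three items above can be tried out on a small document. This is only a sketch: the XML is a cut-down, made-up feed, and Element.find(), which I use to pick out the subtitle element, is an extra method not covered in the list.

```python
import xml.etree.ElementTree as etree

doc = etree.fromstring(
    "<feed>"
    "<subtitle lang='en'>Solutions</subtitle>"
    "<entry><author><name>Michael</name></author></entry>"
    "</feed>"
)

# a. attrib: the attributes of one element as a dictionary
subtitle = doc.find('subtitle')
print(subtitle.attrib)

# b. iter(): the element itself, then every descendant, depth-first
all_tags = [elem.tag for elem in doc.iter()]
print(all_tags)

# iter() with a tag argument keeps only the elements with that tag
name_tags = [elem.tag for elem in doc.iter('name')]
print(name_tags)

# c. list(element): first-level children only
children = [child.tag for child in list(doc)]
print(children)
```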

So, I believe you now have a simple introduction into some of the features of the xml.etree.ElementTree module. Now, let’s implement this knowledge by parsing some XML documents.

The XML document we are going to parse is a feed for a blog. The XML document is given below:

    
<feed xml:lang='en'>
    <title>SolvingIt?</title>
    <subtitle lang='en'>Programming and Technology Solutions</subtitle>
    <link rel='alternate' type='text/html'
          href='https://emekadavid-solvingit.blogspot.com' />
    <updated>2020-09-12T12:00:00</updated>
    <entry>
        <author>
            <name>Michael Odogwu</name>
            <uri>https://emekadavid-solvingit.blogspot.com</uri>
        </author>
    </entry>
</feed>

You can reference this document in the code while reading the code. You can see that the XML document has elements or nodes and the root tag is named feed. The elements also have attributes.

The first task is to find the score of the XML document. The score of the XML document is the sum of the scores of its elements. For any element, the score is equal to the number of attributes that it has.

The second task is to find the maximum depth of the XML document. That is, given an XML document, we need to find the maximum level of nesting in it.

So, here is the code that prints out the score and maximum depth of the XML document above. I want you to run the code and compare the result with what you would have calculated yourself. The section after the code explains the relevant points in it, along with a link to download the script if you want to take an in-depth look at it.
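The listing below is a reconstruction that follows the explanation in the next section, and it may differ in detail from the downloadable script. The function names get_attr_number and depth and the maxdepth variable come from that explanation; the comments and the exact layout are my own, arranged so that the line numbers cited there line up with this listing.

```python
import xml.etree.ElementTree as etree

def get_attr_number(node):
    # Score = total number of attributes in the document.
    total = 0
    for i in node.iter():  # the node itself, then every descendant
        total += len(i.attrib)
    return total

maxdepth = 0  # updated by depth(); the root counts as depth 0

def depth(elem, level):
    # Record the deepest level of nesting seen so far in maxdepth.
    global maxdepth
    level += 1
    if level > maxdepth:
        maxdepth = level
    for child in list(elem):
        # Recurse into each first-level child, one level deeper.
        depth(child, level)

# The feed shown above, written compactly.
xml = """<feed xml:lang='en'>
    <title>SolvingIt?</title>
    <subtitle lang='en'>Programming and Technology Solutions</subtitle>
    <link rel='alternate' type='text/html'
     href='https://emekadavid-solvingit.blogspot.com' />
    <updated>2020-09-12T12:00:00</updated>
    <entry>
        <author>
            <name>Michael Odogwu</name>
            <uri>https://emekadavid-solvingit.blogspot.com</uri>
        </author>
    </entry>
</feed>"""
tree = etree.ElementTree(etree.fromstring(xml))
root = tree.getroot()

print(get_attr_number(root))

depth(root, -1)
print(maxdepth)
```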

Now, for an explanation of the relevant sections of the code. I will use the line numbers in the code above to explain it.

Line 1: We import the module, xml.etree.ElementTree, and name it etree.

Lines 23-35: The XML document.

Lines 36-37: Using the fromstring function of the module, we parse the XML document and pass the result to the ElementTree constructor, which then constructs a tree of the document. Then from the tree we get the root element (or node) so that we can parse the document starting from the root element.

Line 39: We pass the root element to our function, get_attr_number, that calculates the score of the XML document.

Lines 3-8: The get_attr_number function takes the root element or node and recursively iterates through it using node.iter() to get the node itself and all its children, even the nested children. For each element, it calculates the score by finding the length of its attribute dictionary, len(i.attrib), and then adds this score to the total. It then returns the total score as the total variable.

Next is to find the maximum depth. In the XML tree, we take the root element, feed, to be at a depth of 0. Take note.

Lines 41-42: Here the depth function is called, passing it the root element of the tree and a starting level of -1. Then maxdepth, a global variable, is printed out after the depth function has finished execution. I now describe the depth function.

Lines 12-20: When this function is called, it increases the level count by 1 and checks whether the level is greater than the maxdepth variable, updating maxdepth if so. Then for each child of the element, obtained with list(elem), it calls the depth function recursively.

You can download the above code here, xmlparser.py.

Now, I believe you understand how the code works. I want you to be creative. Think of use cases of how you can use this module with other XML functions like creating XML documents, or writing out your own XML documents and parsing them in the manner done above. You can also check out another parser I wrote, this time an HTML parser.

Happy pythoning.
