Search

Check If Python List Contains Unique Elements Or Items

While going through a Reddit group post on python recently, I saw a question that was similar to the element uniqueness problem and the way the answer was formulated struck me as odd. I thought the implementation was not optimal or efficient, so I volunteered an answer that I knew was better.

python unique elements in list

 

The question was: how do I check to see if my python list contains elements or items that are unique or not duplicates?

And the response code that was given was this:

    
def unique_bad(S):
    ''' S a list or sequence 
    returns True if all the elements are unique
    '''
    for i in range(len(S)):
        for j in range(i+1, len(S)):
            if S[i] == S[j]: 
                return False
    return True

There is nothing wrong with the code because it really does the job, but it does the job badly. How do we know that? Let’s take the worst case scenario for the code. The code has two loops, the outer loop with the i variable and the inner loop with the j variable. Depending on the length of the sequence, the outer loop takes at least n-1 iterations in the worst case. The inner loop takes n(n+1)/2 iterations in the worst case owing from the fact that it does 1+2+3…+n-1+n iterations after the outer loop. So, in the asymptotic sense, the worst case running time for this code is O(n2). This is not really too bad. A quadratic running time for a program is fair enough. But I called it bad because there is a better way to implement this if we know the nature of the problem.

You can see that for each outer iteration, i, the code looks for a duplicate in the inner iteration, j, and this is very repetitive. In the fortunate case where two duplicates lie side by side, the code would exit early but if not, it would take a longer time. So, taking from this idea, why don’t we make it that if there were duplicates they should lie side by side. This could help us to write code that takes a shorter worst case running time.

In fact, we can do so by first sorting the list. When we sort the list, if there were duplicates they would lie side by side and then when we iterate through it looking for duplicates, we would catch duplicates very early.

That is what this better code seeks to do.

    
def unique_better(S):
    ''' S a list or sequence 
    returns True if all the elements are unique
    '''
    new = sorted(S)
    for j in range(1, len(new)):
        if new[j-1] == new[j]:
            return False
    return True        

You can see that I sorted the python list or sequence first and then iterated through the list looking for duplicates. Now the loop runs in linear time rather than quadratic time. If you look through python’s documentation you would discover that the worst case running time of sorted function is O(n long n) while the loop is O(n) time so the code runs in O(n log n) time which is far better and more effective than the earlier code which runs in quadratic time.

So a little understanding of the nature of the problem can lead one to writing better code.

I just wanted to point this out because the question on Reddit intrigued me. It was an example of some of the problems that sorting can easily solve if only the nature of the problem was understood.

Happy pythoning.

Emulating Lists In Your Python Classes With MutableSequence

Python provides for ways we can emulate the native data structures in our own defined classes. In this post, I will show you one way you can emulate a list. A list is a mutable sequence that has items which are ordered by index.

python mutablesequence

 

In my post on operator overloading and dunder methods, I showed you a way a class can behave like a list. But we can extend that concept a little further by implementing a much simpler emulator. This is because in the post I was concentrating on illustrating operator overloading and dunder methods and not on making the emulation simpler.

The recommended way to emulate a list in your python defined classes is to inherit from the MutableSequence class which is found in the collections.abc module. This class will give your classes the power to act like lists with all the methods and functions that you can find in a list. But since this class is an abstract base class (I discussed abstract base classes on the python abstraction post), there are some abstract methods that first need to be implemented by your defined class. These abstract methods are __getitem__(), __setitem__(), __delitem__(), __len__(), and insert. You will notice that these are dunder methods which make your class to have the functionality of a list.

I will use an example to illustrate this process, a Student class, that inherits from MutableSequence. A student object has a name and a list of grades.

So, the constructor for the Student class will go like this:

    
def __init__(self, name):
        self.name = name
        self.grades = []

Then to implement the __getitem__() abstract dunder method from the MutableSequence ABC, we used this code:

    
def __getitem__(self, key):
        ''' retrieves an item by its index, key'''
        return self.grades[key]

It returns the grade or item at the given key or index.

Then to implement the __setitem__() abstract dunder method, we used this code:

    
def __setitem__(self, key, value):
        ''' set the item at index, key, to value '''
        checkvalue(value) 
        self.grades[key] = value

The method is passed the key and value we want to modify and before it does so, it first checks to see whether the value is an int. The checkvalue function is defined within the module. If the value is not an int, checkvalue returns a TypeError but if no error was returned on the check, then __setitem__ proceeds to modify the value at the given key. Here is the checkvalue code:

    
def checkvalue(value):
        if not isinstance(value, int):
            raise TypeError('Pass in an int object')

Now to implement the __delitem__() abstract dunder method, we used this code:

    
def __delitem__(self, key):
        ''' removes the item at index, key '''
        del self.grades[key]

As you can see it simply removes the grade or item at the given key.

Then to implement the __len__() method we used the following code:

    
def __len__(self):
        return len(self.grades)

The statement in the body of the __len__() dunder method just calls on the built-in len method on our grades list and returns the result.

Now finally, we will implement the insert() abstract method to complete the inheritance using this code:

    
def insert(self, key, value):
        ''' add an item, value, at index, key. '''
        checkvalue(value)
        self.grades.insert(key, value)

Using this method you can now insert a grade at any given position in the grades list of the Student class.

Now, our Student class has implemented all the required abstract methods from the MutabeSequence class. That makes it a list emulator. So, with all these done, we can now use the student objects as a list.

Below is a complete code that you can run showing just how it was done. In the driver code, the student object called on the append, reverse, pop and sum functions which are all native list object functions. You can read the driver code and then run it to see how beautifully the student object, michael, now acts like a list with the above mentioned functions.

You can download the code here, student_mutablesequence.py, so that you can try out other list functions to see how it perfectly emulates a list.

Now, you have one more arsenal that shows you the wonderful capability of python to give power to user classes through various means. In fact, you can emulate any container data type in python with the abstract base classes in the collections.abc module. Check it out and experiment with doing so. You will surely find it useful one day.

Happy pythoning.

Matched content