Search

Python List And Sequence Comparisons And Sorting Based On Lexicographic Orders

According to the documentation, comparing sequences is done based on lexicographic order. That is just a way of saying that comparisons between sequences are done based on their position in dictionary order if alphabets, and if they are integers, based on their position in the number line. Comparisons could be done using the lesser than operator, <, the greater than, >, operator, or the equal to, ==, operator. It really gets interesting when you are dealing with sequences that have a mix of both alphabets and numbers. These comparisons and many other comparisons are what we will be discussing in this post. We will also show that the python list sort method and python sorted function are based on comparisons.

Colorful drinks sorted like python lists
 

Note that these comparisons are Booleans. That means, they give you True or False when these items are compared.

Let us compare two lists in python and see how the comparison works on sequences. When objects are to be compared, they must be of the same type. If they are of different types, python will return a TypeError.

  1. When the two python sequences are of the same length and type.
  2. The code above compares n and m which are python sequences of numbers. You can see that they both only differ in their last items in the python list. I just used this example to show you that when python compares two sequences of the same type each index is compared to the corresponding index until a mismatch is found, and then based on the lexicographic order, one could be found to be greater than, lesser than, or equal to the other. In the code above, n was lesser than m because index 2 in n, which is 3, is lesser than index 2 in m, which is 4. Indices start from 0.

  3. When the two python sequences contain items of different types
  4. When the two sequences being compared have items of different types, python will return a TypeError. Note the code below.

    When run, the above code returns a TypeError because string and integer types cannot be compared.

  5. When the two sequences are the same length and contain the same items of the same type.
  6. When you run the code, you would realize that they compare equal. What python does is walk through each item and compare them index to index. In this case, all the items compare equal. But what if one list is shorter than the other and all the items compare equal. What does python decide? See the code below.

    When the code above is run, you would see that python takes the shorter of the two sequences as the lesser one when they compare equal, index for index. It now uses the len function to decide on the equality or non-equality of the two sequences.

I have used python lists in these examples, but you can use any sequence like a python string, tuple, or range.

Comparison of user defined objects

Can we take this notion of comparison to user defined objects? Yes, of course. You can provided your user-defined object has the appropriate comparison method. Or in other words, provided it implements the __lt__(), __gt__(), or __eq__() special methods. If that is the case, you are good to go. Here is an example of how comparison could be done on user defined objects.

When you run the code above, you can see that objects of the Length class can compare themselves even though they are not sequences.

This ability to overload native methods and python operators gives a programmer great power. That power comes with enormous responsibility. One of such power is the ability to use the concept of comparisons to carry out sorting. Python has two functions to do that, and they are the built-in python sorted function and the list.sort function that comes with the python list class. These two functions work based on the concept of comparison to sort items in sequences. We would be using the built-in sorted function since it is generic.

The python sorted function

The sorted function creates a new sorted list from any iterable. By default, it sorts based on lexicographic order and in ascending fashion. Take the following code for example.

When you run it, it sorts the list of fruits in dictionary or lexicographic order. The underlying mechanism at work is a comparison of each of the fruit items. That is why you could change the order of the sort. The sorted function has a reverse keyword argument that you can use to do that. By default, reverse is False but you can switch it to True to sort in reverse lexicographic order. Let’s do it.

After running the above, you can see that when I set the reverse argument to True, it sorted the items in the fruits list in reverse order.

There is also another keyword argument that is useful when you have items in a tuple or a nested list and you want to specify which order to sort the items. For example, if we have a list of tuples comprising names and ages, how do we sort the list such that the ages takes more prominence in the sorting order before the names? This is best defined using the key keyword argument in the sorted function. In the code below, I would use a lambda function to specify what the key should be. Lambda functions are anonymous functions. The lambda function will sort or compare the items in the python list based on their ages.

As you can see, ‘David’ who is 20 years old comes first in the list, followed by ‘Rose’ who is 25, then by the two other students, ‘Michael’ and ‘Daniel’ who are both 32. But there is a problem with the sorting. The sorting is not yet complete. If Daniel and Michael are both 32 and compare equal for ages, then naturally we should expect Daniel to come before Michael in the sorted list. That’s right. So, let’s add one more power to our key. This time, we would tell the key argument to first compare by age, and if ages are equal, to compare by names. The code below shows how it is done. The only difference from the above code is that I added x[0] to the statement in the lambda function and that makes it possible because for each item in the list, x[0] is for names while x[1] is for age. To make them compare equal, I then cast the key for age to a string.

Here is the code.

We now have a well sorted list where ‘Daniel’ comes before ‘Michael’.

Let’s take this a bit further and give more power to sort any object, not just custom data structures like sequences. We could extend this power to our custom Length class that we described earlier. Let us be able to sort any sequence that has Length objects.

This is somewhat simple because I have already given Length objects the power to compare themselves. Remember, sorting depends on comparison. So, having this power, we can do sorting on length objects.

The only functions added to the code above for the Length class is the __str__() special method. This gives us the ability to print out the values of the objects, as well as the sorted function.

So, I encourage you to use this power with responsibility. Python gives you lots of power to do all sorts of things with your objects, even to compare and sort to your desire.

No comments:

Post a Comment

Your comments here!

Matched content