
Utilizing Python reduce and accumulate functions as accumulators

Accumulators have a notable reputation in computing history. The earliest calculating machines, built by Gottfried Leibniz and Blaise Pascal, were based on the concept of accumulators. If you are familiar with Python's built-in functions, you will know that the sum function acts as an accumulator for addition. In this post I would like to explain two functions that you can use as accumulators for any operation: the reduce function from the functools module and the accumulate function from the itertools module.


Both functions take a function and an iterable as arguments, and optionally an initial value. They apply the function to the first two items of the iterable, store the result, then apply the function to that stored result and the next item, and so on until the last item has been processed. They differ in what they hand back, which I will explain.
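To make that loop concrete, here is a rough sketch of what the two functions do internally. The names simple_reduce, simple_accumulate and _MISSING are made up for this illustration and are not part of the standard library. The real functions are more careful about edge cases.

```python
_MISSING = object()  # hypothetical marker meaning "no initial value was given"


def simple_reduce(function, iterable, initializer=_MISSING):
    """Return only the final accumulated value.

    Sketch only: assumes the iterable is non-empty when no initializer is given.
    """
    items = iter(iterable)
    # Start from the initializer if there is one, otherwise from the first item.
    result = next(items) if initializer is _MISSING else initializer
    for item in items:
        result = function(result, item)  # stored result + next item
    return result


def simple_accumulate(function, iterable, initializer=_MISSING):
    """Yield every stored result along the way, not just the last one."""
    items = iter(iterable)
    if initializer is _MISSING:
        try:
            result = next(items)
        except StopIteration:
            return  # nothing to accumulate
    else:
        result = initializer
    yield result
    for item in items:
        result = function(result, item)
        yield result


print(simple_reduce(lambda x, y: x + y, [1, 2, 3]))            # 6
print(list(simple_accumulate(lambda x, y: x + y, [1, 2, 3])))  # [1, 3, 6]
```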

First, I will start with the python reduce function.

The python reduce function.

The syntax of the Python reduce function is functools.reduce(function, iterable[, initializer]). It applies the function cumulatively to the items of the iterable from left to right, and eventually reduces the iterable to a single value, which it returns. The function you pass in must accept two arguments: the accumulated value so far and the next item. The initializer parameter is optional. I will explain it below.

Let’s take the simplest accumulator, addition, written as a lambda function, and use it to illustrate how the reduce function works.
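A minimal version of that example could look like this. The variable names numbers and total are my own choice for the sketch.

```python
from functools import reduce

numbers = [1, 2, 3]

# x holds the running total, y is the next item from the list.
total = reduce(lambda x, y: x + y, numbers)

print(total)  # 6
```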

Here the reduce function uses the lambda function to sum the items of the iterable, in this case a list. First, it takes 1, the first item, and binds it to x, then takes 2 and binds it to y. It adds x + y, i.e. 1 + 2, and binds the result, 3, to x. Next it takes 3, the last item, binds it to y, and adds x + y, which this time is 3 + 3 = 6. Since there are no more items, it returns 6 as the final result. So, you can now visualize how the successive addition is carried out.

You can click the following links if you want a refresher on python lambda functions or on python iterables.

As you can see from the syntax above, you can optionally supply an initializer to the reduce function. The initializer is placed before the items of the iterable in the calculation, so it is the first value bound to x. And if the iterable is empty, the initializer is returned as the default.

Let’s use an example with an initializer and see how it runs. This time we want our initializer to be 4.
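Here is a small sketch of that call, along with the empty-iterable case. The variable names are illustrative only.

```python
from functools import reduce

numbers = [1, 2, 3]

# The third argument, 4, is the initializer: it becomes the first value of x.
total_with_start = reduce(lambda x, y: x + y, numbers, 4)
print(total_with_start)  # 10

# With an empty iterable, the initializer is simply returned.
empty_total = reduce(lambda x, y: x + y, [], 4)
print(empty_total)  # 4
```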

You can see from the example above that the result of summing the list becomes 10. This is because we used an initializer of 4. When reduce runs, it first binds the initializer to x, so x becomes 4. Then it binds y to 1, the first item in the list, sums them to give 5 and binds this result to x. It then binds y to 2, the next item, sums x and y to give 7 and binds this result to x. Finally, it binds y to 3, the last item in the list, sums x and y to give 10, and returns 10 as the final result.

Simple, isn’t it? Very easy and fascinating. But don’t be in a hurry. It gets more fascinating when you realize that you can carry out just about any operation you want. I used addition to get you acquainted with the idea. Any program that needs to accumulate successive results can use the reduce function.
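As a quick illustration that the operation need not be addition, here are two other accumulations. These examples are my own and only meant as a sketch: one multiplies the items together, the other keeps the longest word seen so far.

```python
from functools import reduce
import operator

# Multiply the items together: ((1 * 2) * 3) * 4
product = reduce(operator.mul, [1, 2, 3, 4])
print(product)  # 24

# Keep whichever word is longer at each step.
words = ["reduce", "sum", "accumulate"]
longest = reduce(lambda a, b: a if len(a) >= len(b) else b, words)
print(longest)  # accumulate
```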

Let’s take an interesting amortization example. If I owe $1000 and I pay off $100 annually at an interest rate of 5%, how much would I be owing at the end of four years? Reduce can help you get the result quickly. Let’s see how.
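One way to write this is sketched below. It assumes that each year the 5% interest is added to the outstanding balance first and the $100 payment is subtracted afterwards. The names INTEREST_RATE, payments and balance are chosen for the illustration.

```python
from functools import reduce

INTEREST_RATE = 0.05            # 5% per year (assumed to apply before each payment)
payments = [100, 100, 100, 100]  # one payment per year for four years

# owed is the running balance, payment is the next yearly payment.
balance = reduce(
    lambda owed, payment: owed * (1 + INTEREST_RATE) - payment,
    payments,
    1000,  # initializer: the opening debt
)

print(f"{balance:.2f}")  # 784.49
```

Under that assumption the balance goes 1000, 950, 897.50, 842.375 and finally about 784.49.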

From the code above, you can see that I used an initializer of 1000 and the iterable was a list with the regular payments as the items.

Now the question comes: since accumulators store successive results before giving the final result, can we get at those intermediate results? Yes, we can. That is where the Python accumulate function from the itertools module comes in.

The python accumulate function.

In fact, you can say that the Python reduce and accumulate functions are cousins, with one difference: accumulate lets you get the result of each successive operation instead of having to wait for the final result. In this respect it acts like a generator.

The syntax of the Python accumulate function is: itertools.accumulate(iterable[, func, *, initial=None]). As you can see from the syntax, the accumulate function takes an iterable and returns an iterator that yields the accumulated result after each element. That is what gives it the behavior of a generator. To get refreshers on these two concepts, you can check out this post on iterators, and also this post on generators. Just like for the reduce function, the function passed to accumulate should accept two arguments and operate on them. If you leave func out, accumulate defaults to addition. The accumulate function also takes an optional initializer, the keyword-only initial argument, which is available from Python 3.8.
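A small sketch of the syntax in action, using addition so it can be compared with the reduce examples. The variable names are mine, and the initial argument needs Python 3.8 or later.

```python
from itertools import accumulate
import operator

# Each yielded value is the running total so far.
running = list(accumulate([1, 2, 3], operator.add))
print(running)  # [1, 3, 6]

# With initial=4 the output starts at 4 and has one extra element.
running_from_4 = list(accumulate([1, 2, 3], operator.add, initial=4))
print(running_from_4)  # [4, 5, 7, 10]
```

Notice that the last element of each list is exactly what reduce would have returned.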

So, using the accumulate function, let’s do our amortization again but this time returning the results of successive accumulation instead of waiting for the final or total accumulation.
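A sketch of how this could be written follows. As before, it assumes the 5% interest is added to the balance before each payment is taken off. Here the opening debt is the first item of the list and the payments are negative numbers.

```python
from itertools import accumulate

# Opening debt first, then each $100 payment as a negative cash flow.
cashflow = [1000, -100, -100, -100, -100]
amount_owing_list = list(accumulate(cashflow, lambda owed, paid: owed * 1.05 + paid))

for i in range(1, len(amount_owing_list)):  # skip index 0, the opening debt
    print(f"Balance at end of year {i}: ${amount_owing_list[i]:.2f}")
```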

If you read the code above, you will notice that I converted the iterator returned by the accumulate function to a list so that I can print out each of the results. Also, when no initial value is given, the first value accumulate yields is the first item of the cashflow list unchanged, so while iterating over the amount owing list I skipped this first item. Apart from those two points, the results agree with what the reduce function gave, with one difference: this time we can see the balance due at the end of each year rather than only at the end of the fourth year.

You may have noticed that when the yearly balances were printed, each amount was shown to two decimal places. I did that with a nice Python string formatting syntax, {amount_owing_list[i]:.2f}, on line 8. To learn how, you can read the earlier posts on Python string formatting Part 1 and Python string formatting Part 2, and you will be able to do it yourself.

So, that’s it. You can see that python as a language has powerful capabilities. Go experiment with it. Have fun with python.

See you at the next post. If you want to receive new post updates, just subscribe with your email. Happy pythoning.

Using Python Regex To Validate Roman Numerals

Python regex, also called Python regular expressions, are expressions that describe a specific pattern to match in a string. Regular expressions are widely used in the UNIX world and are provided by many programming languages. Python is not left out. One advantage of using Python regex is that a single pattern can validate an entire input, which is what we will be doing in this post. It also keeps your code cleaner because it usually takes fewer lines, and it saves you the stress of writing numerous if else statements.


If you want a guide to regular expressions in Python and some of the functions that come with Python regex, I encourage you to read this post, which describes the basic syntax, and then this other post on the method we will be using, the Python re match method.

In today’s post, we are going to show how to use Python regex to validate Roman numerals based on their rules.

Roman Numerals and Their Rules

Roman numerals are a numeral system that originated in ancient Rome. They remained the usual way of writing numbers in Europe down to the late Middle Ages. The system uses letters of the Latin alphabet to represent numbers, and these letters are combined according to set rules. In the modern usage of Roman numerals, seven letters are used to designate numbers and they are:

Symbol Value
I 1
V 5
X 10
L 50
C 100
D 500
M 1000

Some of the rules for writing valid Roman numerals which we will be using for validation are:

  1. The Roman numerals I, X and C can be repeated up to 3 times in succession to form numbers, but repeating V, L, or D is invalid.
  2. To form numbers, a digit of lower value can be placed before or after a digit of higher value. The only digits of lower value that can be placed before a higher one are I, X, and C.
  3. When a digit of lower value is placed after (to the right of) a digit of higher value, add the values together. Digits of equal value placed together are also added.
  4. When a digit of lower value is placed before (to the left of) a digit of higher value, subtract the lower value from the higher value. Note that V is never written to the left of X.
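To see rules 3 and 4 at work, here is a small sketch that turns a Roman numeral into a number by adding and subtracting as described. The names VALUES and roman_to_int are my own, and the function assumes its input is already a valid Roman numeral. It does no validation itself.

```python
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Add or subtract each symbol depending on the symbol to its right."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = VALUES[symbol]
        # Value of the symbol to the right, or 0 if this is the last symbol.
        next_value = VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        if value < next_value:
            total -= value  # rule 4: lower value before higher is subtracted
        else:
            total += value  # rule 3: otherwise the value is added
    return total


print(roman_to_int("VIII"))     # 8
print(roman_to_int("IV"))       # 4
print(roman_to_int("MCMXCIX"))  # 1999
```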

So, now that we have the rules we need to form the python regular expressions, let’s do the Roman numerals validation which is the juicy part.

Validating any Roman numeral

When you run the code below, you need to input a string as a Roman numeral when you are prompted. You will get a result indicating whether the string is a valid Roman numeral or not. If it is an invalid Roman numeral, you will get a message that says: “Invalid Roman Numeral” but if it is valid, you will get a message that says: “Your roman numeral was valid. Welcome.”

Now, let’s run it and have fun. After you have tried running it, I will give a brief explanation of the lines of code. Note that this code takes only 8 lines. If I had relied on chains of if else statements instead, it would have taken many more lines and been far less clean.
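A minimal sketch of such a validator is shown here. It is a little longer than the listing described above because it wraps the check in a function, is_valid_roman (a name I made up), and loops over a few sample strings instead of prompting with input(), so that it can run unattended. The pattern and the two messages are the ones from this post.

```python
import re

regex_pattern = r"^(?=[MDCLXVI])M*(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})$"


def is_valid_roman(text):
    """Return True if text matches the Roman numeral pattern."""
    return re.match(regex_pattern, text) is not None


# Sample inputs stand in for what a user might type at the prompt.
for candidate in ["MCMXCIX", "XLII", "IIII", "VX", "abc"]:
    if is_valid_roman(candidate):
        print(f"{candidate}: Your roman numeral was valid. Welcome.")
    else:
        print(f"{candidate}: Invalid Roman Numeral")
```

To make it interactive, you could replace the loop with a single call such as is_valid_roman(input("Enter a Roman numeral: ")).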

Now that you have taken some time running the above code and seeing how it works, let me explain some of the parts. I don’t think I need to explain the Python re match method because you have read about it from the link I gave above. So, I will just explain the pattern.

The key to the pattern matching above is the python regex pattern which is denoted as:

regex_pattern = r"^(?=[MDCLXVI])M*(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})$"

The ^ symbol that starts the pattern says that matching must start at the beginning of the string, while the corresponding $ symbol at the end says that it must finish at the end of the string. So, each string passed to the code must be a single Roman numeral and nothing else, otherwise it will be reported as invalid. Right after the ^ symbol is a lookahead assertion, (?=[MDCLXVI]). Read up this blog post on Python lookahead assertions if you want a refresher.

What the lookahead assertion does is check, from the beginning of the string and without moving forward, that the next character is an M, D, C, L, X, V or I. So the string must start with one of the seven Roman numeral symbols, and in particular it cannot be empty. That matters because every other part of the pattern is optional, so without the lookahead an empty string would pass as valid. Note that a lookahead assertion does not consume any characters. So, at this point, we have not yet matched any part of the string.
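You can check the effect of the lookahead yourself with a sketch like this one, which compares the pattern with and without it on an empty string. The variable names are illustrative.

```python
import re

# Everything after the lookahead; every group here can match nothing at all.
body = r"M*(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})$"

with_lookahead = r"^(?=[MDCLXVI])" + body
without_lookahead = r"^" + body

print(bool(re.match(without_lookahead, "")))  # True: empty string slips through
print(bool(re.match(with_lookahead, "")))     # False: lookahead needs one symbol
print(bool(re.match(with_lookahead, "XIV")))  # True
```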

The next part of the pattern matches the thousands place. I denote this with the pattern M*, which matches M zero or more times. If the number is less than 1000, there are no Ms; otherwise there is one M for each thousand. Unfortunately, this pattern is not meaningful beyond 3999, because from 4000 upward Roman numerals need a special notation for thousands that the pattern does not cover. As written, M* would still accept a string such as MMMM. But you can try 1999 (MCMXCIX) and see that it matches. Because of this limitation, we could replace M* with M{0,3} so that the pattern rejects anything beyond 3999.
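Here is a short sketch comparing the two choices for the thousands place. The names original_pattern and strict_pattern are mine.

```python
import re

rest = r"(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})$"

original_pattern = r"^(?=[MDCLXVI])M*" + rest     # any number of Ms
strict_pattern = r"^(?=[MDCLXVI])M{0,3}" + rest   # at most three Ms

# 3999 is accepted by both patterns.
print(bool(re.match(original_pattern, "MMMCMXCIX")))  # True
print(bool(re.match(strict_pattern, "MMMCMXCIX")))    # True

# Four Ms only get past the original pattern.
print(bool(re.match(original_pattern, "MMMM")))  # True
print(bool(re.match(strict_pattern, "MMMM")))    # False
```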

The next part matches the hundreds place, from 100 to 900. I denote it with the pattern (C[MD]|D?C{0,3}). This pattern says that the hundreds place is either a C (100) to the left of an M (1000) or a D (500), giving 900 or 400, or an optional D (500) followed by no more than three consecutive Cs.

The next is the tens place, which runs from 10 to 90. The pattern for it is: (X[CL]|L?X{0,3}). This says that the tens place is either an X (10) before a C (100) or an L (50), giving 90 or 40, or an optional L (50) followed by no more than three consecutive Xs.

The last is the units place, which is between 1 and 9. Remember there is no 0 in Roman numerals. The pattern for it is: (I[XV]|V?I{0,3}). This says that the units place is either an I (1) to the left of an X (10) or a V (5), giving 9 or 4, or an optional V (5) followed by no more than three consecutive Is.
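If you want to convince yourself about each place on its own, you can test the three groups separately. This sketch uses re.fullmatch so that each group has to account for the whole test string. The helper name fits is made up for the example.

```python
import re

hundreds = r"(C[MD]|D?C{0,3})"
tens = r"(X[CL]|L?X{0,3})"
units = r"(I[XV]|V?I{0,3})"


def fits(group, text):
    """True if the whole text is matched by the given group."""
    return re.fullmatch(group, text) is not None


print(fits(hundreds, "CD"), fits(hundreds, "DCCC"), fits(hundreds, "CCCC"))
# True True False
print(fits(tens, "XC"), fits(tens, "LXXX"), fits(tens, "XXXX"))
# True True False
print(fits(units, "IX"), fits(units, "VIII"), fits(units, "IIII"), fits(units, "VX"))
# True True False False
```

Each group also matches an empty string, which is how a number like 1005 (MV) can skip the hundreds and tens places entirely.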

Well, that is it. Enjoy validating your Roman numerals with this simple tool.

I hope you do leave a comment about your results.

Happy pythoning.
