
Validating Credit Cards With Python Regex

We have been exploring Python regex in previous posts because it is an interesting area of programming. I get an adrenaline surge whenever I get a pattern right. I usually use regex101.com to confirm my patterns.

 


Because I have already covered the syntax of Python regex and the methods it uses in previous posts, I will go straight to describing today’s task.

In today’s coding challenge, we want to validate credit card numbers. For this challenge we treat card numbers as 16 digits long, but what else goes into a valid number? For a credit card number to be valid here, it needs to have the following characteristics:

1. It must start with 4, 5, or 6 (assuming that is the requirement for our bank, MZ; other banks have different requirements, but the pattern can be adapted to them).

2. It must contain exactly 16 digits.

3. It must only consist of digits, i.e., 0-9.

4. It may have digits in groups of 4 separated by a single hyphen, '-'.

5. It must not use any other separator except the hyphen.

6. It must not have 4 or more consecutive repeated digits.

That is a long list of requirements, but two regex patterns are enough to cover all of them. Here is code that does just that. I would like you to read the code and then run it to see the output. Later, I will explain each of the regex patterns and what the code is doing.
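In case the embedded code does not load for you, here is a minimal sketch that is consistent with the walkthrough below. It is not necessarily the exact original listing: the two patterns are the ones quoted in this post, but the function name validate_cards, the variable names patterns and results, and the sample card numbers are my assumptions. I have laid it out so that the patterns sit on lines 4 and 5, the tuple on line 6, and the matching loop on lines 8 to 12.

```python
import re

def validate_cards(cards):  # hypothetical name, not from the original listing
    valid_structure = r"[456]\d{3}(-?\d{4}){3}$"
    no_four_repeats = r"((\d)-?(?!(-?\2){3})){16}"
    patterns = (valid_structure, no_four_repeats)

    results = []
    for card in cards:
        ok = all(re.match(p, card) for p in patterns)  # both patterns must match
        results.append("Valid" if ok else "Invalid")
    return results


# Sample card numbers, assumed here purely for illustration.
cards = [
    "4123456789123456",
    "5123-4567-8912-3456",
    "61234-567-8912-3456",
    "4123356789123456",
    "5133-3367-8912-3456",
    "5123 - 3567 - 8912 - 3456",
]
for card, verdict in zip(cards, validate_cards(cards)):
    print(card, verdict)
```

I chose re.match here because it only looks for a match at the start of the string, which suits a pattern that must begin with 4, 5, or 6.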

Now that you have read it and run it, I hope you understand the code. If not, do not worry; I am here to walk you through it line by line. But first, let me explain the Python regex meta characters that will help you understand the code.

? This is called the optional match. This tells python to match 0 or 1 repetitions of the preceding regular expression.
{m} This tells python to match exactly m repetitions of the preceding regular expression.
[abc] Used to indicate a set of characters. Any character within the set is matched. This example matches either a, b, or c that are within the set.
(...) This matches whatever regular expression is inside the parentheses and indicates the start and end of a group. The contents of the group can be retrieved after the match has been performed and can be matched later in the string with the \number special sequence.
\number Matches the contents of the group with this number. Group numbering starts from 1.
(?!...) Matches if ... doesn’t match next. This is the negative lookahead assertion. For example, Isaac(?!Asimov) will match Isaac only if it is not followed by Asimov.
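If you would like to see each of these meta characters in action before we combine them, here is a small sketch. The test strings and variable names are mine, chosen only for illustration; each list holds True where the string matched the pattern and False where it did not.

```python
import re

# ? : zero or one of the preceding item (here, an optional hyphen)
optional = [bool(re.fullmatch(r"12-?34", s)) for s in ("1234", "12-34", "12--34")]

# {m} : exactly m repetitions of the preceding item
exact = [bool(re.fullmatch(r"\d{4}", s)) for s in ("1234", "123", "12345")]

# [abc] : any single character from the set
in_set = [bool(re.fullmatch(r"[456]", s)) for s in ("4", "5", "7")]

# (...) and \number : capture a digit, then require the same digit again
repeated = [bool(re.fullmatch(r"(\d)\1", s)) for s in ("77", "78")]

# (?!...) : match "Isaac" only when "Asimov" does not come next
lookahead = [bool(re.match(r"Isaac(?!Asimov)", s)) for s in ("IsaacNewton", "IsaacAsimov")]

print(optional)   # [True, True, False]
print(exact)      # [True, False, False]
print(in_set)     # [True, True, False]
print(repeated)   # [True, False]
print(lookahead)  # [True, False]
```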

Now that I have explained all the relevant meta characters you need to understand the code, let’s go through the code, starting first with the patterns for the match.

On lines 4 and 5, you can see that I wrote two patterns we will be using to do the matches.

Line 4, the valid_structure pattern: r"[456]\d{3}(-?\d{4}){3}$". First it indicates a set of characters to start off the pattern, [456]. That means the card number should start with a 4, 5, or 6, as our first requirement demands. This should be followed by exactly 3 digits; \d stands for a digit. After this we have a group which consists of an optional hyphen, -?, and then exactly 4 digits. This group should be matched three times. When we add the digits together, 1 + 3 + (3 × 4), we get 16. The trailing $ anchors the pattern at the end of the string, so nothing may follow the sixteenth digit. So, with the valid_structure pattern, we have satisfied nearly all the requirements, except the requirement that no digit may be repeated 4 or more times in a row.
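To see what valid_structure accepts and rejects on its own, here is a short sketch. The sample numbers and the name structure_ok are my own, and I am assuming the pattern is applied with re.match, which only matches from the start of the string.

```python
import re

valid_structure = r"[456]\d{3}(-?\d{4}){3}$"

samples = [
    "4123456789123456",     # plain 16 digits
    "5123-4567-8912-3456",  # four hyphenated groups of four
    "51234567-8912-3456",   # partly hyphenated
    "3123456789123456",     # starts with 3
    "5123 4567 8912 3456",  # spaces instead of hyphens
    "61234-567-8912-3456",  # groups are not four digits long
    "4444-4444-4444-4444",  # repeated digits
]

# Map each sample to True if the structure pattern matches it.
structure_ok = {s: bool(re.match(valid_structure, s)) for s in samples}
for number, ok in structure_ok.items():
    print(number, ok)
```

Two things stand out when you run it. Each hyphen is optional on its own, so the partly hyphenated number passes. And 4444-4444-4444-4444 passes as well, because this pattern only checks the shape of the number and knows nothing about repeated digits.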

That is where the no_four_repeats pattern comes in, on line 5. The pattern is r"((\d)-?(?!(-?\2){3})){16}". Now let’s go through it. The outermost parentheses form group 1, which wraps everything that gets repeated. Inside it, (\d) captures a single digit as group 2, and -? says that the digit could be followed by a hyphen. Then comes the negative lookahead assertion, (?!(-?\2){3}). Here \2 refers to the digit just captured in group 2, so (-?\2){3} describes three more copies of that same digit, each optionally preceded by a hyphen. The negative lookahead says that this must not be what comes next: if the current digit is followed by three more of itself, with or without hyphens in between, there is no match. Finally, {16} requires group 1 to repeat exactly 16 times, once for each digit of the card number. If you want a refresher on negative lookahead assertions, you can check this blog post on assertions in regular expressions.
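Here is a similar sketch for no_four_repeats on its own. Again the sample numbers and the name repeats_ok are mine, and I am assuming re.match is used.

```python
import re

no_four_repeats = r"((\d)-?(?!(-?\2){3})){16}"

samples = [
    "4123356789123456",     # 3 appears twice in a row
    "4123456789123336",     # 3 appears three times in a row
    "4123456789123333",     # 3 appears four times in a row
    "5133-3367-8912-3456",  # four 3s in a row, split by a hyphen
    "4444123456789123",     # four 4s at the very start
    "1-2-3-4-5-6-7-8-9-0-1-2-3-4-5-6",  # odd grouping, no repeats
]

# Map each sample to True if the repeat check lets it through.
repeats_ok = {s: bool(re.match(no_four_repeats, s)) for s in samples}
for number, ok in repeats_ok.items():
    print(number, ok)
```

The last sample passes even though it starts with 1 and has a hyphen after every digit. This pattern only cares about repeated digits, which is why we need both patterns to match before we call a card number valid.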

The rest of the function following the patterns is self-explanatory. We pack the patterns into a tuple on line 6. From lines 8 to 12 we go through the list of credit cards that was passed in and look for a match against the patterns.

I hope you found the code interesting. It was beautiful matching credit card numbers with regex. I hope to bring you more like this whenever I find a challenge that seems interesting.

If you would like to receive instant updates when I post new blog articles, subscribe to my blog by email.

Happy pythoning.
