How To Split A String In Python: Python Split() Method

Very often we have a long string that we want to split into parts. As programmers we run into this situation all the time. Python's string data type has a built-in method, split(), that splits strings conveniently. In this post, I will describe how to use the python string split method, the different ways you can split a string with it, and how you can create a dictionary from a split string.


What is the python string split method?

The python string split method is built into the string class and has the syntax str.split(sep=None, maxsplit=-1), where sep is the separator. The method takes a string, represented here by the name str, splits it at each occurrence of the separator given in the arguments, and returns the pieces as the items of a list. If no separator is specified, it splits on runs of whitespace. The maxsplit argument specifies the maximum number of splits to perform.

We will cover examples for all the scenarios above shortly.

Here is how it works in practice without any argument specified.
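
A minimal sketch of a call with no arguments. The sample sentence is my own choice for illustration, and the variable is named string to match the description that follows.

```python
# Sample sentence chosen for illustration only.
string = "Python makes splitting strings easy"

# No arguments: split on runs of whitespace.
words = string.split()
print(words)  # ['Python', 'makes', 'splitting', 'strings', 'easy']
```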

In the example above, I did not specify any argument, so the method split the string, string, using whitespace as the delimiter and split it at every whitespace it found.

Did you know that you can split a string and then join the pieces back together again? Here is an example where I joined them using the dash character.
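
A small sketch of splitting and re-joining, again with an illustrative sample sentence. Joining with a single space only gives back the original string when the original had single spaces between words, as it does here.

```python
# Sample sentence chosen for illustration only.
string = "Python makes splitting strings easy"
words = string.split()

# Join the pieces with a dash between them.
joined = "-".join(words)
print(joined)  # Python-makes-splitting-strings-easy

# Joining with a single space recovers this particular string.
restored = " ".join(words)
print(restored == string)  # True
```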

Now, let’s discuss each of the arguments to the python string split method.

The separator argument.

As I said above, if the separator is not given, the string is split on whitespace characters. If a separator argument is provided, the string is split at each occurrence of that separator instead.

In the code below, the comma character is the separator.
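
A minimal sketch with a comma separator. The sample data is assumed for illustration.

```python
# Illustrative comma-separated data.
fruits = "apple,banana,cherry"

# Split at every comma.
items = fruits.split(",")
print(items)  # ['apple', 'banana', 'cherry']
```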

Notice that the python string split method cuts the string at each comma character and returns the pieces as items of a new list. The commas themselves are not kept.

In the example below the ‘#’ character is the separator used.
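
Something like the following, where the sample string is my own assumption:

```python
# Illustrative string whose fields are separated by '#'.
tags = "python#split#string"

# Split at every '#'.
parts = tags.split("#")
print(parts)  # ['python', 'split', 'string']
```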

If we have a string specifying an email address, we could use the ‘@’ character as the separator.
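
A short sketch using a made-up address. Unpacking into two names like this assumes the string contains exactly one ‘@’.

```python
# Made-up address for illustration.
email = "user@example.com"

# Splitting on '@' gives the part before and the part after it.
local_part, domain = email.split("@")
print(local_part)  # user
print(domain)      # example.com
```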

This gives you an idea of how the string split method works with a separator. Note that when you split an empty string with a separator specified, the result is a list with the empty string as its only item. With no separator, the result is an empty list.
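
The empty-string behaviour can be checked quickly. The variable names here are only for illustration.

```python
# With an explicit separator, an empty string yields one empty item.
with_separator = "".split(",")
print(with_separator)  # ['']

# With no separator (whitespace splitting), the result is an empty list.
without_separator = "".split()
print(without_separator)  # []
```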

The maxsplit argument.

The maxsplit argument specifies the maximum number of splits to be done on the string.

The default is -1, which means there is no limit to the number of splits. When the maxsplit argument is specified, the resulting list has at most maxsplit plus one items, i.e. if maxsplit is 2, the resulting list has at most 3 items.

Now let’s use examples to show how the maxsplit argument works.
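
A minimal sketch with maxsplit set to 2. The sample string is assumed for illustration.

```python
# Illustrative string with four spaces in it.
string = "one two three four five"

# Split on whitespace at most twice.
limited = string.split(maxsplit=2)
print(limited)       # ['one', 'two', 'three four five']
print(len(limited))  # 3
```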

The code above specifies a maxsplit of 2, i.e. split the string on whitespace at most twice. Notice that the number of items in the resulting list is 3. It splits at the first two whitespaces only and leaves the remaining whitespace untouched.

If maxsplit is not specified, the method uses the default of -1, which means split as many times as possible. Now let’s use the same example without specifying it.
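
The same illustrative string, this time with maxsplit left at its default:

```python
# Same illustrative string as before.
string = "one two three four five"

# maxsplit defaults to -1, so every whitespace is a split point.
unlimited = string.split()
print(unlimited)  # ['one', 'two', 'three', 'four', 'five']
```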

You will notice that the string is now split the maximum number of times, i.e. at every whitespace.

Now that you know how the arguments work, go experiment with them and play with python.

There is one more trick I want to show you: how to build a dictionary using the python string split method.

How to make a dictionary using the python string split method.

Very often we want to use the items in a string to make a dictionary. This is simply done by splitting the string into key and value pairs and passing those pairs to dict(). Consider the example below:
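
A sketch of the idea, assuming a string whose pairs are separated by semi-colons and whose keys and values are separated by equal signs. The sample keys and values are made up.

```python
# Made-up settings string: pairs separated by ';', key and value by '='.
settings = "name=Ada;language=Python;level=beginner"

# First split into 'key=value' pieces.
pairs = settings.split(";")

# Then split each piece on '=' and let dict() consume the two-item lists.
result = dict(pair.split("=") for pair in pairs)
print(result)  # {'name': 'Ada', 'language': 'Python', 'level': 'beginner'}
```

This simple form assumes every piece contains exactly one equal sign. A value that itself contains an equal sign would need pair.split("=", 1) instead.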

If you read the code you will notice that I first split by the semi-colon character, ;, which separates the key-value pairs. Then for each key-value pair in the resulting list, I split by the equal sign, =, and pass the results to dict() to get my dictionary data structure. Just beautiful.

Now, you have the tricks and treats on how to use the python string split method. Go use it with pleasure.

Happy pythoning.

Python Regex For Mobile Phone Number Validity Check

Because there was a huge response from readers to the email validity check and Roman numerals validity check algorithms I wrote using python regex, I decided to write another commonplace validity check: the mobile phone number validity check. Checking mobile phone numbers for validity promises to be easier than the other validity checks.


To understand the code I will be using, I recommend you read the following earlier blog posts that discuss python regex syntax and methods: “How to find a match when you are dating floats”, which is an introduction to python regex couched in story form, and “The Big Advantage of Understanding Python Regex Methods”, which discusses three methods you will always use when looking for matches in python regex. At least, you will use one of the three.

Now that we have the fundamentals out of the way, let’s start coding.

The rule I will be using for valid mobile numbers is this: a valid mobile number is a ten-digit number starting with a 7, 8, or 9. Simply that. I know you can do it on your own after reading the two earlier posts above. I know you can.

But would you like to see my implementation? Here it is below with explanations. The embedded python interpreter would ask you to input a mobile number that would be used for validation.
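
A minimal sketch of such a check under the rule stated above. The function name is_valid_mobile and the sample numbers are my own assumptions for illustration. Rather than prompting for a number, the sketch wraps the check in a function so you can call it with any string, including one read with input().

```python
import re

pattern = re.compile(r"^[789]\d{9}$")  # a 7, 8 or 9, then exactly nine more digits


def is_valid_mobile(number):
    """Return True if number is ten digits long and starts with 7, 8 or 9."""
    return pattern.match(number) is not None


# Sample numbers, made up for illustration.
print(is_valid_mobile("9876543210"))   # True
print(is_valid_mobile("1234567890"))   # False: starts with 1
print(is_valid_mobile("98765"))        # False: too short
print(is_valid_mobile("98765432101"))  # False: eleven digits
```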

The really interesting part I think I should explain is the pattern. Other parts of the code should be clear to you, but if they are not, check the links above. Now for the pattern (see line 3): I first stated that the number starts with either a 7, 8, or 9 using the character set notation for python regex, ^[789]. Then after the starting digit there should be exactly 9 more digits, nothing more, nothing less. This notation nails it: \d{9}$, with the $ signifying that the last digit is the end of the pattern. That’s that.

Note: In mobile numbers there are other rules, like adding a + or a 0 before the number. For the sake of simplicity in this post, and to help you get a hold on the fundamentals, I restricted my pattern to the single rule that it is a ten-digit number starting with 7, 8, or 9. If you want to validate additional rules, then experiment with python regex and send me your code showing what you did. I would be happy to take a look at it. Remember, programming is all about being creative in problem solving.
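
As one possible starting point for that experiment, here is a sketch that allows a single optional + or 0 in front of the ten digits. This prefix rule is only my assumption about what such an extension might look like, not an official numbering rule, and the name prefixed_pattern is hypothetical.

```python
import re

# Assumption: at most one leading '+' or '0' may precede the ten-digit number.
prefixed_pattern = re.compile(r"^[+0]?[789]\d{9}$")

# Sample numbers, made up for illustration.
for candidate in ["9876543210", "+9876543210", "09876543210", "++9876543210"]:
    print(candidate, prefixed_pattern.match(candidate) is not None)
```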

Happy pythoning.
