Complete Methods For Python List Copy

After my post on Python shallow and deep copy, a reader asked me: “You can also copy a list with list slicing, and it is fast. Which should I use?”

Well, I decided to dedicate a post to the numerous ways you can copy a Python list, and then time them to find out which is the fastest, in order to give a concise answer.


So, here are the different methods one after the other.

1. The built-in python list.copy method

list.copy is a built-in method of Python lists (it was added in Python 3.3). Because it is built in, I expect it to be fast, like most of what ships with Python. I just love using this method whenever I need to copy a list. But let the timing tell us which method is the most efficient.

An example of how you can use it to copy a list is:
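A minimal sketch of such an example, assuming two lists called names and names2 as in the text (the contents of the list are made up for illustration):

```python
# Hypothetical sample data; only the variable names come from the post.
names = ['Ada', 'Grace', 'Linus']

# list.copy() returns a new list holding the same items.
names2 = names.copy()

# Changing the original does not affect the copy.
names.append('Guido')

print(names)   # ['Ada', 'Grace', 'Linus', 'Guido']
print(names2)  # ['Ada', 'Grace', 'Linus']
```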

As you can see from the code, names2 stayed independent of names after copying, which is the behavior we want. But I need to tell you a caveat: list.copy() makes a shallow copy. It does not recursively copy nested lists, so the inner lists are still shared between the original and the copy. Too bad.
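The shallow copy caveat is easier to see with a nested list. This is only a sketch; the people list and its contents are hypothetical:

```python
# Hypothetical nested list: each inner list is a [name, age] pair.
people = [['Ada', 36], ['Grace', 45]]
people2 = people.copy()

# The outer lists are different objects, so appending to one
# does not change the other...
people.append(['Linus', 28])
print(len(people2))  # 2

# ...but the inner lists are shared, so this change shows up in both.
people[0][1] = 37
print(people2[0])    # ['Ada', 37]
```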

2. Slicing the entire list

Yes, I said it again. When you slice the entire list, you copy every item into a new list object. This method is so cool. The syntax is listname[:]. That’s all you need to do to copy.

Let’s try this with an example.
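A minimal sketch, assuming the same made-up names list as sample data:

```python
names = ['Ada', 'Grace', 'Linus']  # hypothetical sample data

# A slice with no start or stop covers the whole list and returns a new list.
names2 = names[:]

# Replace an item in the original only.
names[0] = 'Guido'

print(names)   # ['Guido', 'Grace', 'Linus']
print(names2)  # ['Ada', 'Grace', 'Linus']
```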

Yes, it is extremely convenient. It worked just as we expected, producing an independent list even after the original was changed. Like the first method, though, slicing only makes a shallow copy. Too bad.

3. Using the built-in list constructor, list()

This is just like creating a list from another list. The syntax is list(originallist). It returns a new object, a list.

Here is an example.
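A minimal sketch, with made-up sample data, that also checks the copy is equal to the original but is a different object:

```python
names = ['Ada', 'Grace', 'Linus']  # hypothetical sample data

# list() builds a new list from the items of any iterable, here another list.
names2 = list(names)

print(names2 == names)  # True: same contents
print(names2 is names)  # False: different objects

# Removing an item from the original leaves the copy untouched.
names.remove('Grace')
print(names2)           # ['Ada', 'Grace', 'Linus']
```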

4. The generic shallow copy method

For the generic shallow copy method, you need to import the copy module: import copy. Then call the module’s copy function on the original list: copy.copy(originallist). I talked all about how to do this in the post on Python shallow copy and deep copy. You can reference it for a refresher.

Here is an example.
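A minimal sketch, assuming the names and names2 lists the text refers to (the contents are made up):

```python
import copy

names = ['Ada', 'Grace', 'Linus']  # hypothetical sample data

# copy.copy() makes a shallow copy of the list.
names2 = copy.copy(names)

# Remove the last item from the original only.
names.pop()

print(names)   # ['Ada', 'Grace']
print(names2)  # ['Ada', 'Grace', 'Linus']
```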

So, just as we expected, the returned list, names2, was independent of the original list, names. But as the name says, it makes a shallow copy. That means it does not copy recursively. Where we have a nested list, it does not copy deep down but returns references to the nested items.

5. The generic deep copy method

This is the last method, and the one I use whenever I have lists in my custom classes and need to copy them. It copies deep down, even to the nested items in a nested list. It is also a function of the copy module: copy.deepcopy. You can read all about it in the post I mentioned above.

Let’s do an example.
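A minimal sketch on a flat list first, with made-up sample data:

```python
import copy

names = ['Ada', 'Grace', 'Linus']  # hypothetical sample data

# copy.deepcopy() copies the list and, recursively, everything inside it.
names2 = copy.deepcopy(names)

# Insert a new item at the front of the original only.
names.insert(0, 'Guido')

print(names)   # ['Guido', 'Ada', 'Grace', 'Linus']
print(names2)  # ['Ada', 'Grace', 'Linus']
```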

I really need to do one more example with this method, to show that it copies deep down even to nested lists.
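A sketch of the nested case, assuming a hypothetical list of [name, age] pairs:

```python
import copy

# Hypothetical nested list: each inner list is a [name, age] pair.
people = [['Ada', 36], ['Grace', 45]]
people2 = copy.deepcopy(people)

# Change a nested item in the original.
people[0][1] = 37

print(people[0])                # ['Ada', 37]
print(people2[0])               # ['Ada', 36]
print(people2[0] is people[0])  # False: the inner lists were copied too
```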

As you can see from the nested list above, when we changed one of the nested items in the original list, the copy did not reflect that change. This shows that it was not copying by reference but copying deep down to the values.

Now that you are familiar with all the different ways to copy a list, which is the most time efficient?

First, I have to tell you that if you have a nested list, or a list that holds other mutable objects such as arrays, and you need a fully independent copy, the only method you can use is the Python deep copy method. It is the only one that copies everything in the nested structure without leaving references to the original items.

For the other types of lists, that is, lists that are not nested, all the methods can be used, so we will now find out which is the most time efficient by timing them.

Which method is more time efficient?

To test it out, you have to run the code below and see for yourself.
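A minimal timing sketch using the standard timeit module. The list size (1,000 integers) and the repetition count (200) are arbitrary assumptions, and the actual numbers will differ from machine to machine:

```python
import copy
import timeit

# Hypothetical test data: a flat list of 1,000 integers.
data = list(range(1000))

# Each candidate is a zero-argument callable so timeit can call it directly.
candidates = {
    'list.copy()': data.copy,
    'slicing [:]': lambda: data[:],
    'list()': lambda: list(data),
    'copy.copy()': lambda: copy.copy(data),
    'copy.deepcopy()': lambda: copy.deepcopy(data),
}

# Keep the repetition count low because deepcopy is much slower than the rest.
timings = {
    label: timeit.timeit(func, number=200)
    for label, func in candidates.items()
}

# Print the methods from fastest to slowest.
for label, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f'{label:<16} {seconds:.6f} s')
```

For steadier numbers you can raise number or take the minimum of several runs with timeit.repeat.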

When I ran it, the built-in list.copy method came out slightly faster than the other ways of copying a list, with list slicing a close second. Your numbers will vary with the Python version, the machine, and the size of the list. That’s why I love using any function or method that is built in specifically for a data type or data structure. I would not want to use list slicing on a very large list, though.

That’s it. I hope you enjoyed this post. Watch out for other interesting posts. Just subscribe with your email and they will land right in your inbox.

Happy pythoning.

Visualizing ‘Regression To The Mean’ In Python

Let’s take a philosophical bent to our programming and consider something related to research. I decided to consider regression to the mean because I have found that topic fascinating.


What is regression to the mean?

Regression to the mean, sometimes called reversion towards the mean, is the phenomenon in which, if one sample of a random variable is extreme or close to an outlier, the next sample is likely to be closer to the mean or average. Note that the variable being measured has to involve randomness for this effect to play out and to be significant.

Sir Francis Galton first described this phenomenon while studying hereditary stature in his 1886 paper, “Regression towards mediocrity in hereditary stature.” He observed that parents who were taller than the community average tended to have children who were shorter than they were, that is, closer to the community average height.

Since then, this phenomenon has been described in other fields of life where randomness or luck is also a factor.

For example, if a business has a highly profitable quarter, it is likely not to do as well in the next quarter. If one medical trial suggests that a particular drug or treatment is outperforming all other treatments for a condition, then in a second trial it is more likely that the drug or treatment will perform closer to the mean.

But regression to the mean should not be confused with the gambler’s fallacy. The gambler’s fallacy is the belief that if an event has occurred more often than normal in the past, it is less likely to happen in the future, even when the events are known to be independent, i.e. the past does not determine the future.

I was thinking about regression to the mean while coding some challenge that involved tossing heads and tails along with calculating their probability, so I decided to add a post on this phenomenon.

This is the gist of what we are looking for in the code. Suppose we flip a coin a set number of times and record the fraction of flips that came up heads. We repeat this for several trials. Then we look for the trials whose fraction was extreme and check whether the trial right after each extreme regressed towards the mean. Note that the mean fraction of heads is 0.5 because the probability that a fair coin will come up heads is ½ and the probability that it will come up tails is also ½.

So after collecting the extremes, each along with the trial that comes after it, we will want to see if the trials were regressing towards the mean or not. We do this visually by plotting a graph of the extremes and the trials after the extremes.
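Before any plotting, the idea can be checked with a short sketch that only counts. This is not the plotting code itself: extremes_and_next is a hypothetical helper, the 0.33 and 0.66 cut-offs match the ones used in regress_to_mean further down, and the seed, 15 flips per trial and 1,000 trials are arbitrary assumptions.

```python
import random

def flip(num_flips):
    """Return the fraction of heads in num_flips fair coin tosses."""
    heads = sum(random.choice('HT') == 'H' for _ in range(num_flips))
    return heads / num_flips

def extremes_and_next(num_flips, num_trials, low=0.33, high=0.66):
    """Pair each extreme trial with the trial that follows it."""
    frac_heads = [flip(num_flips) for _ in range(num_trials)]
    return [
        (frac_heads[i], frac_heads[i + 1])
        for i in range(len(frac_heads) - 1)
        if frac_heads[i] < low or frac_heads[i] > high
    ]

random.seed(0)  # arbitrary seed so the run is repeatable
pairs = extremes_and_next(num_flips=15, num_trials=1000)

# Count how often the next trial lands closer to 0.5 than the extreme did.
closer = sum(abs(nxt - 0.5) < abs(ext - 0.5) for ext, nxt in pairs)
print(f'{closer} of {len(pairs)} next trials moved towards the mean')
```

If regression to the mean holds, the first number printed should be well over half of the second.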

The complete code is made up of the two functions that I walk through below. I will first explain the graph the code produces and then provide a detailed explanation of the code.

When you run the code, you will get a graph that looks like the one below.

[Figure: extremes and the trials that follow them, plotted around the 0.5 mean line]


We drew a line across the 0.5 mark on the y-axis to show the mean. From the graph you will see that on several occasions, when an extreme lies above or below the mean line, the next trial produces a fraction that moved towards the mean line, except for one occasion when it did not. So, what is happening here? Because the coin flip is a random event, an extreme result is mostly down to luck, and the next trial is unlikely to be as lucky, so it tends to land closer to the mean.

Now, let me explain the code I used to draw the visuals. There are two functions here, one that acts as the coin flip function and the other to collect the extremes and subsequent trials.

First, the code for the coin flip.

    
import random

def flip(num_flips):
    ''' assumes num_flips a positive int '''
    heads = 0
    for _ in range(num_flips):
        if random.choice(('H', 'T')) == 'H':
            heads += 1
    return heads/num_flips

The function flip takes as its argument the number of times the coin should be tossed. Each toss is made at random, and the function checks whether the outcome was a head or a tail. If it is a head, it adds one to the heads variable. Finally, it returns the fraction of flips that came up heads.

Then the next function, regress_to_mean.

    
import matplotlib.pyplot as plt

def regress_to_mean(num_flips, num_trials):
    # get fractions of heads for each trial of num_flips
    frac_heads = []
    for _ in range(num_trials):
        frac_heads.append(flip(num_flips))
    # find trials with extreme results and for each 
    # store it and the next trial
    extremes, next_trial = [], []
    for i in range(len(frac_heads) - 1):
        if frac_heads[i] < 0.33 or frac_heads[i] > 0.66:
            extremes.append(frac_heads[i])
            next_trial.append(frac_heads[i+1])
    # plot results 
    plt.plot(extremes, 'ko', label = 'Extremes')
    plt.plot(next_trial, 'k^', label = 'Next Trial')
    plt.axhline(0.5)
    plt.ylim(0,1)
    plt.xlim(-1, len(extremes) + 1)
    plt.xlabel('Extremes example and next trial')
    plt.ylabel('Fraction Heads')
    plt.title('Regression to the mean')
    plt.legend(loc='best')
    plt.savefig('regressmean.png')
    plt.show()

This function is the heart of the code. It flips the coin a set number of times for a set number of trials, storing the fraction of heads for each trial in a list. Then it finds out which of the fractions is an extreme or outlier, that is, below 0.33 or above 0.66. When it gets an outlier, it adds it to the extremes list and adds the next trial to the next_trial list. Finally, we use matplotlib to draw the visuals: a plot of the extremes and next_trial values with a horizontal line at the mean, so the viewer can see in which direction the next trial is expected to move after an extreme.

I hope you enjoyed the code. You can run it on your machine or download it to study: regress_to_mean.py.

Thanks for your time. I hope you do leave a comment.

Happy pythoning.
