Deterministic processes are what every programmer is familiar with when starting out in their journey. In fact, most beginner books on programmer will teach you deterministic processes. These are processes where for an input, you always get the same output. But when you get into industry, you find out that most times stochastic processes are the norm when finding solutions to problems. Stochastic processes give different results for the same input.
In this post, I will be simulating a stochastic process, a drunkard’s walk, which is an example of a random walk.
Python random walks are interesting simulation models for the following reasons:
- They are widely used in industry and interesting to study.
- They show us how simulation works in practice and can be used to demonstrate how to structure abstract data types.
- They usually involve producing plots which are interesting and as they say, a picture is worth a thousand words.
So, let’s go to the simulation exercise. It is interesting to find out how much distance a drunk would have made from his starting position if he takes a number of steps within a given space of time. Would he have moved farther after that time, would he still be close to the origin, or where would the drunk be? Such questions can only be simulated for us to get a general idea of the drunk’s position. We’ll imagine that for each movement, the drunk can take one step either in the north, south, east, or west direction. That means he has four choices to choose from for each step.
To model the drunk’s walk after some time, we will be using three classes representing objects that define his position relative to the origin: Location, Field, and Drunk classes.
The Location class defines his location relative to the origin. We could write code for the class this way:
class Location(object):
def __init__(self, x, y):
''' x and y are numbers '''
self.x, self.y = x, y
def move(self, delta_x, delta_y):
''' delta_x and delta_y are numbers '''
return Location(self.x + delta_x, self.y + delta_y)
def get_x(self):
return self.x
def get_y(self):
return self.y
def dist_from(self, other):
ox, oy = other.x, other.y
x_dist, y_dist = self.x - ox, self.y - oy
return (x_dist**2 + y_dist**2)**0.5
def __str__(self):
return '<' + str(self.x) + ', ' + str(self.y) + '>'
Each location has an x and y coordinate representing the x and y-axis. When the drunk moves and changes his location, we could return a new Location object to signify this. Also using the location class, we can calculate the distance of the drunk from another location, and most possibly the origin.
The second class we need to define is the Field class. This class will allow us to add multiple drunks to the same location. It is a mapping of drunks to their locations. Code could be written this way for it:
class Field(object):
def __init__(self):
self.drunks = {}
def add_drunk(self, drunk, loc):
if drunk in self.drunks:
raise ValueError('Duplicate drunk')
else:
self.drunks[drunk] = loc
def move_drunk(self, drunk):
if drunk not in self.drunks:
raise ValueError('Drunk not in field')
x_dist, y_dist = drunk.take_step()
current_location = self.drunks[drunk]
# use move method of Location to get new location
self.drunks[drunk] =
current_location.move(x_dist, y_dist)
def get_loc(self, drunk):
if drunk not in self.drunks:
raise ValueError('Drunk not in field')
return self.drunks[drunk]
As you can see, the Field class is a mapping of drunks to locations. When we move a drunk, his location reflects this move and we take note of the current location. Also, we can use this class to find out the location of any drunk.
The last class of interest is the Drunk class. The Drunk class embodies all the drunks we will be playing with. It is a common class or parent class as all other drunks will inherit from this class.
import random
class Drunk(object):
def __init__(self, name=None):
''' Assumes name is a string '''
self.name = name
def __str__(self):
if self != None:
return self.name
return 'Anonymous'
What the Drunk class does is give identity to each drunk object or subclass.
Now, we will create a drunk with our expected way of movement: that is take one step each time in the north, south, east, or west direction. We will call this drunk class, UsualDrunk. Here is the definition of the class.
class UsualDrunk(Drunk):
def take_step(self):
step_choices = [(0,1), (0, -1), (1,0), (-1,0)]
return random.choice(step_choices)
The UsualDrunk class inherits from the Drunk class and the only method it defines is the random step it can take. From the take_step method you can see that it can only move one step to the east, west, north or south, and this in a randomized fashion.
So, now that we have our classes let us try to answer the question – where will the drunk be after taking a series of walks in a random fashion? Like taking 10 walks, or 100, or 1000? Normally, we would expect that when the number of walks increases, the distance from the origin should increase. But this might not be the case because you know how drunks walk – haphazardly. Some drunks can even retrace their steps back to where they started and go nowhere!
So, for our simulation, we will write code that makes use of these classes and run the code on the drunk taking a number of steps with different trials for each step. We are using different trials in order to balance out the randomized walk and get a mean of distances.
Here is the code:
When you run it there is one fact that stands out: The mean distance from the origin increases as the number of steps increases. That is the hypothesis we started with.
Some pertinent new driver code are the following:
def walk(f, d, num_steps):
'''Assumes: f a field, d a drunk in f,
and num_steps an int >= 0.
Moves d num_steps times; returns the distance between
the final location and the location at the start
of the walk.'''
start = f.get_loc(d)
for _ in range(num_steps):
f.move_drunk(d)
return start.dist_from(f.get_loc(d))
The walk function returns the distance from the final location for a single trial based on the drunk taking a number of steps that is defined.
def sim_walks(num_steps, num_trials, d_class):
'''Assumes num_steps an int >= 0, num_trials an int > 0,
d_class a subclass of Drunk.
Simulates num_trials walks of num_steps steps each.
Returns a list of the final distance for each trial'''
homer = d_class()
origin = Location(0,0)
distance = []
for _ in range(num_trials):
f = Field()
f.add_drunk(homer, origin)
distance.append(round(walk(f, homer, num_steps), 1))
return distance
The sim_walks function (simulated walks) is different from the walk function only in one aspect: it relates to all the different trials that are used for a specific step. Say for a 10 steps walk we did 100 trails so as to get the mean. So sim_walk returns a list of the distances for the trials. This is so that we can take the mean distance for each number of steps since we are randomizing the walk.
And finally, the drunk_test function.
def drunk_test(walk_lengths, num_trials, d_class):
'''Assumes walk_lengths a sequence of ints >= 0
num_trials an int > 0, d_class a subclass of Drunk
for each number of steps in walk_lengths, runs sim_walk
with num_trials walks and prints results '''
for num_steps in walk_lengths:
distances = sim_walks(num_steps, num_trials, d_class)
print(d_class.__name__, 'random walk of ', num_steps, 'steps')
print('Mean:', round(sum(distances)/len(distances), 4))
print('Max:', max(distances), 'Min:', min(distances))
This serves as the test of our code. It prints out the mean for each number of steps after doing the various trials and then the max and min for those trials in a specific step.
You could download the above code here, random_walk.py.
But a picture is worth a thousand words. Let us use a plotted graph to illustrate the variation in the number of steps to the distance from the origin.
You can see from the graph above that when the drunk is taking ten steps for each of the 100 trials, the distance he moves is closer to the origin than when he takes 100 or 1000 steps. But the drunk seems more determined to walk farther away if he is given the opportunity to take several steps. Drunks really mean to get home it seems! A graph of number of steps for each trial to mean distances shows that this is truly the case: the more opportunity he is given to take higher steps, the closer he gets to home and away from where he started with. The graph below shows that information.
The scales in the graph have been extrapolated to logarithmic scales to clearly show the straight line relationship between number of steps and mean distance from the starting point. To see how the code for the plotted graphs were written you can download it here, random_walk_mpl.py.
Now, our simulation has dwelt on a drunk walking the way we expect: for each step one unit towards the east, west, north, or south.
What if we could make the drunkard’s walk somewhat biased by skewing it a little. That would involve creating different drunks with different steps and comparing them to our usual drunk.
A biased random walk simulation.
Let’s imagine a drunkard who hates the cold and moves twice as fast in a southward direction. We could make him a subclass of Drunk class and change his way of movement in the class, calling him ColdDrunk.
This could be his class definition:
class ColdDrunk(Drunk):
def take_step(self):
step_choices = [(0.0, 1.0), (0.0, -2.0),
(1.0, 0.0), (-1.0, 0.0)]
return random.choice(step_choices)
You can see that whenever he moves southwards, y axis, he takes two times a unit step.
Now let’s also add another hypothetical drunk that moves only in the east-west direction. He really moves with the sun or is phototrophic. We could define his class, EWDrunk, in the following way:
class EWDrunk(Drunk):
def take_step(self):
step_choices = [(1.0, 0.0), (-1.0, 0.0)]
return random.choice(step_choices)
So, we have all our drunks ready. Now let’s write code that will run them and compare their mean distances for various number of steps.
If you run the code above you will get a plotted graph that shows number of steps against mean distance from the origin for the three drunks. You will get a graph that looks like the following:
You will notice that for both the UsualDrunk, who we highlighted earlier, and the phototrophic drunk, EWDrunk, their variation in mean distance as the number of steps increases is not much compared to the South loving or north hating drunk, ColdDrunk. That means the ColdDrunk, or north hating drunk, is moving faster than all other drunks. This is not surprising based on the fact that whenever he moves south, he moves twice as fast. That means randomly the drunk’s movement is more favorable than the other two.
We could extrapolate on this conjecture and build a scatter plot of the location of each drunk’s movement for each step but I think the point has already been made: simulating a random walk could give us insights into a model and could confirm or deny a hypothesis.
If you would like a copy of the code for the three drunks, you can download it here, random_walk_biased.py.
That’s it folks. I hope you enjoyed this post. I really enjoyed coding it. It was fun.
This helps us to see the insight that plotting a class or set of classes can give to a programmer.
Happy pythoning.
No comments:
Post a Comment
Your comments here!