Monte Carlo Methods in Python
In this notebook, I use a Monte Carlo method to estimate Pi, and then to create a simple model of disease spread based on random simulation.
Estimating Pi
How can we estimate Pi with random numbers?
We know that the area of a unit circle/area of a unit square = π/4
So in the below diagram, the probability of a randomly selected point (or 'dart') within the blue square being also inside the red circle is: P(hit) = π/4
and as P(hit) = n(hit)/n(darts),
π = 4*n(hit)/n(darts)
Lets calculate this via an easy to follow brute force method:
Now more efficiently with numpy arrays:
Let's make a function to initialise a dataframe for n darts:
And plot the darts for a visual representation:
Finally, lets show how the accuracy improves with the number of guesses:
SIR model of Disease Spread
The Susceptible-Infected-Recovered (SIR) model for spread of disease simulates a population by placing all individuals into three classes at any one time: susceptible, infected, or recovered.
The model parameters are the infection rate and the recovery rate of the disease (sometimes called beta and gamma respectively). beta/gamma gives the basic reproductive number, R0, which is the average number of secondary infections caused by an infected host.
In the simple approach below, we assume that reinfection is not possible, and recovered individuals are classified as immune. We start with 1% of the population infected.
Let's briefly compare the various inputs.
As one might expect, increasing R0 (by increasing infection rate or decreasing recovery rate) makes the infection peak faster and higher:
Increasing the size of the initially immune population slows the spread of disease, as it increases the amount of unsuccessful transmissions/collisions (decreasing the effective R rate):
Finally, as is typical with Monte Carlo approaches, the behaviour tends towards the 'expected' values as the number of repeats increases:
Limitations
This is a simple model with a number of limitations. For example, it doesn't take into account proximity or movement of population. It also doesn't consider that some individuals may transmit the disease to more people than others, instead just using a single 'average' value. Where real diseases have windows in which they can be passed on, in this model anyone infected can pass it to anyone susceptible, at any time. We have also assumed that reinfection is not possible, which is generally not the case. If we allow for reinfection through a subtle change in the model, the variables will tend towards equilibrium values (endemic).