Something natural, irrational and very important

Can you guess what’s natural, irrational and very important?

One of many guesses could be life or love, and I’m not saying they aren’t. However, there is something else that fits that description as well. A hint: it’s one of the fundamental constants in math, \(\pi\) (pi)! Do you remember its value?

It is said that some of the earliest written approximations of \(\pi\) were found in Egypt and Babylon, both within one percent of the true value (we’re talking about 1900-1600 BC, crazy!). However, the first recorded algorithm for calculating the value of \(\pi\) in a thorough and careful way was a geometrical approach using polygons, developed around 250 BC by the Greek mathematician Archimedes. (A really amazing guy by the way, read more about him here)

There have been many suggestions throughout history for how to estimate the value of \(\pi\), but how could we estimate it with today’s tools? In this post, I’ll show you how you can estimate the value of \(\pi\) without drawing a single circle. How? The answer is Monte Carlo methods!

Let us start with a formula that involves \(\pi\): the area of a circle. We know that the area of a circle equals:

\[ A = \pi r^2 \]

and with some algebraic rearrangement, we can express \(\pi\) as

\[ \pi = \dfrac{A}{r^2} \]

but how can we solve this equation without knowing the area of the circle?

This is where Monte Carlo methods come into play. The main idea behind these methods is to obtain numerical results using random sampling. So, if we assume that we have a unit circle (radius equal to 1) within a square with sides equal to 2, we know that the probability of a random point (where both x and y are between -1 and 1) lying inside the unit circle is given by the ratio between the area of the unit circle and the area of the square, which can be expressed as

\[ P(x^2 + y^2 \le 1^2) = \dfrac{A_{\text{circle}}}{A_{\text{square}}} = \dfrac{\pi}{4} \]

In other words, if we draw a point B times and M of those times the point lies inside the unit circle, the probability that a random point lies inside the unit circle can be estimated as:

\[ P(x^2 + y^2 \le 1^2) \approx \dfrac{M}{B} \]

Given this, we know that the area of the circle will on average equal:

\[ A_{\text{circle}} = A_{\text{square}} \cdot \dfrac{M}{B} \]

or

\[ A_{\text{circle}} = 4 \cdot \dfrac{M}{B} \]

…if we consider the case of the unit circle within the square with sides equal to 2. Putting all of this together, the value of \(\pi\) will on average be:

\[ \pi = \dfrac{A_{\text{circle}}}{r^2} = \dfrac{4 \cdot \dfrac{M}{B}}{1^2} = 4 \cdot \dfrac{M}{B} \]
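As a quick sanity check, this estimator can be sketched in a few vectorized lines of R (a one-off sketch, separate from the full code at the end of the post):

```r
# Draw B random points in the square [-1, 1]^2 and count how many (M)
# fall inside the unit circle; pi is then estimated as 4 * M / B.
set.seed(42)                      # for reproducibility
B <- 100000                       # number of random points
x <- runif(B, min = -1, max = 1)
y <- runif(B, min = -1, max = 1)
M <- sum(x^2 + y^2 <= 1)          # points inside the unit circle
4 * M / B                         # estimate of pi, should land close to 3.14
```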

Let us give this a try! The figure below shows all the points within the circle (blue ones) and all the points outside the circle (gray ones). As you can see, we’ve created a beautiful circle by just drawing random samples.

Similarly, this can be applied to a three-dimensional space as well. (Go ahead and play around with the 3D figure; it’s interactive.)
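In three dimensions the same ratio argument holds: the volume of the unit sphere is \(\frac{4}{3}\pi\) and the volume of the enclosing cube (sides equal to 2) is 8, so the acceptance probability is \(\pi/6\) and \(\pi\) can be estimated as \(6 \cdot \frac{M}{B}\). A minimal sketch of the 3D estimator in R:

```r
# Volume of the unit sphere over volume of the cube: (4/3 * pi) / 8 = pi / 6,
# so pi can be estimated as 6 * M / B.
set.seed(42)
B <- 100000
x <- runif(B, min = -1, max = 1)
y <- runif(B, min = -1, max = 1)
z <- runif(B, min = -1, max = 1)
M <- sum(x^2 + y^2 + z^2 <= 1)    # points inside the unit sphere
6 * M / B                         # estimate of pi
```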

Now, let us do this a thousand times, each time drawing 200 random points. From this procedure we obtain the distribution below, and according to it \(\pi\) is roughly 3.14…
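As a side note (not part of the original derivation), since \(M\) follows a Binomial\((B, \pi/4)\) distribution, the spread of these estimates can be quantified. The standard error of the estimator \(4 \cdot \frac{M}{B}\) is

\[ \operatorname{sd}\!\left(4 \cdot \dfrac{M}{B}\right) = 4\sqrt{\dfrac{p(1-p)}{B}} = \sqrt{\dfrac{\pi(4-\pi)}{B}} \approx \dfrac{1.64}{\sqrt{B}} \]

where \(p = \pi/4\). With \(B = 200\) this gives roughly 0.12, which matches the width of the distribution above.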

So there you have it, a way of estimating the value of pi with today’s tools. The R code used for this post is provided below; feel free to use it.

# The functions 

# Draw a single point uniformly at random from [-1, 1]^dim
getSample <- function(dim){
  runif(n = dim, min = -1, max = 1)
}

# Check whether a point lies inside the unit circle/sphere
isItAccepted <- function(input){
  sqrt(sum(input^2)) <= 1
}



# Draw n random points and flag whether each falls inside the
# unit circle (dim = 2) or unit sphere (dim = 3)
calcPi <- function(n, dim = 2){
  
  # Preallocate the output data frame
  if(dim == 2){
    out <- data.frame(x = vector("numeric", length = n),
                      y = vector("numeric", length = n),
                      accept = vector("logical", length = n))
  } else {
    out <- data.frame(x = vector("numeric", length = n),
                      y = vector("numeric", length = n),
                      z = vector("numeric", length = n),
                      accept = vector("logical", length = n))
  }
  
  for(i in 1:n){
    point_temp <- getSample(dim = dim)
    out[i, 1:dim] <- point_temp
    out[i, dim + 1] <- isItAccepted(point_temp)
  }
  
  return(out)
}



# -------------



# 2 Dimensions
set.seed(12345)
data_MC1 <- calcPi(n = 5000, dim = 2)

library(ggplot2)
ggplot(data = data_MC1) +
  geom_point(aes(x = x, y = y, col = accept), show.legend = FALSE) +
  xlab("") + ylab("") +
  geom_vline(xintercept = 0) +
  geom_hline(yintercept = 0) +
  scale_color_manual(values = c("gray70", "blue")) +
  theme_bw()
  



# 3 Dimensions
data_MC2 <- calcPi(n = 10000, dim = 3)
library(plotly)

plot_ly(data_MC2,
        type = "scatter3d",
        x = ~x,
        y = ~y,
        z = ~z,
        mode = "markers",
        marker = list(size = 2),
        color = ~accept,
        colors = c("gray87","blue"),
        showlegend = FALSE) 



# Drawing many samples
piEstimates <- vapply(seq_len(1000),
                      function(i){
                        4 * mean(calcPi(200)$accept)
                      }, FUN.VALUE = numeric(1))





# Plotting the distribution
qplot(piEstimates,
      y = ..density..,
      geom = "histogram", 
      bins = 40, 
      fill = I("white"),
      col = I("black")) + 
  geom_density(col = "red") +
  geom_vline(xintercept = mean(piEstimates), lty = 2) +
  xlab(expression(pi)) + ylab("") + 
  scale_x_continuous(breaks = seq(2.5,4, 0.1)) +
  scale_y_continuous(breaks = NULL) +
  theme_bw() 
Arian Barakat
