library(tidyverse)
library(ggforce)

1. Monte Carlo experiments

A broad class of computational algorithms that rely on repeated random sampling to obtain numerical results.
The underlying concept is to use randomness to solve problems that might be deterministic in principle.
They are often used in physical and mathematical problems and are most useful when it is difficult or impossible to use other approaches.
Monte Carlo methods are mainly used in the following three problem classes:

optimization
numerical integration
generating draws from a probability distribution

1.1 Generating random number: `sample()`

Two ways of generating random number

Generate random number by drawing values from a group of numbers
Generate random number by drawing values from probability distribution, such as normal distribution

(1) Generate random number by drawing values from a group of numbers

How to use sample() sample(x = a set vector,
　　　size = the number of trials,
　　　replace = replace the number you draw or not
　　　prob = probability of each number drawn)

For instance, suppose x is a set vector containing the value from 1 to 10

x <- c(1, 2, 3, 4, 5, 6)

Randomly draw 5 numbers from ｘ

sample(x,
       size = 5)

[1] 3 5 6 1 2

In this case, the number you draw, for example you draw number 1, number 1 is not drawn in the trial
But, if you add replace = TRUE, which means extraction with restoration, then it is likely that you select number 1 in the trial

sample(x,
       size = 5, 
       replace = TRUE)

[1] 6 2 2 6 2

prob means the probability that each number is drawn
Since there are 6 numbers, R automatically assigns this probability as one-sixth
Suppose you want to randomly draw the three numbers (4, 5, 6) and these three numbers are twice as likely as the other three numbers (1, 2, 3) to be drawn
If this is the case, then you should use the following R code:

sample(x,
       size = 5, 
       replace = TRUE,
       prob = c(1, 1, 1, 2, 2, 2))

[1] 1 6 4 5 4

You can also use prob = c(1/9, 1/9, 1/, 2/9, 2/9, 2/9) and get the same result
If you want to roll the dice three times, then you should add the number of trials: size = 3
If you roll the dice three times, then you may not have the same number or you may have it multiple times
→ You need to add replace = TRUE, which means extraction with restoration
The probability that you have one of the 6 numbers is one-sixth
You need not to assign this probability
R automatically assign it

sample(1:6,
       3,
       replace = TRUE)

[1] 6 5 5

Suppose you have a loaded dice that shows the odd numbers (1, 3, 5) twice as likely as the even number (2, 4, 6)
If this is the case, then you need to add: prob = c(2, 1, 2, 1, 2, 1)

sample(x,
       size = 5, 
       replace = TRUE,
       prob = c(2, 1, 2, 1, 2, 1))

[1] 1 5 3 5 1

Also suppose you roll this loaded dice 10,000 times and check the number you get
Then visualize the result using a bar chart
Using table(), you can show the result

dice <- sample(1:6, 
               10^4, 
               replace = TRUE,
               prob = c(2, 1, 2, 1, 2, 1))
table(dice)

dice
   1    2    3    4    5    6 
2187 1126 2264 1156 2209 1058

Using barplot(), let’s visualize this result

table(dice) |> 
  barplot()

- You see that the odd numbers (1, 3, 5) is twice as likely as the even numbers (2, 4, 6) to be drawn in this simulation

1.2 Generating random numbers from probability distributios

Function	Probability distribution	Parameters
`rbeta()`	beta distribution	n, shape1, shape2
`rbinom()`	binomial distribution	n, size, prob
`rcauchy()`	cauchy distribution	n, location, scale
`rchisq()`	Chi-square distribution	n, df
`rexp()`	exponential distribution	n, rate
`rf()`	F distribution	n,
`rgamma()`	gamma distribution	n, shape, scale
`rgeom()`	geometric distribution	n, prob
`rhyper()`	hypergeometric distribution	nn, m, n, k
`rlnorm()`	lognormal distribution	n, meanlog, sdlog
`rmultinom()`	multinomial distribution	n, size, prob
`rnbinom()`	negative binomial distribution	n, size, prob
`rnorm()`	normal distribution	n, mean, sd
`rpois()`	Poisson distribution	n, lambda
`rt()`	t distribution	n, df
`runif()`	uniform distribution	n, min, max
`rweibull()`	Weibull distribution	n, shape, scale
`mvtnorm::rmvnorm()`	multivariate normal distribution	n, mean, sigma

1.3 Set seed in R

When you generate “random numbers” in R, you are actually generating pseudorandom numbers
These numbers are generated with an algorithm that requires a seed to initialize
Being pseudorandom instead of pure random means that, if you know the seed and the generator, you can predict (and reproduce) the output
Setting a seed in R means to initialize a pseudorandom number generator
Most of the simulation methods in Statistics require the possibility to generate pseudorandom numbers that mimic the properties of independent generations of a uniform distribution in the interval (0, 1)
If you want to draw a random number in R, you receive different numbers every time you do
Suppose you wan to draw three numbers from a normal distribution with average = 0 and standard deviation = 1
Round the number with 2 decimal places three times

rnorm(3) |> 
        round(2)

[1] -1.52  0.39 -0.52

rnorm(3) |> 
        round(2)

[1] -1.32  0.27 -0.06

rnorm(3) |> 
        round(2)

[1] -0.12  0.02  0.19

You see that ever time you draw random numbers, you receive different numbers
If you a simulation onece, there is no problem
But, if you want to get the same simulation result for some reason, then you must be happy to have the same result by fixing random number to get
This is when you use set.seed()
If you use the same set.seed(), then you get the same random number
In the parenthesis, you set a numeric scalar (which means, any nubmers you like)
Let’s set a numeric scalar, 20220826 within set.seed()

set.seed(20220826)
rnorm(3) |> 
        round(2)

[1] -0.14 -0.87 -0.29

set.seed(20220826)
rnorm(3) |> 
        round(2)

[1] -0.14 -0.87 -0.29

Once you set seed, then you can fix the random number and get the same result in your simulation

2. Birthday Paradox

In probability theory, the birthday problem asks for the probability that, in a set of N randomly chosen people, at least two will share a birthday
Let’s think about the question: how many people in a group we need to have a shared birthday?

Answer The probability of a shared birthday exceeds 50% in a group of only 23 people.

The birthday paradox is counterintuitive, and a veridical paradox: it appears wrong, but is in fact true
While it may seem surprising that only 23 individuals are required to reach a 50% probability of a shared birthday, this result is made more intuitive by considering that the comparisons of birthdays will be made between every possible pair of individuals.

Let’s confirm `Birthday Paradox` using R

Set the number of student: n_student
Draw multiple students from number 1 to 365
Suppose we have 10 student: n_student = 10

n_student <- 10
Birth_vec <- sample(1:365,
                    n_student,
                    replace = TRUE)
Birth_vec

 [1] 140  55 363 170 292 335 345   6 179   6

If you see a pair of number in this group
→　There a shared birthday

`duplicated()`

We can check whether there is a shared birthday by using duplicated()
If there is a shared birthday, then we receive TRUE, FALSE otherwise

duplicated(Birth_vec)

 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE

We see 9 FALSE and 1 TRUE
TRUE means that there is a shared birthday
→ Actually, the 8th student and 10th student have the same birthday (which is the 6 in the 365th DOB)
→ We have one shared birthday in this group

`any()`

any() enables us to check whether or not there is at least one TRUE
→ TRUE means that there is at least one shared birthday
→ FALSE means that there is no TRUE

any(duplicated(Birth_vec))

[1] TRUE

You get TRUE because the 6th DOB is a shared birthday
Let’s repeat this 100 times

n_student <- 10   # the number of students
n_trials  <- 100  # the number of trials  


Result_vec <- rep(NA, n_trials) # Prepare for  the place where we put our simulation result: Result_vec

# iteration  
for (i in 1:n_trials) {
    Birth_vec     <- sample(1:365, n_student, replace = TRUE)
    Result_vec[i] <- any(duplicated(Birth_vec))
}

Result_vec

  [1] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
 [13] FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
 [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [49] FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE
 [61] FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
 [73] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [85] FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
 [97] FALSE FALSE FALSE FALSE

sum(Result_vec)

[1] 12

We get 12 TRUE
→　12 out of 100 (12%) share the same birthday
Let’s repeat this 10,000 times
→　We can approximate probability

n_student <- 10    # the number of students
n_trials  <- 1000 # the number of trials  

# Prepare for  the place where we put our simulation result: Result_vec
Result_vec <- rep(NA, n_trials)

# iteration
for (i in 1:n_trials) {
    Birth_vec     <- sample(1:365, n_student, replace = TRUE)
    Result_vec[i] <- any(duplicated(Birth_vec))
}

sum(Result_vec)

[1] 110

We get 1190 TRUE
→ 1,190 out of 10,000 (11.9%) share the same birthday
→ If there are 11 people, then the probability to have a shared birthday is 11.9%

2.1 Comparison between simulation result and theoretical value

2.1.1 Simulation of birthday paradox

Let’ fix the number of trials as 100
Changing the number of students from 2 to 100, we want to see the probability that we have more than one shared birthday

Prob_df <- tibble(Students = 2:100, # change the number of students from 2 to 100
                  Probs    = NA) 

for (i in 1:nrow(Prob_df)) {
    
    Result_vec <- rep(NA, 100) # Assign the number of trials = 100
    
    for (j in 1:100) {        # Assign the number of trials = 100
        Birth_vec     <- sample(1:365, Prob_df$Students[i], replace = TRUE)
        Result_vec[j] <- any(duplicated(Birth_vec))
    }
    
    Prob_df$Probs[i] <- mean(Result_vec)
}

Check the simulation result

DT::datatable(Prob_df)

When we have 2 students, the probability that we have the shared birthday is zero (this makes perfect sense!)
Let’s check the number of students (Students) when the probability that we have at least one shared birthday (Probs) is 0.5 (=50%)

Prob_df |> 
  filter(Probs >= 0.4 & Probs <= 0.6)

# A tibble: 7 × 2
  Students Probs
     <int> <dbl>
1       19  0.43
2       20  0.43
3       21  0.5 
4       22  0.51
5       23  0.59
6       24  0.58
7       25  0.59

We see that the number of students (Students) when the probability that we have at least one shared birthday (Probs) is 0.5 (=50%) is around 21 or 22
Visualize the result by setting x-axis as the number of students (Students) and y-axis as the probability that we have at least one shared birthday (Probs)

Prob_df %>%
    ggplot() +
    geom_line(aes(x = Students, 
                  y = Probs * 100), 
              size = 1) +
    geom_hline(yintercept = 50, # Draw a red dotted line where Probs = 50%
               color = "red", 
               linetype = 2) +
    labs(x = "The number of Students",
         y = "The prob. that we have at least one shared birthday (Probs)  \n(Trials = 100)")

Simulation result (Trial = 100) The probability of a shared birthday exceeds 50% in a group of only 23 people

2.1.2 Theoretical value of birthday paradox

The probability that you have more than one shared birthday \(P(n)\) when there are n people

\[P(n) = 1 - \frac{365!}{365^n(365-n)!}\]

\(3! = 3 × 2 × 1 = 6\)
→　\(365! = 365 × 364 ×....× 1\)
In R, you can calculate this value by using factorial()
For instance, if you have 50 people, then \(P(n)\) is calculated as follows:

1 - factorial(365) / (365^50 * factorial(365 - 50))

[1] NaN

NaN means Not calculable
Because the number is too big, R cannot calculate it

Inf ・Let’s calculate \(100!\) on R

factorial(100)

[1] 9.332622e+157

This is a gigantic number with 157 decimal points!!
Let’s calculate \(200!\) on R

factorial(200)

[1] Inf

Inf means that R fails to calculate because the number is goo large

Let’s change the mathematical formula to a calculable one

\[P(n) = 1 - \frac{n!×_{365}C_n}{365^n(365-n)!}\]

C means binomial coefficient

\[_nC_k = \frac{n!}{k!(n-k)!}\]

In R, you can calculate C by using choose(n, k)

1 - ((factorial(50) * choose(365, 50)) / 365^50)

[1] 0.9703736

We see when there are 50 students, then the probability that we have at least one shared birthday (Probs) is 0.97 (= 97%)
Let’s calculate the probability that we have at least one shared birthday when there are 22 students and 23 students respectively

Birth_Expect <- function (n) {
  1 -  ((factorial(n) * choose(365, n)) / 365^n)
}
Birth_Expect(22)

[1] 0.4756953

Birth_Expect <- function (n) {
  1 - ((factorial(n) * choose(365, n)) / 365^n)
}
Birth_Expect(23)

[1] 0.5072972

We see that the probability that we have at least one shared birthday is when there are 23 students
Let’s add this theoretical value to the data frame (Prob_df) we made in 2.1.1 Simulation of birthday paradox and name it Expect

Prob_df <- Prob_df |> 
  mutate(Expect = map_dbl(Students,
                          ~Birth_Expect(.x)))

DT::datatable(Prob_df)

Using pivot_longer(), let’s customize this data to a tidy one

Prob_tidy <- Prob_df |> 
  pivot_longer(cols = Probs:Expect,
               names_to = "Type",
               values_to = "Prob")

DT::datatable(Prob_tidy)

Let’s visualize the result gained from Monte Carlo simulation (Probs) and the result gained as theoretical value (Expect)

Prob_tidy %>%
    mutate(Type = ifelse(Type == "Probs", "simulation", "theory")) %>%
    ggplot() +
    geom_line(aes(x = Students, 
                  y = Prob * 100, 
                  color = Type), 
              size = 1) +
    labs(x     = "The number of students",
        y = "The prob. that we have at least one shared birthday (Probs)  \n(Trials = 100)", 
         color = "") +
    theme(legend.position = "bottom")

Let’s increase the number of trials to 1000 and visualize it

n_student <- 100    # the number of students
n_trials  <- 1000   # the number of trials  

# Prepare for the place where we put our simulation result: Result_vec
Result_vec <- rep(NA, n_trials)

# iteration
for (i in 1:n_trials) {
    Birth_vec     <- sample(1:365, 
                            n_student, 
                            replace = TRUE)
    Result_vec[i] <- any(duplicated(Birth_vec))
}

Prob_df <- tibble(Students = 2:100,
                  Probs    = NA)

for (i in 1:nrow(Prob_df)) {
    
    Result_vec <- rep(NA, n_trials)
    
    for (j in 1:n_trials) {
        Birth_vec     <- sample(1:365, Prob_df$Students[i], replace = TRUE)
        Result_vec[j] <- any(duplicated(Birth_vec))
    }
    
    Prob_df$Probs[i] <- mean(Result_vec)
}

Prob_df %>%
    mutate(Expect = map_dbl(Students, ~Birth_Expect(.x))) %>%
    pivot_longer(cols      = Probs:Expect,
                 names_to  = "Type",
                 values_to = "Prob") |> 
    mutate(Type = ifelse(Type == "Probs", "Simulation", "Theoretical value")) %>%
    ggplot() +
    geom_line(aes(x = Students, y = Prob * 100, color = Type), size = 1) +
    labs(x     = "The number of Students",
          y = "The prob. that we have at least one shared birthday (Probs)  \n(Trials = 1000)", 
         color = "") +
    theme_minimal(base_family = "HiraKakuProN-W3") +
    theme(legend.position = "bottom")

We see that the result gained from our Monte Carlo simulation and the theoretical value looks almost the same

3. Probability and Area

3.1 How to calculate Area of a Circle without using \(π\)

Let’s consider a circle of radius 1 centered at the origin, find its area without using \(π\)
Step 1: Draw 4 lines around the circle and make a square
Step 2: Randomly add points inside of the square
Step 3: Calculate the ratio of dots inside and outside of the square
→ We get the ratio of the area inside and outside of the square
→ We can calculate the are of the circle
→ We can estimate the value of \(\pi = 3.14\)

Source: https://www.youtube.com/watch?v=fGWH4Oh0my0

Using a Monte Carlo simulation, let’s calculate \(\pi\)

The are of the square = 2 × 2 = 4
Using R, randomly add 100 dots inside of the square

set.seed(2022)
pi_df <- tibble(x = runif(100, -1, 1),
                y = runif(100, -1, 1))
pi_df

# A tibble: 100 × 2
         x      y
     <dbl>  <dbl>
 1  0.632   0.910
 2  0.295  -0.594
 3 -0.759   0.681
 4  0.0876  0.622
 5 -0.631   0.148
 6  0.272  -0.585
 7 -0.851   0.784
 8 -0.916  -0.831
 9 -0.259   0.243
10  0.515   0.968
# … with 90 more rows

pi_df |> 
  ggplot() +
    geom_rect(aes(xmin = -1, 
                  ymin = -1, 
                  xmax = 1, 
                  ymax = 1),
              fill = "white", color = "black") +
    geom_point(aes(x = x, y = y)) +
    coord_fixed(ratio = 1) +
    theme_minimal()

Add a circle to the square
When drawing a circle, you can use geom_circle() which comes with {ggforce}
Assign x = 0, y = 0 and the radius of the circle: \(r = 1\)

pi_df %>%
    ggplot() +
    geom_rect(aes(xmin = -1, 
                  ymin = -1, 
                  xmax = 1, 
                  ymax = 1),
              fill = "white", color = "black") +
    geom_circle(aes(x0 = 0, 
                    y0 = 0, 
                    r = 1)) +
    geom_point(aes(x = x, y = y)) +
    coord_fixed(ratio = 1) +
    theme_minimal()

Differently color the dots inside and outside of the circle
In order to do so, we make a vector, in_circle
The following is the formula of the circle:

\[(x-x')^2 + (y - y')^2 < r^2\]

Since the circle of radius 1 centered at the origin, we can express the circle as follows:

\[x^2 + y^2 < 1^2\]

Using in_circle, you can decide whether or not the random dots satisfy this mathematical equation

pi_df <- pi_df %>%
    mutate(in_circle = if_else(x^2 + y^2 < 1^2, "Inside", "Outside"))

3.2 The probability of Dots’ being inside and outside equals to Area

Let’s display the dots being inside and outside of the circle

pi_df %>%
    ggplot() +
    geom_rect(aes(xmin = -1, 
                  ymin = -1, 
                  xmax = 1, 
                  ymax = 1),
              fill = "white", 
              color = "black") +
    geom_point(aes(x = x, 
                   y = y, 
                   color = in_circle), 
               size = 2) +
    geom_circle(aes(x0 = 0, 
                    y0 = 0, 
                    r = 1)) +
    labs(x = "X", y = "Y", color = "") +
    coord_fixed(ratio = 1) +
    theme_void(base_size = 12)

Using group_by(), you can count the number of dots inside and outside of the circle

pi_df |> 
  group_by(in_circle) |> 
  summarise(N = n())

# A tibble: 2 × 2
  in_circle     N
  <chr>     <int>
1 Inside       82
2 Outside      18

You see that 82 dots (82%) inside of the circle out of 100 dots.
The area of the square is 2 × 2 = 4
→ The are of the circle = \(4 × \frac{82}{100} = 3.28\)
Since the radius of the circle is 1, this is equivalent to \(π\)

4 * 0.82

[1] 3.28

If you increase the number of dots, they can approximate a more accurate value of the \(π\)
Let’s approximate the value of \(π\) with 2000 dots

set.seed(2022)
pi_df2 <- tibble(x = runif(2000, -1, 1),
                 y = runif(2000, -1, 1))

pi_df2 <- pi_df2 %>%
    mutate(in_circle = if_else(x^2 + y^2 < 1^2, "Inside", "Outside"))

pi_df2 %>%
    ggplot() +
    geom_rect(aes(xmin = -1, 
                  ymin = -1, 
                  xmax = 1, 
                  ymax = 1),
              fill = "white", color = "black") +
    geom_point(aes(x = x, 
                   y = y, 
                   color = in_circle,
                   alpha = 0.5), 
               size = 2) +
    geom_circle(aes(x0 = 0, 
                    y0 = 0, 
                    r = 1)) +
    labs(x = "X", y = "Y", color = "") +
    coord_fixed(ratio = 1) +
    theme_void(base_size = 12)

pi_df2 %>%
    group_by(in_circle) %>%
    summarise(N = n())

# A tibble: 2 × 2
  in_circle     N
  <chr>     <int>
1 Inside     1562
2 Outside     438

You see that 1562 dots inside of the circle and 438 dots outside of the circle out of 2000 dots
The are of the square is 4
→ The are of the area = \(4 × \frac{1562}{2000} = 3.124\)
Since the radius of the circle is 1, this is equivalent to \(π\)

4 * 1562 / 2000

[1] 3.124

It is getting more to 3.14

3.3 Monte Carlo Simulation on \(\pi\)

Changing the number of dots to 2000, let’s estimate the value of \(\pi\)

dots <- 2:2000   # Change the number of dots from 2 to 2000
n <- length(dots)
results <- rep(NA, n)

for (i in 1:n) {
  results[i] <- tibble(x = runif(dots[i], -1, 1),
                       y = runif(dots[i], -1, 1)) |> 
    mutate(in_circle = if_else(x^2 + y^2 < 1^2, "Inside", "Outside")) |> 
    group_by(in_circle) |> 
    summarise(N = n()) |> 
    filter(in_circle == "Inside") |>  # Select the dots insides of the circle
    summarise(mean(N/dots[i]*4))  # Estimate pi  
  } 

pi <- unlist(results)       # Transform the date into vector
df_pi <- tibble(dots, pi) # Make a data frame, df_pi  

df_pi %>%
  ggplot() +
  geom_line(aes(x = dots, 
                y = pi), 
            size = 0.2,
            color = "blue",
            alpha = 0.8) +
  geom_hline(yintercept = 3.14, # Draw the line = 3.14
             color = "red", 
             linetype = 2) +
  labs(x = "The number of dots",
       y = "Estimated pi") +
  ggtitle("Estimated pi gained from Monte Carlo Simulation")

As the number of dots increases, you get the value closer to 3.14

5. Monty Hall Problem

Let’s think about Monty Hall problem
The Monty Hall problem is a brain teaser, in the form of a probability puzzle, loosely based on the American television game show Let’s Make a Deal and named after its original host, Monty Hall.

【Rules】

Suppose you’re on a game show, and you’re given the choice of three doors, 1, 2, and 3.
Behind one door is a car; behind the others, goats.
You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat.
He then says to you, “Do you want to stay at door No. 1 or pick door No.2?”
Is it to your advantage to switch your choice?
Here you have the following two choice:

Stay:
You should stay, because there is no difference between picking door No.1 and door No.2 in terms of winning a car.

Move:
You should pick door No.2 because it enhances your chance of winning a car.

Watch this video and think which Choice sound right.

【Simulation on R】

Let’s suppose a car is behind the door No.1.
You, a player, randomly choose a door out of he three doors.
If there is no car behind the door you chose, Monty open one of the two doors which do not have a car.
You see a goat behind that door.
Now you have the following two options:

Stay at the door you initially picked
Move to the door Monty suggested

Player are supposed to play this game 200 times
Calculate the probability that you win a car in case you choose to stay and in case you choose to move.

Script for Monty Hall Problem on R

set.seed(1838-3-11) # Set a random number, Okuma's DOB

# Player are supposed to play this game 200 times 
trials <- 200　 
prize <- sample(1:3, trials, replace = TRUE) 
prize
stay <- prize == 1 # Define that a car is behind door No.1
head(stay)
move <- prize != 1 # Define that a car is not behind door No.2 and No.3  
head(move)
prob.stay <- prob.move <- rep(NA, trials) 
for (i in 1:trials) {
prob.stay[i] <- sum(stay[1:i]) / i # Prob. of winning a car if you stay
prob.move[i] <- sum(move[1:i]) / i # Prob . of winning a car if you move

# Plot your simulation results
plot(1:trials, prob.move, ylim = c(0, 1),  
     type = "l", col = "red", 
     xlab = "number of trials",
     ylab = "prob. of getting a car",
     main = "Simulation of Monty Hall Problem")
 lines(1:trials, prob.stay, type = "l", col = "blue")
 abline(h = c(0.33, 0.66), lty = "dotted") # Draw two lines with prob = 33 % and prob = 66%
 legend("topright", lty = 1, col = c("red", "blue"),
       legend = c("change", "stay"))}

・How to run this simulation:
1. Go to File on R (not RStudio) 2. Click New File
3. Paste the R code above (Script for Monty Hall Problem on R) 4. Hit the return key or enter key and you get the following results:

References

宋財泫 (Jaehyun Song)「モンテカルロ・シミュレーション入門」
宋財泫 (Jaehyun Song)・矢内勇生 (Yuki Yanai)「私たちのR: ベストプラクティスの探究」
土井翔平（北海道大学公共政策大学院）「Rで計量政治学入門」
矢内勇生（高知工科大学）授業一覧
浅野正彦, 矢内勇生.『Rによる計量政治学』オーム社、2018年
浅野正彦, 中村公亮.『初めてのRStudio』オーム社、2018年
Winston Chang, R Graphics Cookbook, O’Reilly Media, 2012.
Kieran Healy, DATA VISUALIZATION, Princeton, 2019
Kosuke Imai, Quantitative Social Science: An Introduction, Princeton University Press, 2017

42. Monte Carlo experiments

Masahiko Asano

2022-10-11

1. Monte Carlo experiments

1.1 Generating random number: `sample()`

(1) Generate random number by drawing values from a group of numbers

1.2 Generating random numbers from probability distributios

1.3 Set seed in R

2. Birthday Paradox

Let’s confirm `Birthday Paradox` using R

`duplicated()`

`any()`

2.1 Comparison between simulation result and theoretical value

2.1.1 Simulation of birthday paradox

2.1.2 Theoretical value of birthday paradox

3. Probability and Area

3.1 How to calculate Area of a Circle without using \(π\)

Using a Monte Carlo simulation, let’s calculate \(\pi\)

3.2 The probability of Dots’ being inside and outside equals to Area

3.3 Monte Carlo Simulation on \(\pi\)

5. Monty Hall Problem

42. Monte Carlo experiments

Masahiko Asano

2022-10-11

1. Monte Carlo experiments

1.1 Generating random number: sample()

(1) Generate random number by drawing values from a group of numbers

1.2 Generating random numbers from probability distributios

1.3 Set seed in R

2. Birthday Paradox

Let’s confirm Birthday Paradox using R

duplicated()

any()

2.1 Comparison between simulation result and theoretical value

2.1.1 Simulation of birthday paradox

2.1.2 Theoretical value of birthday paradox

3. Probability and Area

3.1 How to calculate Area of a Circle without using \(π\)

Using a Monte Carlo simulation, let’s calculate \(\pi\)

3.2 The probability of Dots’ being inside and outside equals to Area

3.3 Monte Carlo Simulation on \(\pi\)

5. Monty Hall Problem

1.1 Generating random number: `sample()`

Let’s confirm `Birthday Paradox` using R

`duplicated()`

`any()`