Statistics with R
Bernoulli Distribution in R
The Bernoulli distribution is a probability distribution that describes the outcome of a single binary experiment, where the result can be one of two possible outcomes, typically labeled as success or failure. It is named after Swiss mathematician Jacob Bernoulli.
The Bernoulli distribution is defined by a single parameter, usually denoted by p, which represents the probability of success.
The probability of failure, q, is simply 1 – p.
The formula for the Bernoulli distribution is as follows:
P(X = 1) = p
P(X = 0) = 1 – p
where X is the random variable that takes on the value 1 for success and 0 for failure.
The expected value (mean) of the Bernoulli distribution is p, and the variance is p(1 - p).
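As a quick check of the mean and variance, the following sketch computes both directly from the two-point PMF. The value p = 0.3 and the variable names are illustrative assumptions, not taken from the text.

```r
# Illustrative check; p = 0.3 is an assumed value
p <- 0.3
x <- c(0, 1)            # possible outcomes
pmf <- c(1 - p, p)      # P(X = 0), P(X = 1)

mean_x <- sum(x * pmf)               # E[X]
var_x  <- sum((x - mean_x)^2 * pmf)  # Var(X) = E[(X - E[X])^2]

all.equal(mean_x, p)            # TRUE: the mean equals p
all.equal(var_x, p * (1 - p))   # TRUE: the variance equals p(1 - p)
```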
Typical examples are a single coin flip or the success or failure of one event. Base R has no dedicated Bernoulli functions. Because a Bernoulli trial is a binomial experiment with a single trial, the binomial functions are used with size = 1.
The probability mass function (PMF) of the Bernoulli distribution is defined as:
P(X = x) = p^x * (1-p)^(1-x) for x in {0,1}
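The formula can be written as a small R function and compared with dbinom() using size = 1, which is the binomial PMF with a single trial. The helper name bernoulli_pmf and the value p = 0.3 are hypothetical, chosen only for this sketch.

```r
# Hypothetical helper implementing P(X = x) = p^x * (1 - p)^(1 - x)
bernoulli_pmf <- function(x, p) {
  stopifnot(all(x %in% c(0, 1)), p >= 0, p <= 1)
  p^x * (1 - p)^(1 - x)
}

p <- 0.3                      # assumed probability of success
manual  <- bernoulli_pmf(0:1, p)
builtin <- dbinom(0:1, size = 1, prob = p)

all.equal(manual, builtin)    # TRUE: both give P(X = 0) = 0.7 and P(X = 1) = 0.3
```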
In R, you can use the following functions to work with the Bernoulli distribution:
1. dbinom(x, size, prob): computes the probability mass function (PMF) of the binomial distribution at x, where size is the number of trials and prob is the probability of success. With size = 1 it gives the Bernoulli PMF.
Example: Compute the probability of getting exactly 2 heads in 5 tosses of a fair coin (p = 0.5). Each toss is a Bernoulli trial, so the number of heads follows a binomial distribution with size = 5.
    x <- 2    # number of heads
    n <- 5    # number of tosses
    p <- 0.5  # probability of heads on each toss
    dbinom(x, n, p)
Output:
[1] 0.3125
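The same value can be obtained by hand from the binomial formula, which counts the arrangements of 2 heads among 5 tosses and multiplies by the probability of each arrangement. This sketch uses only base R's choose().

```r
x <- 2; n <- 5; p <- 0.5

# C(n, x) arrangements, each with probability p^x * (1 - p)^(n - x)
by_hand <- choose(n, x) * p^x * (1 - p)^(n - x)

by_hand                                 # 0.3125
all.equal(by_hand, dbinom(x, n, p))     # TRUE
```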
The following code plots the Bernoulli PMF for p = 0.2, p = 0.5, and p = 0.8:
    probs <- c(0.2, 0.5, 0.8)  # probabilities of success
    x <- 0:1                   # possible outcomes (0 = failure, 1 = success)

    # PMF for each probability of success (one column per value of p)
    pmf <- sapply(probs, function(p) dbinom(x, 1, p))

    # offset the bars slightly so the three PMFs do not hide one another
    offsets <- c(-0.05, 0, 0.05)
    cols <- c("black", "blue", "red")

    plot(
      x + offsets[1], pmf[, 1], type = "h", lwd = 2, col = cols[1],
      xlim = c(-0.5, 1.5), ylim = c(0, 1), xaxt = "n",
      xlab = "Number of successes", ylab = "Probability",
      main = "Bernoulli Distribution PMF"
    )
    axis(1, at = x)
    lines(x + offsets[2], pmf[, 2], type = "h", lwd = 2, col = cols[2])
    lines(x + offsets[3], pmf[, 3], type = "h", lwd = 2, col = cols[3])
    legend("top", legend = probs, col = cols, lwd = 2, title = "Probability of success")
2. pbinom(q, size, prob): computes the cumulative distribution function (CDF) of the binomial distribution at q, that is, P(X ≤ q). With size = 1 it gives the Bernoulli CDF.
Example: Compute the probability of getting 2 or fewer heads in 5 tosses of a fair coin (p = 0.5). As before, the number of heads is a binomial count of 5 Bernoulli trials.
    q <- 2    # upper bound on the number of heads
    n <- 5    # number of tosses
    p <- 0.5  # probability of heads on each toss
    pbinom(q, n, p)
Output:
[1] 0.5
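Because the CDF is the running total of the PMF, pbinom() can be cross-checked by summing dbinom() over all values up to q. A minimal sketch, reusing the coin-toss parameters from the example:

```r
q <- 2; n <- 5; p <- 0.5

# P(X <= q) as the sum of P(X = 0), ..., P(X = q)
cdf_by_sum <- sum(dbinom(0:q, n, p))

all.equal(cdf_by_sum, pbinom(q, n, p))  # TRUE
```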
Here is another example that uses pbinom() to plot the cumulative distribution function (CDF) of a Bernoulli distribution:
    p <- 0.3   # probability of success
    x <- 0:1   # possible number of successes

    # CDF at each possible value (pbinom is vectorized over x)
    cdf <- pbinom(x, 1, p)

    plot(
      x, cdf, type = "s", lwd = 2, ylim = c(0, 1),
      xlab = "Number of successes", ylab = "Cumulative Probability",
      main = "Bernoulli Distribution CDF"
    )
    points(x, cdf, pch = 19)          # mark the CDF value at each outcome
    segments(x, 0, x, cdf, lty = 2)   # dashed drop lines to the x-axis
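3. qbinom(p, size, prob): computes the quantile function of the binomial distribution at probability p, that is, the smallest value k such that P(X ≤ k) ≥ p. With size = 1 it gives the Bernoulli quantile function.
Example: Find the 0.7 quantile of the number of heads in 100 tosses of a fair coin (p = 0.5), that is, the smallest k such that the probability of at most k heads is at least 0.7.
    p <- 0.7
    k <- qbinom(p, 100, 0.5)  # smallest k with P(X <= k) >= 0.7
    k
Output:
[1] 53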
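qbinom() returns the smallest value k for which the CDF reaches the requested probability, so it can be reproduced by searching the output of pbinom(). The parameters below (5 tosses of a fair coin, target probability 0.7) are assumptions chosen to keep the numbers small.

```r
target <- 0.7; size <- 5; prob <- 0.5   # assumed illustration values

k   <- 0:size
cdf <- pbinom(k, size, prob)            # P(X <= k) for k = 0, ..., 5

# smallest k whose cumulative probability is at least the target
manual_quantile <- k[which(cdf >= target)[1]]

manual_quantile                         # 3, since P(X <= 2) = 0.5 and P(X <= 3) = 0.8125
manual_quantile == qbinom(target, size, prob)   # TRUE
```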
The following example computes three quantiles of a binomial distribution with qbinom and marks them on its CDF:
    p <- 0.3   # probability of success
    n <- 10    # number of trials

    # 0.05, 0.5, and 0.95 quantiles of the binomial distribution
    q <- qbinom(c(0.05, 0.5, 0.95), size = n, prob = p)

    # CDF of the binomial distribution
    x <- 0:n
    y <- pbinom(x, size = n, prob = p)
    plot(x, y, type = "s", xlab = "Number of successes", ylab = "Cumulative probability")

    # vertical lines at the quantiles
    abline(v = q, col = c("red", "blue", "green"), lty = c(1, 2, 3))
4. rbinom(n, size, prob): generates n random values from a binomial distribution with size trials and success probability prob. With size = 1 each value is a single Bernoulli draw (0 or 1).
Example: Generate 100 random values from a Bernoulli distribution with probability of success p = 0.3.
    set.seed(123)          # set the seed for reproducibility
    n <- 100               # sample size
    p <- 0.3               # probability of success
    x <- rbinom(n, 1, p)   # 100 Bernoulli draws (size = 1)
    x
Output: a vector of 100 values, each either 0 (failure) or 1 (success). With p = 0.3, about 30 of them are expected to be 1. The exact sequence depends on the seed and on the random number generator of the R version in use.
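To see that Bernoulli draws behave as expected, the relative frequencies of 0 and 1 in a larger sample can be compared with 1 - p and p. The sample size of 10,000 and the seed are arbitrary assumptions, and the frequencies match only approximately.

```r
set.seed(42)                  # arbitrary seed, for reproducibility only
p <- 0.3
draws <- rbinom(10000, size = 1, prob = p)   # 10,000 Bernoulli draws

# relative frequency of each outcome
freq <- table(factor(draws, levels = 0:1)) / length(draws)
freq                          # close to 0.7 and 0.3, up to sampling noise

mean(draws)                   # sample proportion of successes, near p
var(draws)                    # sample variance, near p * (1 - p) = 0.21
```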
Here is another example that uses rbinom to draw binomial samples and plots the results:
    p <- 0.3   # probability of success
    n <- 10    # number of trials

    # 1000 random draws from the binomial distribution
    samples <- rbinom(1000, size = n, prob = p)

    # histogram with one bin per integer count, scaled to relative frequency
    hist(
      samples, breaks = seq(-0.5, n + 0.5, by = 1), freq = FALSE,
      main = "Histogram of Binomial Samples", xlab = "Number of Successes"
    )

    # overlay the theoretical probability mass function (PMF)
    x <- 0:n
    pmf <- dbinom(x, size = n, prob = p)
    lines(x, pmf, type = "h", lwd = 2, col = "red")