Statistics with R
Normal Probability Plot in R using ggplot2
A normal probability plot, also known as a normal quantile-quantile (Q-Q) plot, is a graphical method for comparing a set of data to a normal distribution. It plots the ordered sample values against the quantiles expected from a normal distribution. If the data follow a normal distribution, the points in the plot fall approximately along a straight line.
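To see what a departure from the straight line can look like, the sketch below compares a normal sample with a right-skewed (exponential) one. The seed, the sample sizes, and the names `samples` and `p_compare` are illustrative assumptions, not part of the original tutorial.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed so the sketch is reproducible

# Two illustrative samples: one normal, one right-skewed
samples <- data.frame(
  value = c(rnorm(200), rexp(200)),
  group = rep(c("Normal sample", "Exponential sample"), each = 200)
)

# One Q-Q panel per sample, each with its own reference line
p_compare <- ggplot(samples, aes(sample = value)) +
  stat_qq() +
  stat_qq_line() +
  facet_wrap(~ group, scales = "free_y") +
  labs(x = "Theoretical Quantiles", y = "Sample Quantiles")

print(p_compare)
```

The normal panel should stay close to its line, while the exponential panel should bend away from it, most visibly in the upper tail.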
To create a normal probability plot in R with ggplot2, initialize the plot with ggplot(), map your variable to the sample aesthetic, and add the points with stat_qq().
Here’s a simple example:
library(ggplot2)
# Create some sample data
x <- rnorm(100)
# Create the normal probability plot
ggplot(data.frame(x), aes(sample = x)) +
  stat_qq() +
  stat_qq_line() +
  labs(title = "Normal Probability Plot")
In this example, we first generate a random sample of 100 observations from a normal distribution using the rnorm() function. We then create the plot using ggplot() and map the data to the sample aesthetic using aes(). We add the points of the normal probability plot using stat_qq() and the reference line using stat_qq_line(). Finally, we add a title using labs(). You can customize the plot further by adding axis labels and adjusting its appearance with other ggplot2 functions such as labs() and theme().
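As one possible customization, the sketch below adds axis labels, changes the colours of the points and the line, and applies a built-in theme. The specific colours, the seed, and the object name `p_custom` are arbitrary choices made for illustration.

```r
library(ggplot2)

set.seed(123)  # arbitrary seed so the sketch is reproducible
x <- rnorm(100)

# Same plot as above, with styled points, a dashed reference line,
# explicit axis labels, and a lighter theme
p_custom <- ggplot(data.frame(x), aes(sample = x)) +
  stat_qq(colour = "steelblue", size = 2) +
  stat_qq_line(colour = "firebrick", linetype = "dashed") +
  labs(
    title = "Normal Probability Plot",
    x = "Theoretical Quantiles",
    y = "Sample Quantiles"
  ) +
  theme_minimal()

print(p_custom)
```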

Example 2:
Here is another example of a normal probability plot in R using ggplot2, again built with the ggplot() and stat_qq() functions. This time we use the built-in mtcars dataset.
library(ggplot2)
# Create a normal probability plot of the
# mpg variable in the mtcars dataset
ggplot(mtcars, aes(sample = mpg)) +
  stat_qq() +
  ggtitle("Normal Probability Plot of MPG in the mtcars Dataset")
This code will create a normal probability plot of the mpg variable in the mtcars dataset and add a title to the plot. You can customize the plot by adding additional layers, changing the title and axis labels, and adjusting other plot aesthetics.
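If you want to add a reference line that shows where the points would fall if the data were normally distributed, you can use the stat_qq_line() function. By default it draws a line through the first and third quartiles of the sample and of the theoretical distribution: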
ggplot(mtcars, aes(sample = mpg)) +
  stat_qq() +
  stat_qq_line() +
  ggtitle("Normal Probability Plot of MPG in the mtcars Dataset")
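To make the reference line less of a black box, the following sketch reproduces by hand the line that stat_qq_line() is understood to draw under its default settings (`line.p = c(0.25, 0.75)` and a standard normal reference distribution): a straight line through the first and third quartile points. The names `probs`, `line_slope`, and `line_intercept` are illustrative.

```r
# Sample to inspect (same variable as in the example above)
mpg <- mtcars$mpg

# Probabilities that define the line: first and third quartiles
probs <- c(0.25, 0.75)

sample_q <- quantile(mpg, probs)  # observed quartiles of the data
theory_q <- qnorm(probs)          # matching standard normal quartiles

# Slope and intercept of the straight line through the two quartile points
line_slope <- unname(diff(sample_q) / diff(theory_q))
line_intercept <- unname(sample_q[1] - line_slope * theory_q[1])

cat("slope:", line_slope, "intercept:", line_intercept, "\n")
```

Passing these two values to geom_abline() should give a line that overlays the one from stat_qq_line(), which can be a useful check when the line needs to be drawn manually.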
Example 3:
As described above, a normal probability plot, also called a Q-Q plot (quantile-quantile plot), is used to assess whether a dataset follows a normal distribution. To build one manually in R with ggplot2, without relying on stat_qq(), follow these steps:
- Install and load required libraries.
- Create a dataset or load existing data.
- Calculate the theoretical quantiles and sort the data.
- Create a ggplot with the sorted data and theoretical quantiles.
- Add the reference line (45-degree line) to the plot.
Here’s another complete example:
# Step 1: Install and load required libraries
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}
if (!requireNamespace("dplyr", quietly = TRUE)) {
  install.packages("dplyr")
}
library(ggplot2)
library(dplyr)
# Step 2: Create a dataset or load existing data
# Here, we generate a random dataset following a normal distribution
set.seed(42)
data <- rnorm(100, mean = 0, sd = 1)
# Step 3: Calculate the theoretical quantiles and sort the data
data <- data.frame(sample = data) %>%
  mutate(rank = rank(sample)) %>%
  arrange(rank) %>%
  mutate(qq = qnorm((rank - 0.5) / n()))
# Step 4: Create a ggplot with the sorted data and theoretical quantiles
normal_probability_plot <- ggplot(data, aes(x = qq, y = sample)) +
  geom_point() +
  xlab("Theoretical Quantiles") +
  ylab("Sample Quantiles") +
  ggtitle("Normal Probability Plot")
# Step 5: Add the reference line (45-degree line) to the plot
normal_probability_plot <- normal_probability_plot +
  geom_abline(
    intercept = 0,
    slope = 1,
    color = "red",
    linetype = "dashed"
  ) +
  theme_bw()
# Display the plot
print(normal_probability_plot)
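This code creates a normal probability plot for a sample drawn from a standard normal distribution. Because the sample quantiles are compared with standard normal quantiles (mean 0, standard deviation 1), points that lie close to the red 45-degree reference line suggest that the data are approximately standard normal. For normally distributed data with a different mean or standard deviation, the points still fall along a straight line, but not along the 45-degree one.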
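For data that are not on the standard normal scale, one option is to keep the manual approach but draw the reference line with the sample mean as intercept and the sample standard deviation as slope. The sketch below applies this to the mpg variable of mtcars. The names `qq_data` and `p_mpg` are illustrative, and the choice of mean and standard deviation for the line is an assumption of this sketch rather than the only reasonable option. It uses base R's ppoints() for the plotting positions, which for more than 10 observations equals the (rank - 0.5) / n formula used above.

```r
library(ggplot2)

mpg <- mtcars$mpg
n <- length(mpg)

# Sorted sample values paired with standard normal quantiles
qq_data <- data.frame(
  theoretical = qnorm(ppoints(n)),  # plotting positions -> normal quantiles
  sample = sort(mpg)
)

# Reference line: a normal sample with this mean and sd would follow
# y = mean + sd * x when plotted against standard normal quantiles
p_mpg <- ggplot(qq_data, aes(x = theoretical, y = sample)) +
  geom_point() +
  geom_abline(
    intercept = mean(mpg),
    slope = sd(mpg),
    colour = "red",
    linetype = "dashed"
  ) +
  labs(
    title = "Normal Probability Plot of MPG",
    x = "Theoretical Quantiles",
    y = "Sample Quantiles"
  ) +
  theme_bw()

print(p_mpg)
```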

