Chi-Square Goodness of Fit Test – Formula, Guide & Examples

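The Chi-Square Goodness of Fit test is a statistical test used to determine whether a set of observed data fits a particular theoretical distribution. When the hypothesized distribution assigns the same probability to every category, it is sometimes called the Chi-Square test for uniformity. It should not be confused with the Chi-Square test of independence, which is a different test that examines the association between two categorical variables.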

The test compares the observed data with the expected data and determines whether the differences between them are statistically significant. The expected data are calculated based on a theoretical distribution, such as the normal distribution or the Poisson distribution.

What is the chi-square goodness of fit test?

A chi-square goodness-of-fit test can be conducted when there is one categorical variable with more than two levels. If there are exactly two categories, then a one proportion z test may be conducted. The levels of that categorical variable must be mutually exclusive. In other words, each case must fit into one and only one category.

We can test that the proportions are all equal to one another or we can test any specific set of proportions.

If the expected counts, which we’ll learn how to compute shortly, are all at least five, then the chi-square distribution may be used to approximate the sampling distribution. If any expected count is less than five, then a randomization test should be conducted.
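As a minimal sketch of this check, the expected count for each category is the sample size multiplied by that category's hypothesized proportion, and the Chi-Square approximation is treated as usable only when every expected count is at least five. The helper names `expected_counts` and `chi_square_approximation_ok` and the sample figures are illustrative assumptions, not part of any library.

```python
def expected_counts(n, proportions):
    """Expected count per category: sample size times hypothesized proportion."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("hypothesized proportions must sum to 1")
    return [n * p for p in proportions]


def chi_square_approximation_ok(expected, minimum=5):
    """True when every expected count meets the 'at least five' rule of thumb."""
    return all(e >= minimum for e in expected)


# Hypothetical illustration: 40 cases and four equally likely categories.
equal = expected_counts(40, [0.25, 0.25, 0.25, 0.25])
print(equal, chi_square_approximation_ok(equal))

# Hypothetical illustration: a rare category pushes one expected count below five.
skewed = expected_counts(40, [0.5, 0.3, 0.15, 0.05])
print(skewed, chi_square_approximation_ok(skewed))
```

In the second case the rule fails, which is the situation where a randomization test would be preferred.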

The test is commonly used in fields such as biology, psychology, social science and data science to determine whether a sample of data is representative of a larger population. It can also be used to test the fairness of a game or to compare the performance of different models.

To perform a Chi-Square Goodness of Fit test, the following steps are typically followed:

1. State the null hypothesis (H0) and the alternative hypothesis (Ha).

2. Decide on the significance level (α).

3. Collect the observed data and calculate the expected data based on the theoretical distribution.

4. Calculate the Chi-Square test statistic.

5. Determine the degrees of freedom (df) for the test.

6. Determine the critical value from the Chi-Square distribution table using the df and the significance level.

7. Compare the calculated Chi-Square test statistic with the critical value.

8. Make a decision whether to reject or fail to reject the null hypothesis.

If the calculated Chi-Square test statistic is greater than the critical value, then the null hypothesis is rejected, and it is concluded that the observed data does not fit the theoretical distribution. On the other hand, if the calculated Chi-Square test statistic is less than or equal to the critical value, then the null hypothesis is not rejected, and it is concluded that there is not enough evidence to say the observed data departs from the theoretical distribution. Failing to reject the null hypothesis does not prove that the data follows that distribution.

It is important to note that the Chi-Square Goodness of Fit test assumes that the observed data is a random sample, that every expected frequency is at least 5, and that the observations are independent of one another. If these assumptions are not met, then the test results may be invalid.
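The steps above can be sketched in Python as follows. The function name `goodness_of_fit`, the die-roll counts, and the choice to pass in a critical value read from a Chi-Square table are illustrative assumptions rather than a reference implementation.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def goodness_of_fit(observed, proportions, critical_value):
    """Carry out steps 3 to 8 for observed counts and hypothesized proportions.

    `critical_value` is assumed to have been looked up by the caller in a
    Chi-Square table for df = len(observed) - 1 and the chosen alpha.
    Returns (statistic, df, reject_null).
    """
    if len(observed) != len(proportions):
        raise ValueError("observed and proportions must have the same length")
    n = sum(observed)
    expected = [n * p for p in proportions]
    statistic = chi_square_statistic(observed, expected)
    df = len(observed) - 1
    return statistic, df, statistic > critical_value


# Hypothetical data: 60 rolls of a die, testing H0 "all six faces equally likely".
rolls = [8, 12, 9, 11, 10, 10]
fair = [1 / 6] * 6
# 11.070 is the table value for df = 5 at alpha = 0.05.
statistic, df, reject = goodness_of_fit(rolls, fair, critical_value=11.070)
print(statistic, df, reject)
```

Passing the critical value in as an argument mirrors the table lookup in step 6; statistical software would usually report a p-value instead.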

Possible Research Questions

Here are a few examples of research questions that could be addressed using the Chi-Square Goodness of Fit Test:

1. A researcher wants to know if the number of COVID-19 cases reported in a particular region follows a Poisson distribution. The researcher collects data on the number of cases reported each day for a month and compares the observed frequency distribution to the expected Poisson distribution using the Chi-Square Goodness of Fit Test.

2. A psychologist wants to test whether IQ scores in a group of individuals are normally distributed. The psychologist collects IQ scores from a sample of 100 individuals, groups them into score intervals, and compares the observed frequency in each interval to the frequency expected under a normal distribution using the Chi-Square Goodness of Fit Test.

3. A researcher wants to test if the distribution of different blood types in a population follows the expected distribution of blood types. The researcher collects blood type data from a sample of 500 individuals and compares the observed frequency distribution to the expected distribution using the Chi-Square Goodness of Fit Test.

4. A quality control manager wants to know if the weights of cereal boxes produced in a factory follow the expected normal distribution. The manager collects the weights of a sample of 100 cereal boxes, groups them into weight intervals, and compares the observed frequency in each interval to the frequency expected under the normal distribution using the Chi-Square Goodness of Fit Test.

5. A researcher wants to test if the distribution of gender among job applicants in a certain industry follows the expected distribution. The researcher collects gender data from a sample of 500 job applicants and compares the observed frequency distribution to the expected distribution using the Chi-Square Goodness of Fit Test.

Chi-square goodness of fit test hypotheses

The Chi-Square Goodness of Fit Test is a hypothesis test that is used to determine whether a set of observed data fits a particular theoretical distribution. The null and alternative hypotheses for this test can be stated as follows:

Null Hypothesis (H0): The observed data follows the expected distribution.
Alternative Hypothesis (Ha): The observed data does not follow the expected distribution.

In other words, the null hypothesis states that any differences between the observed and expected frequencies are due to sampling variation alone, while the alternative hypothesis states that the true proportion of at least one category differs from its expected value.

To perform the Chi-Square Goodness of Fit Test, we calculate the test statistic (Chi-Square statistic) using the observed and expected frequencies and then compare it to the critical value from the Chi-Square distribution with degrees of freedom equal to the number of categories minus one. If the calculated Chi-Square statistic is greater than the critical value, we reject the null hypothesis and conclude that the observed data does not fit the expected distribution. If the calculated Chi-Square statistic is less than or equal to the critical value, we fail to reject the null hypothesis and conclude that there is no evidence to suggest that the observed data does not fit the expected distribution.

When to use the chi-square goodness of fit test

The test is appropriate in situations where the researcher has a theoretical distribution in mind and wants to determine whether the observed data fits that distribution. The test is commonly used in fields such as biology, psychology, social sciences, quality control, and finance to test the goodness of fit of observed data to an expected distribution.

Examples of situations where the Chi-Square Goodness of Fit Test may be appropriate include:

Testing whether the observed frequency distribution of exam scores in a particular class fits a normal distribution.
Testing whether the observed frequency distribution of the number of defects in a sample of manufactured products fits a Poisson distribution.
Testing whether the observed frequency distribution of blood types in a particular population fits the expected distribution based on the Hardy-Weinberg equilibrium.
Testing whether the observed frequency distribution of calls to a customer service center during a certain time period fits a certain distribution.
Testing whether the observed frequency distribution of stock prices fits a particular distribution.

In summary, the Chi-Square Goodness of Fit Test is appropriate when the researcher has a specific theoretical distribution in mind and wants to determine whether the observed data is consistent with it.

How to calculate the test statistic (formula)

The test statistic for the Chi-Square Goodness of Fit Test is calculated using the following formula:

χ^2 = Σ((Oi – Ei)^2 / Ei)

Where:

χ^2 is the Chi-Square statistic.
Oi is the observed frequency in each category.
Ei is the expected frequency in each category, calculated based on the theoretical distribution.
Σ is the summation symbol, indicating that we sum over all categories.

The test statistic measures the difference between the observed and expected frequencies in each category, squared and divided by the expected frequency, and then summed over all categories. If the observed data fits the expected distribution, the test statistic will be small, and if the observed data does not fit the expected distribution, the test statistic will be large.
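A minimal sketch of this formula in Python is shown here. It assumes a hypothetical helper `chi_square_terms` and made-up counts for three categories with 20 expected in each, and is only meant to show how each category contributes to the total.

```python
def chi_square_terms(observed, expected):
    """Per-category contributions (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


expected = [20, 20, 20]

# Hypothetical counts close to expectation give a small statistic.
close_fit = chi_square_terms([18, 22, 20], expected)
print(close_fit, sum(close_fit))

# Hypothetical counts far from expectation give a large statistic.
poor_fit = chi_square_terms([5, 35, 20], expected)
print(poor_fit, sum(poor_fit))
```

Keeping the individual terms, rather than only their sum, makes it easy to see which category is responsible for a large statistic.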

Once the test statistic is calculated, we compare it to the critical value from the Chi-Square distribution with degrees of freedom equal to the number of categories minus one. If the statistic is greater than the critical value, we reject the null hypothesis; otherwise, we fail to reject it.

Chi-Square Goodness of Fit Hypothesis Testing Procedure

An example of a Chi-Square Goodness of Fit Hypothesis Test would be testing whether the observed frequencies of eye color in a population of individuals follow the expected distribution based on the Hardy-Weinberg equilibrium. The Hardy-Weinberg equilibrium is a theoretical distribution that predicts the expected frequency of different genotypes in a population.

Suppose we have a sample of 500 individuals, and we record the number of individuals with blue, green, and brown eyes as 200, 150, and 150, respectively. Based on the Hardy-Weinberg equilibrium, we would expect the frequencies of blue, green, and brown eyes to be 25%, 50%, and 25%, respectively.

Step 1: State the null and alternative hypotheses.

Null Hypothesis (H0): The observed data fits the expected distribution.
Alternative Hypothesis (Ha): The observed data does not fit the expected distribution.

For this example, the null and alternative hypotheses are:

Null Hypothesis (H0): The observed frequencies of eye color in the population follow the expected distribution based on the Hardy-Weinberg equilibrium.

Alternative Hypothesis (Ha): The observed frequencies of eye color in the population do not follow the expected distribution based on the Hardy-Weinberg equilibrium.

Step 2: Choose the level of significance, α.

Together with the degrees of freedom, which equal the number of categories minus one, the significance level determines the critical value from the Chi-Square distribution. This example uses α = 0.05.

The degrees of freedom for this test would be 3 – 1 = 2.

Step 3: Collect and record the observed frequencies for each category of interest.

For example, the number of individuals with blue, green, and brown eyes in a population.

Suppose we have a sample of 500 individuals, and we record the number of individuals with blue, green, and brown eyes as 200, 150, and 150, respectively.

Step 4: Determine the expected frequencies for each category.

This is based on the theoretical distribution that is being tested against, such as the Hardy-Weinberg equilibrium.
The expected frequencies can be calculated by multiplying the total sample size by the expected proportions for each category.

Based on the Hardy-Weinberg equilibrium, we would expect the frequencies of blue, green, and brown eyes to be 25%, 50%, and 25%, respectively.

We can calculate the expected frequencies by multiplying the total sample size by the expected proportions:

Expected frequency of blue eyes: 500 * 0.25 = 125
Expected frequency of green eyes: 500 * 0.50 = 250
Expected frequency of brown eyes: 500 * 0.25 = 125

Step 5: Calculate the Chi-Square statistic.

Use the formula: χ^2 = Σ((Oi – Ei)^2 / Ei)
Where Oi is the observed frequency in each category, and Ei is the expected frequency in each category.

Eye Color	Observed Frequency (Oi)	Expected Frequency (Ei)	(Oi – Ei)^2 / Ei
Blue	200	125	45
Green	150	250	40
Brown	150	125	5
Total	500	500	90

Using the Chi-Square Goodness of Fit Test formula, we can calculate the test statistic:

χ^2 = ((200 – 125)^2 / 125) + ((150 – 250)^2 / 250) + ((150 – 125)^2 / 125) = 45 + 40 + 5 = 90

Step 6: Determine the degrees of freedom for the Chi-Square distribution.

This is equal to the number of categories minus one. The degrees of freedom for this test would be 3 – 1 = 2.
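The arithmetic in Steps 4 to 6 can be re-checked with a short script that starts from the observed counts and expected proportions given above. The variable names are illustrative choices.

```python
# Observed counts and hypothesized proportions from the eye-color example.
observed = {"Blue": 200, "Green": 150, "Brown": 150}
proportions = {"Blue": 0.25, "Green": 0.50, "Brown": 0.25}

n = sum(observed.values())  # total sample size
expected = {color: n * p for color, p in proportions.items()}

# Contribution of each category: (O - E)^2 / E
terms = {
    color: (observed[color] - expected[color]) ** 2 / expected[color]
    for color in observed
}
chi_square = sum(terms.values())
df = len(observed) - 1  # number of categories minus one

for color in observed:
    print(color, observed[color], expected[color], terms[color])
print("chi-square =", chi_square, "df =", df)
```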

Step 7: Find the critical value for the Chi-Square distribution with the degrees of freedom and level of significance.

This can be done using a Chi-Square distribution table or statistical software.

We can use a Chi-Square distribution table or statistical software to find the critical value of Chi-Square for a significance level of 0.05 and 2 degrees of freedom, which is 5.99.
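For the special case of 2 degrees of freedom, the table value can be reproduced by hand, because the upper-tail probability of the Chi-Square distribution then has the closed form P(X > x) = exp(–x / 2). The sketch below relies on that fact; the function names are hypothetical, and the shortcut does not apply to other degrees of freedom, where a table or statistical software is needed.

```python
import math


def chi2_critical_value_df2(alpha):
    """Critical value for df = 2, solving exp(-x / 2) = alpha for x."""
    return -2.0 * math.log(alpha)


def chi2_p_value_df2(statistic):
    """Upper-tail probability P(X > statistic) for df = 2."""
    return math.exp(-statistic / 2.0)


critical = chi2_critical_value_df2(0.05)
print(round(critical, 2))  # 5.99
```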

Step 8: Compare the calculated Chi-Square statistic to the critical value.

If the calculated Chi-Square statistic is greater than the critical value, reject the null hypothesis and conclude that the observed data does not fit the expected distribution.
If the calculated Chi-Square statistic is less than or equal to the critical value, fail to reject the null hypothesis and conclude that there is no evidence to suggest that the observed data does not fit the expected distribution.

In this example, the calculated Chi-Square statistic is well above the critical value of 5.99, so the null hypothesis is rejected.

Step 9: Interpret the results.

If the null hypothesis is rejected, explain the implications of the result and potential reasons why the observed data did not fit the expected distribution.
If the null hypothesis is not rejected, explain that the observed data is consistent with the expected distribution (which is not proof that it follows that distribution) and interpret the implications of this result.

Since our calculated Chi-Square statistic, (75^2 / 125) + (100^2 / 250) + (25^2 / 125) = 90, is greater than the critical value of 5.99, we reject the null hypothesis and conclude that the observed frequencies of eye color in the population do not follow the expected distribution based on the Hardy-Weinberg equilibrium.

Example 2 – Chi-Square Goodness of Fit Hypothesis

Suppose we have a company that produces four different colors of candy (Red, Green, Blue, and Yellow) and we want to test whether the proportion of each color produced is consistent with the expected proportions of 25% for each color.

Our null hypothesis (H0) is that the proportion of each color produced is consistent with the expected proportions of 25% for each color. Our alternative hypothesis (Ha) is that the proportion of at least one color is not consistent with the expected proportion of 25%.

The observed frequencies of candy produced for each color are:

Red: 140
Green: 120
Blue: 100
Yellow: 140

Step 1: Set up the hypothesis

The null hypothesis is: H0: The proportion of each color produced is consistent with the expected proportions of 25% for each color.

The alternative hypothesis is: Ha: The proportion of at least one color is not consistent with the expected proportion of 25%.

Step 2: Choose the significance level

Let’s choose a significance level of 0.05. This means we are willing to accept a 5% chance of making a Type I error (rejecting the null hypothesis when it is true).

Step 3: Calculate the expected frequencies

To calculate the expected frequencies, we first need to determine the total number of candies produced:

Total candies = 140 + 120 + 100 + 140 = 500

The expected frequency for each color is:

Expected frequency = Total candies * Expected proportion = 500 * 0.25 = 125

So, the expected frequency for each color is 125.
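The same calculation as a short Python sketch, using the observed counts listed above (the variable names are illustrative):

```python
# Observed candy counts from the example.
observed = {"Red": 140, "Green": 120, "Blue": 100, "Yellow": 140}

total = sum(observed.values())
# Under H0 every color has proportion 0.25.
expected = {color: total * 0.25 for color in observed}

print(total)
print(expected)
```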

Step 4: Calculate the test statistic

The test statistic for the Chi-Square Goodness of Fit test is calculated as follows:

χ^2 = Σ [(O – E)^2 / E]

where O is the observed frequency, E is the expected frequency, and Σ represents the sum over all categories.

Using the observed and expected frequencies, we can calculate the test statistic as follows:

χ^2 = [(140 – 125)^2 / 125] + [(120 – 125)^2 / 125] + [(100 – 125)^2 / 125] + [(140 – 125)^2 / 125] = 1.8 + 0.2 + 5.0 + 1.8 = 8.8

Step 5: Determine the degrees of freedom

The degrees of freedom for the Chi-Square Goodness of Fit test are calculated as: df = k – 1

where k is the number of categories. In this case, k = 4, so the degrees of freedom are 3.

Step 6: Determine the critical value

We need to determine the critical value of the Chi-Square distribution with 3 degrees of freedom and a significance level of 0.05. Using a Chi-Square distribution table, we find the critical value to be 7.815.

Step 7: Make a decision and interpret the results

The test statistic (χ^2 = 8.8) is greater than the critical value (7.815), so we reject the null hypothesis. We have sufficient evidence to conclude that the proportion of at least one color differs from the expected proportion of 25%.

In conclusion, based on the Chi-Square Goodness of Fit test, we reject the null hypothesis that the proportion of each color produced is consistent with the expected proportions of 25% for each color at a significance level of 0.05. Blue, with 100 candies observed against 125 expected, contributes the most to the statistic (5.0 of the 8.8).
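The arithmetic in Steps 4 to 7 can be re-checked with a short script that starts from the observed counts. It also computes a p-value using the closed-form upper-tail probability of the Chi-Square distribution with 3 degrees of freedom; the helper name `chi2_sf_df3` is an assumption made for this sketch, and the formula is specific to df = 3.

```python
import math

observed = [140, 120, 100, 140]  # Red, Green, Blue, Yellow
expected = [sum(observed) * 0.25] * len(observed)  # 25% of the total for each color

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
critical_value = 7.815  # table value for df = 3, alpha = 0.05


def chi2_sf_df3(x):
    """P(X > x) for a Chi-Square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


p_value = chi2_sf_df3(chi_square)
reject = chi_square > critical_value

print("chi-square =", round(chi_square, 3), "df =", df)
print("p-value =", round(p_value, 4), "reject H0:", reject)
```

If SciPy is available, `scipy.stats.chisquare(observed, f_exp=expected)` should report the same statistic and p-value without the hand-written tail formula.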

What is the difference between the Chi-Square Goodness of Fit Test and the Chi-Square Test of Independence?

The Chi-Square Goodness of Fit Test and the Chi-Square Test of Independence are two different statistical tests that use the Chi-Square distribution. A chi-square (χ^2) goodness of fit test is a type of Pearson’s chi-square test. You can use it to test whether the observed distribution of a categorical variable differs from your expectations.

The Chi-Square Goodness of Fit Test is used to determine whether a set of observed data fits a particular theoretical distribution. It compares the observed data with the expected data and determines whether the differences between them are statistically significant. The test is commonly used in fields such as biology, psychology, and social sciences to determine whether a sample of data is representative of a larger population.

On the other hand, the Chi-Square Test of Independence (often loosely called just the Chi-Square Test) is used to determine whether there is a significant association between two categorical variables. It compares the observed frequencies in each combination of the two variables' categories with the frequencies expected if there were no association between them. The test is commonly used in fields such as market research, social sciences, and public health to determine whether there is a relationship between two variables.

The main difference between the two tests is that the Chi-Square Goodness of Fit Test is used to test whether observed data fits a particular distribution, while the Chi-Square Test is used to test for independence between two categorical variables. Additionally, the Chi-Square Goodness of Fit Test involves comparing observed and expected frequencies for one variable, while the Chi-Square Test involves comparing observed and expected frequencies for two variables.
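The contrast shows up in how the expected frequencies are obtained. In the sketch below, the goodness of fit test takes its expected counts from hypothesized proportions for one variable, while the test of independence derives them from the row and column totals of a two-way table (row total times column total, divided by the grand total). The function names and all counts are hypothetical.

```python
def expected_goodness_of_fit(observed, proportions):
    """One variable: expected counts come from externally specified proportions."""
    n = sum(observed)
    return [n * p for p in proportions]


def expected_independence(table):
    """Two variables: expected counts come from the table's own margins."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


# Hypothetical counts for a single categorical variable with three levels.
one_variable = [30, 50, 20]
gof_expected = expected_goodness_of_fit(one_variable, [0.3, 0.4, 0.3])
print(gof_expected)

# Hypothetical 2x2 table: rows are one variable, columns are the other.
two_variables = [[20, 30], [30, 20]]
independence_expected = expected_independence(two_variables)
print(independence_expected)
```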
