p-values

Understanding P-values – Definition and Examples

What exactly is a p-value?

In statistics, a p-value is a measure of the evidence against a null hypothesis. The null hypothesis is the default assumption that there is no difference or relationship between the two variables or groups being studied. The p-value is the probability of obtaining results at least as extreme as the observed results, assuming that the null hypothesis is true.

Making a Decision

A p-value is typically used in hypothesis testing to determine the significance of the results obtained from a statistical analysis. The p-value is compared to a predetermined level of significance (α), usually set at 0.05, to determine whether to reject or fail to reject the null hypothesis.

  • If the p-value is less than or equal to the level of significance, then the results are considered statistically significant, and the null hypothesis is rejected.
  • If the p-value is greater than the level of significance, then the results are not statistically significant, and the null hypothesis is not rejected.
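This decision rule can be sketched in a few lines of Python; the p-value here is a made-up example, not the output of a real test:

```python
# Decision rule for hypothesis testing (illustrative values only)
alpha = 0.05      # predetermined level of significance
p_value = 0.031   # hypothetical p-value from some statistical test

if p_value <= alpha:
    decision = "reject the null hypothesis (statistically significant)"
else:
    decision = "fail to reject the null hypothesis (not significant)"

print(decision)
```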

The p-value is represented by the symbol “p”. It is calculated from the observed test statistic, which is typically denoted “t” for t-tests, “F” for ANOVA and regression models, or “z” for z-tests.

The null hypothesis is denoted by “H0”, while the alternative hypothesis is denoted by “Ha”. The level of significance, which is often set at 0.05, is represented by the symbol “α”.

Thus, the p-value can be expressed as:

p = P(test statistic as extreme as, or more extreme than, the observed value | H0 is true)

where “P” represents the probability function. The p-value is then compared to the level of significance, α, to determine whether to reject or fail to reject the null hypothesis. If p ≤ α, then the null hypothesis is rejected in favor of the alternative hypothesis. If p > α, then the null hypothesis is not rejected.

For example, in a two-sample t-test comparing the means of two groups, the observed test statistic would be the t-value. The null hypothesis would be that the two groups have equal means, while the alternative hypothesis would be that the means are not equal. The p-value would be calculated as the probability of obtaining a t-value as extreme or more extreme than the observed value, assuming that the null hypothesis is true. If the p-value is less than or equal to the level of significance, α, then the null hypothesis is rejected in favor of the alternative hypothesis.
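As a concrete sketch of this two-sample t-test, SciPy’s `ttest_ind` computes both the t-value and the two-sided p-value. The data below are simulated with assumed group means of 50 and 53, so the exact numbers are purely illustrative:

```python
import numpy as np
from scipy import stats

# Simulated measurements for two groups (assumed means 50 and 53)
rng = np.random.default_rng(42)
group_a = rng.normal(loc=50.0, scale=5.0, size=30)
group_b = rng.normal(loc=53.0, scale=5.0, size=30)

# H0: the two groups have equal means; Ha: the means are not equal
t_stat, p_value = stats.ttest_ind(group_a, group_b)

alpha = 0.05
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value <= alpha:
    print("Reject H0: the group means differ significantly")
else:
    print("Fail to reject H0")
```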

How do you calculate the p value?

The calculation of a p-value depends on the statistical test being used and the null hypothesis being tested. In general, the p-value is calculated as the probability of obtaining a test statistic as extreme or more extreme than the observed value, assuming that the null hypothesis is true.

Here are the general steps for calculating a p-value:

1. Determine the null hypothesis (H0) and the alternative hypothesis (Ha) being tested.

2. Calculate the test statistic for the data being analyzed. The test statistic depends on the type of test being used (e.g., t-test, ANOVA, chi-square test) and is typically calculated by subtracting the null hypothesis value from the sample estimate and dividing by the standard error of the estimate.

3. Determine the probability distribution of the test statistic assuming the null hypothesis is true. The distribution depends on the type of test being used and the assumptions of the statistical model.

4. Calculate the p-value as the probability of obtaining a test statistic as extreme or more extreme than the observed value. This involves finding the area in the tail(s) of the probability distribution that corresponds to the observed test statistic.

5. Compare the p-value to the level of significance (alpha) chosen for the test. If the p-value is less than or equal to alpha, then the null hypothesis is rejected in favor of the alternative hypothesis. If the p-value is greater than alpha, then the null hypothesis is not rejected.

It’s worth noting that there are many statistical software programs available that can calculate p-values automatically based on the input data and statistical test being used. However, understanding how to calculate a p-value manually can help with interpreting and verifying the results obtained from automated analyses.

P-values and statistical significance

P-values are commonly used in statistical hypothesis testing to assess the evidence against a null hypothesis. The null hypothesis is a statement that there is no difference or relationship between two variables or groups being studied, and the alternative hypothesis is a statement that there is a difference or relationship.

The p-value is the probability of obtaining results as extreme or more extreme than the observed results, assuming that the null hypothesis is true. A smaller p-value indicates stronger evidence against the null hypothesis, and a larger p-value indicates weaker evidence against the null hypothesis.

Statistical significance refers to the level of evidence against the null hypothesis based on the p-value. A commonly used level of significance is 0.05, which means that if the p-value is less than or equal to 0.05, then the results are considered statistically significant, and the null hypothesis is rejected. If the p-value is greater than 0.05, then the results are not statistically significant, and the null hypothesis is not rejected.

It is important to note that statistical significance does not necessarily imply practical significance or importance. A statistically significant difference or relationship between variables may still have little or no practical significance, and a non-significant difference or relationship may still be practically significant. Therefore, it is important to interpret the results of statistical analyses in the context of the research question and the practical implications of the findings.

Caution when using p-values

While p-values are commonly used in statistical hypothesis testing, it is important to exercise caution when interpreting them, and to consider them in conjunction with other statistical measures and the overall context of the research question.

Here are some potential issues and limitations associated with p-values:

  1. P-values do not provide information about effect size: The p-value only indicates the strength of evidence against the null hypothesis, but does not provide information about the magnitude or practical significance of the observed effect. Therefore, it is important to also report effect size measures, such as Cohen’s d or Pearson’s r, which indicate the size and direction of the effect.
  2. P-values depend on the sample size: As the sample size increases, even small differences between groups or variables can become statistically significant. Therefore, it is important to consider the sample size when interpreting p-values and to focus on effect size and practical significance rather than just statistical significance.
  3. P-values depend on the statistical model and assumptions: The p-value is based on the assumptions of the statistical model being used, such as normality of the data, independence of observations, and homogeneity of variances. Violations of these assumptions can affect the validity of the p-value, and therefore it is important to also check the assumptions and validity of the statistical model.
  4. P-values do not provide support for the alternative hypothesis: Rejecting the null hypothesis based on a small p-value does not necessarily provide support for the alternative hypothesis. Other factors, such as the plausibility and consistency of the alternative hypothesis, should also be considered.
  5. P-values do not account for multiple comparisons: Conducting multiple statistical tests on the same data can increase the likelihood of obtaining at least one statistically significant result by chance alone, even if all null hypotheses are true. Therefore, it is important to adjust for multiple comparisons, such as by using Bonferroni correction or false discovery rate control.
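The Bonferroni correction mentioned in point 5 is simple enough to sketch directly. The p-values below are invented for illustration; each raw p-value is compared to α divided by the number of tests:

```python
# Bonferroni correction for multiple comparisons (hypothetical p-values)
p_values = [0.004, 0.034, 0.047, 0.21, 0.80]
alpha = 0.05
m = len(p_values)

# Each raw p-value is compared to alpha / m instead of alpha
threshold = alpha / m                    # 0.05 / 5 = 0.01
significant = [p <= threshold for p in p_values]

# Equivalent view: multiply each p-value by m, capping at 1.0
adjusted = [min(p * m, 1.0) for p in p_values]

print(significant)  # only the smallest p-value survives the correction
```

Note that three of the five raw p-values fall below 0.05, yet only one remains significant after correction, which is exactly the protection against chance findings that point 5 describes.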

In summary, p-values can be a useful tool for hypothesis testing, but their interpretation should be considered carefully in conjunction with other statistical measures and the overall context of the research question.

What is statistical significance?

Statistical significance is a term used in statistics to indicate whether the observed differences or relationships in a sample are likely to be a result of chance variation or whether they are likely to represent true differences or relationships in the population from which the sample was drawn.

  • In hypothesis testing, statistical significance is determined by calculating a p-value, which is the probability of obtaining a result as extreme or more extreme than the observed result, assuming that there is no true difference or relationship between the variables being studied (i.e., the null hypothesis is true).
  • If the p-value is smaller than the level of significance (alpha) set for the test (typically 0.05), the result is considered statistically significant, which means that there is strong evidence to reject the null hypothesis and conclude that there is a true difference or relationship in the population. If the p-value is larger than alpha, the result is considered not statistically significant, which means that there is not enough evidence to reject the null hypothesis and conclude that there is a true difference or relationship in the population.
  • It is important to note that statistical significance does not necessarily imply practical significance or importance. A statistically significant result may have little practical significance, and a non-significant result may still be practically significant. Therefore, it is important to interpret statistical significance in the context of the research question and the practical implications of the findings.

Does a p-value tell you whether your alternative hypothesis is true?

No, a p-value alone does not tell you whether your alternative hypothesis is true. A p-value is a statistical measure that indicates the strength of evidence against the null hypothesis, which is the hypothesis that there is no difference or relationship between the variables being studied. A small p-value (typically less than 0.05) suggests that the observed result is unlikely to be due to chance alone, and that there is evidence to reject the null hypothesis in favor of the alternative hypothesis.

However, rejecting the null hypothesis based on a small p-value does not necessarily provide conclusive evidence in favor of the alternative hypothesis. Other factors, such as the plausibility and consistency of the alternative hypothesis, should also be considered when interpreting the results of a hypothesis test.

Furthermore, statistical significance (as determined by the p-value) does not imply practical significance or importance. A statistically significant result may have little practical significance, and a non-significant result may still be practically significant. Therefore, it is important to interpret the results of statistical analyses in the context of the research question and the practical implications of the findings.

Randomization Procedures

Type I and Type II Errors