
Statistical Mistakes: A Deeper Dive

This week's reading contained several examples of commonly made statistical mistakes. While many of these mistakes happen by accident, it is useful to think about who stands to benefit from them and why they are made. Here are some questions to consider when you encounter each one, along with brief Python sketches illustrating some of the checks.

1. Adequate Control Condition/Group

  • If there is no control group, why? Not possible? Not necessary?
  • Is the comparison adequately powered to detect a meaningful difference between the control and treatment groups? (See the power sketch after this list.)
  • Are there any biases introduced as a result of the assignment to control vs. treatment group?
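As a rough illustration of the power question above, here is a minimal sketch using Python with statsmodels; the effect size, alpha, and sample sizes are assumed values, not taken from the reading.

```python
# Sketch: checking whether a two-group comparison is adequately powered.
# The effect size (Cohen's d = 0.5) and alpha = 0.05 are illustrative assumptions.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power achieved with 30 participants per arm for a medium effect.
power = analysis.solve_power(effect_size=0.5, nobs1=30, alpha=0.05, ratio=1.0)
print(f"Power with n=30 per group: {power:.2f}")

# Sample size per arm needed to reach 80% power for the same effect.
n_needed = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05, ratio=1.0)
print(f"n per group for 80% power: {n_needed:.0f}")
```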

2. Direct Comparisons between Two Effects

  • Was a statistical test used to directly compare results between two groups?
  • Which statistical tests are used for the main conclusions of the study?
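The classic pitfall here is concluding that two effects differ because one is significant and the other is not. Below is a minimal sketch of testing the difference directly with a statsmodels interaction term; the data, group labels, slopes, and sample sizes are simulated assumptions for illustration.

```python
# Sketch: "significant in group A but not in group B" is not evidence that the
# two effects differ. Test the difference directly with an interaction term.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "group": np.repeat(["A", "B"], n),
    "x": rng.normal(size=2 * n),
})
# True slopes: 0.30 in group A, 0.20 in group B (an assumed scenario).
slope = np.where(df["group"] == "A", 0.30, 0.20)
df["y"] = slope * df["x"] + rng.normal(size=2 * n)

# Wrong: two separate tests, then comparing p-values informally.
for g in ["A", "B"]:
    fit = smf.ols("y ~ x", data=df[df["group"] == g]).fit()
    print(g, "slope p-value:", round(fit.pvalues["x"], 3))

# Right: one model with an interaction; its p-value tests the *difference* in slopes.
joint = smf.ols("y ~ x * group", data=df).fit()
print("interaction p-value:", round(joint.pvalues["x:group[T.B]"], 3))
```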

3. Inflating the Units of Analysis

  • What are the observational units on which the analysis is being performed?
  • What are your statistical tests actually testing?
  • Are there any "clustering" variables that make some observations more similar to one another than to the rest (see the sketch below)?
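Here is a minimal sketch of why the unit of analysis matters, using simulated repeated measurements and a statsmodels mixed-effects model with a random intercept per subject; the variable names and numbers are illustrative assumptions.

```python
# Sketch: when repeated measurements come from the same subject (a "clustering"
# variable), treating every measurement as independent inflates the effective n.
# One remedy is a mixed-effects model with a random intercept per subject.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
subjects = np.repeat(np.arange(20), 10)           # 20 subjects, 10 measurements each
subject_effect = rng.normal(scale=1.0, size=20)   # shared within-subject shift
treatment = (subjects % 2 == 0).astype(int)       # treatment assigned per subject
y = 0.3 * treatment + subject_effect[subjects] + rng.normal(size=len(subjects))
df = pd.DataFrame({"subject": subjects, "treatment": treatment, "y": y})

# Naive OLS treats 200 rows as 200 independent observations.
print(smf.ols("y ~ treatment", data=df).fit().pvalues["treatment"])

# Mixed model uses the subject as the clustering unit.
mixed = smf.mixedlm("y ~ treatment", data=df, groups=df["subject"]).fit()
print(mixed.pvalues["treatment"])
```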

4. Spurious Correlations

  • Were all variables checked for outliers?
  • Were all analyses examined for confounding and effect modification, as necessary?
  • What visualizations were examined in addition to the statistical tests?
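A small simulated example of how a single outlier can create an apparently significant correlation, and why plots and rank-based checks help:

```python
# Sketch: a single outlier can manufacture a "significant" correlation.
# The data below are simulated; x and y are truly unrelated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(size=30)
y = rng.normal(size=30)
x[0], y[0] = 8.0, 8.0            # one extreme, influential point

r, p = stats.pearsonr(x, y)
rho, p_s = stats.spearmanr(x, y)
print(f"Pearson  r = {r:.2f}, p = {p:.3f}   (typically inflated by the outlier)")
print(f"Spearman rho = {rho:.2f}, p = {p_s:.3f} (rank-based, far less affected)")
# A scatter plot (e.g. with matplotlib) would make the influential point obvious.
```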

5. Use of Small Samples

  • Is the sample size large enough for you to be confident in your conclusions?
  • Have you reported confidence intervals in addition to point estimates?
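A minimal sketch of reporting a confidence interval alongside the point estimate for a small sample; the data are simulated and the 95% level is an assumed choice.

```python
# Sketch: report a confidence interval, not just a point estimate, so the
# uncertainty that comes with a small sample is visible.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
sample = rng.normal(loc=0.4, scale=1.0, size=12)   # small, simulated sample

mean = sample.mean()
sem = stats.sem(sample)
ci_low, ci_high = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)
print(f"mean = {mean:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
# A wide interval signals that the point estimate alone should not be trusted.
```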

6. Circular Analysis

  • What analytical decision rules were made after performing preliminary analyses?
  • Were the selection criteria biased in favor of the hypothesis being tested?
  • See also the commentary from asthma research.
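A small simulation of the circular-analysis problem: if units are selected because they look responsive in the data, testing the effect in those same units overstates it. All numbers below are assumptions for illustration.

```python
# Sketch: selecting the "best-looking" units from the data and then testing the
# effect in those same units (circular analysis) inflates the apparent effect.
# Pure-noise simulation: there is no true effect anywhere.
import numpy as np

rng = np.random.default_rng(4)
n_units, n_trials = 100, 20
data = rng.normal(size=(n_units, n_trials))        # data used for selection AND testing
holdout = rng.normal(size=(n_units, n_trials))     # independent replication data

selected = data.mean(axis=1) > 0.3                 # selection based on the data itself
print("effect in selected units, same data:    ", data[selected].mean().round(2))
print("effect in selected units, held-out data:", holdout[selected].mean().round(2))
# The first number is biased upward by the selection; the second is close to zero.
```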

7. P-Hacking

  • Is your study confirmatory (testing a pre-specified hypothesis) or exploratory?
  • Did the specific outcome and/or exposure of interest change during the analysis?
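A small simulation of how trying multiple outcomes and reporting whichever one is significant inflates the false-positive rate; the number of outcomes and simulated studies are arbitrary choices.

```python
# Sketch: switching outcomes until one "works" inflates the false-positive rate.
# Pure-noise simulation; 10 candidate outcomes, no true effects anywhere.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n_studies, n_outcomes, n = 2000, 10, 30
false_positive = 0
for _ in range(n_studies):
    a = rng.normal(size=(n_outcomes, n))           # control group, 10 outcomes
    b = rng.normal(size=(n_outcomes, n))           # treatment group, 10 outcomes
    pvals = stats.ttest_ind(a, b, axis=1).pvalue
    false_positive += pvals.min() < 0.05           # report whichever outcome "worked"
print("false-positive rate:", false_positive / n_studies)  # far above the nominal 5%
```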

8. Multiple Comparisons

  • How many different statistical tests were examined during your analysis?
  • Did you use an omnibus test (rather than many pairwise comparisons) whenever possible? If not, were the p-values adjusted, as in the sketch below?
  • What if you had 5 different comparisons and only one was significant?
  • Would you trust this method if you reviewed this paper?
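When several comparisons are made, one common remedy is to adjust the p-values for the number of tests rather than reporting the single significant one at face value. A minimal sketch using statsmodels; the five p-values are made up for illustration.

```python
# Sketch: adjusting p-values for multiple comparisons (Holm method shown here).
from statsmodels.stats.multitest import multipletests

pvals = [0.04, 0.20, 0.45, 0.61, 0.72]   # 5 comparisons, only one "significant"
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="holm")
print("adjusted p-values:", p_adj.round(2))
print("still significant after correction:", reject)
```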

9. Over-Interpreting Non-Significant Results

  • At any point, was a non-significant effect interpreted as a lack/absence of an effect?
  • Are effect sizes and measures of uncertainty reported along with p-values?
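A minimal sketch of reporting the effect size and confidence interval alongside the p-value, so that "not significant" is not silently read as "no effect"; the data are simulated for illustration.

```python
# Sketch: "p > 0.05" is not evidence of no effect. Reporting the effect size and
# its confidence interval shows which effects remain plausible.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
control = rng.normal(loc=0.0, size=15)
treated = rng.normal(loc=0.5, size=15)   # a real underlying effect, small sample

t_stat, p = stats.ttest_ind(treated, control)
diff = treated.mean() - control.mean()
se = np.sqrt(treated.var(ddof=1) / 15 + control.var(ddof=1) / 15)
t_crit = stats.t.ppf(0.975, df=28)       # approximate 95% critical value
ci = (diff - t_crit * se, diff + t_crit * se)
print(f"p = {p:.2f}, difference = {diff:.2f}, 95% CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
# A non-significant p with a CI spanning large effects means "inconclusive",
# not "there is no effect".
```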

10. Correlation and Causation

  • Does your study design allow you to establish evidence of causation, or only of association?
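A small simulation of how a confounder can produce a correlation with no causal link, and how adjusting for it (when it is measured) changes the estimate; the coefficients and sample size are assumptions for illustration.

```python
# Sketch: a confounder can create a correlation between x and y even though
# neither causes the other. Adjusting for the measured confounder removes it.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 500
confounder = rng.normal(size=n)
x = 0.8 * confounder + rng.normal(size=n)   # x driven by the confounder
y = 0.8 * confounder + rng.normal(size=n)   # y driven by the confounder, not by x
df = pd.DataFrame({"x": x, "y": y, "c": confounder})

print(smf.ols("y ~ x", data=df).fit().params["x"].round(2))       # spurious association
print(smf.ols("y ~ x + c", data=df).fit().params["x"].round(2))   # near zero after adjustment
# Even so, only the study design (randomization, etc.) can justify causal claims.
```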