Type I and Type II Error
You’ll remember that Type II error is the probability of accepting the null hypothesis (or, in other words, “failing to reject the null hypothesis”) when we actually should have rejected it. This probability is signified by the letter β. In contrast, rejecting the null hypothesis when we really shouldn’t have is Type I error, signified by α. In this video, you’ll see pictorially where these values fall on a drawing of the two distributions: one in which H0 is true and one in which HA is true.
- Type I error (α): we incorrectly reject H0 even though the null hypothesis is true.
- Type II error (β): we incorrectly accept (or “fail to reject”) H0 even though the alternative hypothesis is true.
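To make these two error rates concrete, here is a minimal simulation sketch (this is my own illustration, not from the video; the one-sample t-test, sample size, and true means are arbitrary assumptions). It generates data where H0 is true to estimate α, then data where HA is true to estimate β:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, sims = 0.05, 30, 10_000

# Type I error: data generated under H0 (true mean really is 0);
# count how often we wrongly reject.
false_rejects = 0
for _ in range(sims):
    sample = rng.normal(loc=0.0, scale=1.0, size=n)
    if stats.ttest_1samp(sample, popmean=0.0).pvalue < alpha:
        false_rejects += 1
print(f"estimated alpha: {false_rejects / sims:.3f}")  # close to 0.05

# Type II error: data generated under HA (true mean is 0.5);
# count how often we fail to reject H0.
misses = 0
for _ in range(sims):
    sample = rng.normal(loc=0.5, scale=1.0, size=n)
    if stats.ttest_1samp(sample, popmean=0.0).pvalue >= alpha:
        misses += 1
print(f"estimated beta: {misses / sims:.3f}, power: {1 - misses / sims:.3f}")
```

Notice that the estimated Type I error rate lands near whatever α you chose, while β depends on how far the true mean sits from H0.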
An Error Mnemonic
Alternative hypothesis (HA): there is a wolf
Null hypothesis (H0): there is no wolf

- Type I error (α): we incorrectly reject the null hypothesis, that there isn’t a wolf (i.e., we believe there is a wolf), even though the null hypothesis is true (there is no wolf).
- Type II error (β): we incorrectly accept (or “fail to reject”) the null hypothesis (there is no wolf) even though the alternative hypothesis is true (there is a wolf).
Statistical Power
The power of a test is the probability that the test will reject the null hypothesis when the alternative hypothesis is true. In other words, it is the probability of not making a Type II error. Put yet another way: what is the power of our test to detect a difference between the two populations described by H0 and HA, if such a difference actually exists?
- Power (1 − β): the probability of correctly rejecting the null hypothesis (when the null hypothesis isn’t true).
- Type II error (β): the probability of failing to reject the null hypothesis (when the null hypothesis isn’t true).
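For a simple case, power can be sketched in closed form. The function below is a hypothetical helper (my own sketch, not from the video or any library), assuming a two-sided one-sample z-test with a standardized effect size d and sample size n:

```python
from scipy.stats import norm

def power_z(d, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test.

    d     -- standardized effect size, (mu1 - mu0) / sigma
    n     -- sample size
    alpha -- significance level (the Type I error rate)
    """
    z_crit = norm.ppf(1 - alpha / 2)   # rejection cutoff under H0
    shift = d * n ** 0.5               # where the HA distribution is centered
    # Probability the test statistic lands in the rejection region under HA
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)

print(round(power_z(d=0.5, n=30), 3))  # ~0.78: a 78% chance of detecting d = 0.5
```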
There are four interrelated components of power (each one is swept numerically in the sketch after this list):
- B: beta (β), since power is 1-β
- E: effect size, the difference between the means of the sampling distributions under H0 and HA. The greater the difference between these two means, the more power your test has to detect a difference. This is written mathematically as a normalized difference (d) between the means of the two populations: d = (μ1 − μ0)/σ.
- A: alpha (α), the significance level, typically set at 0.05; this is the cutoff at which we accept or reject our null hypothesis. Making α smaller (e.g., α = 0.01) makes it harder to reject H0. This makes power smaller.
- N: sample size (n). The larger you make the sample, the smaller the standard error becomes (SE = σ/√n). Basically, it makes the sampling distribution narrower, therefore making β smaller.
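Here is a minimal numerical sketch of those relationships, using the same normal-approximation formula as above (the baseline values d = 0.5, n = 30, α = 0.05 are arbitrary assumptions for illustration):

```python
from scipy.stats import norm

def power_z(d, n, alpha=0.05):
    """Two-sided one-sample z-test power (normal approximation)."""
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.cdf(d * n ** 0.5 - z_crit)  # lower-tail term is negligible here

print(f"baseline (d=0.5, n=30, a=0.05): {power_z(0.5, 30):.3f}")       # 0.782
print(f"E up:   bigger effect d=0.8:    {power_z(0.8, 30):.3f}")       # power rises
print(f"A down: stricter alpha=0.01:    {power_z(0.5, 30, 0.01):.3f}") # power falls
print(f"N up:   larger sample n=60:     {power_z(0.5, 60):.3f}")       # power rises
print(f"B:      beta at baseline:       {1 - power_z(0.5, 30):.3f}")   # power = 1 - beta
```

Each line changes one component while holding the others fixed, which is exactly the mental exercise suggested below.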
It really helps to see these graphically in the video. Try drawing out examples of how changing each component changes power until you get it, and feel free to ask questions (in the comments or by email).
Clinical versus Statistical Significance
Clinical significance is different from statistical significance. A difference between means, or a treatment effect, may be statistically significant but not clinically meaningful. For example, if the sample size is big enough, very small differences may be statistically significant (e.g., a one-pound change in weight, or 1 mmHg of blood pressure) even though they will have no real impact on patient outcomes. So it is important to pay attention to clinical significance as well as statistical significance when assessing study results. Clinical significance is determined using clinical judgment, as well as the results of other studies that demonstrate the downstream clinical impact of shorter-term study outcomes.
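To see how a large sample can make a clinically trivial difference “significant,” here is a quick sketch (the numbers are made up for illustration: a 1 mmHg difference in blood pressure, a standard deviation of 15 mmHg, and 10,000 patients per group):

```python
from math import sqrt
from scipy.stats import norm

diff, sd, n = 1.0, 15.0, 10_000        # hypothetical two-group comparison
se = sd * sqrt(2 / n)                  # standard error of the difference in means
z = diff / se
p = 2 * norm.sf(z)                     # two-sided p-value
print(f"z = {z:.2f}, p = {p:.1e}")     # p << 0.05: statistically significant,
                                       # yet 1 mmHg is clinically meaningless
```

With n this large the p-value is tiny, but no clinician would change treatment over 1 mmHg; that gap is the whole point of the distinction.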
Test your comprehension
With this problem set on power.