When a large number of independent statistical tests are performed on the same sample, the probability of obtaining at least one significant result purely by chance increases. That is, some P values will be less than 0.05 just by chance. At a significance level of 0.05, on average 1 out of 20 comparisons will, by definition, be declared significant by chance alone, even if there is in fact no real difference between the groups. In other words, performing multiple comparisons increases the risk of a type I error. The need to correct for multiple comparisons is generally accepted in the scientific community, and the correction can be made by different methods, such as the Bonferroni correction.[17] Less conservative methods are those of Holm,[18] Hochberg (1988),[20] Hommel (1988),[21] Benjamini and Hochberg (1995),[22] and Benjamini and Yekutieli (2001).[23] The first four methods (Bonferroni, Holm, Hochberg, and Hommel) control the family-wise error rate, that is, the probability of one or more false rejections. The term “family” refers to the collection of hypotheses H1, …, Hs that are being considered for joint testing.[17] In this setting, the type I error rate can be defined as the family-wise error rate.[23] Hochberg's and Hommel's methods are valid when the hypothesis tests are independent or non-negatively associated.[21] Hommel's method is more powerful than Hochberg's, but the difference is usually small and the Hochberg P values are easier to compute. The methods of Benjamini and Hochberg and of Benjamini and Yekutieli control the false discovery rate,[22,23,26] that is, the expected proportion of false positives among the rejected hypotheses. The false discovery rate is a less stringent condition than the family-wise error rate, so these methods are more powerful than the others.[17]
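
To make the procedures named above concrete, the following is a minimal Python sketch of how adjusted P values might be computed for the Bonferroni, Holm, Hochberg, Benjamini and Hochberg, and Benjamini and Yekutieli methods, together with the chance of at least one false positive across m independent tests. The function names and the helper `_step_adjust` are illustrative assumptions, not taken from the cited references, and Hommel's method is omitted for brevity. Established statistical software should be preferred for real analyses.

```python
def fwer_independent(m, alpha=0.05):
    """Chance of at least one false positive across m independent tests
    when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** m


def _step_adjust(pvals, factor, step_up):
    """Hypothetical shared helper: scale the sorted P values, then enforce
    monotonicity.

    factor(rank, m) is the multiplier for the P value with ascending
    rank `rank` (1-based) out of m tests.
    step_up=False: running maximum from the smallest P value (Holm).
    step_up=True:  running minimum from the largest P value
                   (Hochberg, Benjamini-Hochberg, Benjamini-Yekutieli).
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, ascending P
    scaled = [min(1.0, factor(rank, m) * pvals[i])
              for rank, i in enumerate(order, start=1)]
    if step_up:
        running = 1.0
        for k in range(m - 1, -1, -1):
            running = min(running, scaled[k])
            scaled[k] = running
    else:
        running = 0.0
        for k in range(m):
            running = max(running, scaled[k])
            scaled[k] = running
    adjusted = [0.0] * m
    for k, i in enumerate(order):  # restore the original input order
        adjusted[i] = scaled[k]
    return adjusted


def bonferroni(pvals):
    """Multiply every P value by the number of tests (capped at 1)."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]


def holm(pvals):
    """Step-down procedure; controls the family-wise error rate."""
    return _step_adjust(pvals, lambda rank, m: m - rank + 1, step_up=False)


def hochberg(pvals):
    """Step-up counterpart of Holm; controls the family-wise error rate
    under independence or non-negative association."""
    return _step_adjust(pvals, lambda rank, m: m - rank + 1, step_up=True)


def benjamini_hochberg(pvals):
    """Step-up procedure; controls the false discovery rate."""
    return _step_adjust(pvals, lambda rank, m: m / rank, step_up=True)


def benjamini_yekutieli(pvals):
    """Benjamini-Hochberg with an extra harmonic-sum penalty."""
    c = sum(1.0 / k for k in range(1, len(pvals) + 1))
    return _step_adjust(pvals, lambda rank, m: c * m / rank, step_up=True)


if __name__ == "__main__":
    example = [0.01, 0.02, 0.03, 0.04, 0.05]  # made-up P values for illustration
    print("P(at least one false positive), 20 tests:", round(fwer_independent(20), 3))
    for method in (bonferroni, holm, hochberg, benjamini_hochberg, benjamini_yekutieli):
        print(method.__name__, [round(p, 4) for p in method(example)])
```

With 20 independent tests at a significance level of 0.05, the first function gives a probability of roughly 0.64 of at least one false positive. For any input, the Holm-adjusted P values are never larger than the Bonferroni-adjusted ones, which illustrates why Holm's method is described as less conservative.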