Bonferroni - Multiple t-tests
The exact statement of your null hypothesis determines whether a Bonferroni correction applies. If you have a list of t-tests, and a significant result for even one of those t-tests rejects the null hypothesis, then a Bonferroni correction (or similar) applies.
Let's assume your hypothesis is "this instrument does not exhibit DIF", and you are going to test the hypothesis by looking at the statistical significance probabilities reported for each t-test in a list of t-tests. Then, by chance alone, we would expect about 1 out of every 20 t-tests to report p≤.05. So, if there are 20 or more t-tests in the list, then p≤.05 for an individual t-test is not meaningful evidence against the hypothesis. In fact, if we don't see at least one p≤.05, we may be surprised!
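As a rough check on this arithmetic, here is a minimal Python sketch. It assumes that the t-tests are independent and that every null hypothesis is true, which is an idealization (DIF tests computed from the same sample are not strictly independent). The function names are illustrative only and are not part of Winsteps.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of tests reporting p <= alpha when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one(n_tests, alpha=0.05):
    """Probability that at least one of n independent true-null tests reports p <= alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Print the expected count and the chance of at least one "significant" result.
    for n in (1, 8, 20, 100):
        print(n, expected_false_positives(n), round(prob_at_least_one(n), 3))
```

Under these assumptions, 20 t-tests are expected to produce one p≤.05 by chance, and the probability of seeing at least one is roughly .64.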
The Bonferroni correction says, "if any of the t-tests in the list has p≤.05/(number of t-tests in the list), then the hypothesis is rejected".
What is important is the number of tests, not how many of them are reported to have p≤.05.
If you wish to make a Bonferroni multiple-significance-test correction, compare the reported significance probability with your chosen significance level, e.g., .05, divided by the number of t-tests in the Table. According to Bonferroni, if you are testing the null hypothesis "There is no effect in this test" at the p≤.05 level, then the most significant effect must have p ≤ .05 / (number of item DIF contrasts) for the null hypothesis of no effect to be rejected.
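The comparison described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the p-values are made up for the example rather than taken from any Winsteps table.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the joint level at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def reject_joint_null(p_values, alpha=0.05):
    """Reject the joint null hypothesis if the smallest p-value is at or
    below alpha divided by the number of tests in the list."""
    return min(p_values) <= bonferroni_threshold(alpha, len(p_values))


# Hypothetical p-values for 8 item DIF contrasts (invented for illustration).
example_p = [0.30, 0.04, 0.62, 0.008, 0.15, 0.47, 0.09, 0.71]

if __name__ == "__main__":
    print("threshold:", bonferroni_threshold(0.05, len(example_p)))
    print("reject joint null:", reject_joint_null(example_p))
```

With 8 tests the per-test threshold is .05/8 = .00625, so the smallest made-up p-value, .008, does not reject the joint hypothesis, although it would count as significant if tested on its own at .05.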
Question: Winsteps Tables report many t-tests. Should Bonferroni adjustments for multiple comparisons be made?
Reply: It depends on how you are conducting the t-tests. For instance, in Table 30.1, if your hypothesis (before examining any data) is "there is no DIF for this CLASS in comparison to that CLASS on this item", then the reported probabilities are correct.
If you have 20 items, then one is expected to fail the p ≤ .05 criterion by chance alone. So if your hypothesis (before examining any data) is "there is no DIF in this set of items for any CLASS", then adjust individual t-test probabilities accordingly.
In general, we do not consider the rejection of a hypothesis test to be "substantively significant", unless it is both very unlikely (i.e., statistically significant) and reflects a discrepancy large enough to matter (i.e., to change some decision). If so, even if there is only one such result in a large data set, we may want to take action. This is much like sitting on the proverbial needle in a haystack. We take action to remove the needle from the haystack, even though statistical theory says, "given a big enough haystack, there will probably always be a needle in it somewhere."
A strict Bonferroni correction for n multiple significance tests at joint level α is α/n for each single test. This accepts or rejects the entire set of multiple tests. In an example of a 100-item test with 20 bad items (.005 < p < .01), the threshold value for cutoff with α = .05 would be p ≤ .0005. None of the 20 bad items reaches this threshold, so the entire set of items is accepted.
Benjamini and Hochberg (1995) suggest that an incremental application of Bonferroni correction overcomes some of its drawbacks. Here is their procedure:
i) Perform the n single significance tests.
ii) Number them in ascending order by probability P(i), where i = 1, ..., n.
iii) Identify k, the largest value of i for which P(i) ≤ α * i/n, where α = .05 or α = .01.
iv) Reject the null hypothesis for i = 1, ..., k.
In an example of a 100-item test with 20 bad items (with .005 < p < .01), the threshold values for cutoff with α = .05 would be: .0005 for the 1st item, .005 for the 10th item, .01 for the 20th item, .015 for the 30th item. The 20th smallest probability is below .01, so k would be at least 20 and perhaps more. All 20 bad items are flagged for rejection.
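Steps i) to iv) can be sketched in Python as follows. This is a minimal illustration of the step-up procedure, not code from Winsteps; the function name is hypothetical and the p-values are invented to echo the 100-item example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices (into p_values) of the tests rejected by the
    Benjamini-Hochberg step-up procedure at level alpha."""
    n = len(p_values)
    # Step ii: order the tests by ascending probability.
    order = sorted(range(n), key=lambda idx: p_values[idx])
    # Step iii: find k, the largest rank i with P(i) <= alpha * i / n.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / n:
            k = rank
    # Step iv: reject the null hypothesis for ranks 1..k.
    return sorted(order[:k])


# Invented data: 20 "bad" items with .005 < p < .01, then 80 items with large p.
bad = [0.005 + 0.0002 * (j + 1) for j in range(20)]   # .0052 up to .0090
good = [0.20 + 0.01 * j for j in range(80)]           # .20 up to .99
p_values = bad + good

flagged = benjamini_hochberg(p_values, alpha=0.05)

if __name__ == "__main__":
    print("items flagged:", len(flagged))
    print("strict Bonferroni threshold:", 0.05 / len(p_values))
```

For these invented values the procedure flags exactly the 20 bad items, whereas the strict Bonferroni threshold of .0005 is not reached by any of them.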
Benjamini Y. & Hochberg Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society B, 57, 1, 289-300.
Example of whether to Bonferroni or not ...
Hypothesis 1: There is no DIF between men and women on item 1.
This is tested for item 1 in Table 30.1
or there is no DPF between "addition items" and "subtraction items" for George
This is tested for George in Table 31.1
Hypothesis 2: There is no DIF between men and women on the 8 items of this test.
Bonferroni correction:
Look at the 8 pairwise DIF tests in Table 30.1.
Choose the smallest p-value = p.
Divide the significance level by 8: .05/8 = .00625.
If p ≤ .05/8, then reject hypothesis 2.
Or there is no DPF between "addition items" and "subtraction items" across the 1000 persons in the sample. Bonferroni is applied to Table 31.1.
Question: "Does this mean that if one or a few t-tests turn out significant, you should reject the whole set of null hypotheses and you cannot tell which items exhibit DIF?"
Answer: You are combining two different hypotheses. Either you want to test the whole set (hypothesis 2) or individual items (hypothesis 1). In practice, we want to test individual items. So Bonferroni does not apply.
Let's contrast items (each of which is carefully and individually constructed) against a random sample from the population.
We might ask: "Is there Differential Person Functioning by this sample across these two types of items?" (Hypothesis 2 - Bonferroni) because we are not interested in investigating (and probably cannot investigate) individuals.
But we (and the lawyers) are always asking "is there Differential Item Functioning on this particular item for men and women?" (Hypothesis 1 - not Bonferroni).
Help for Winsteps Rasch Measurement Software: www.winsteps.com. Author: John Michael Linacre