Validity investigation
Question: Can you help me understand, in simple terms, how in practice I can go about collecting validity evidence via Rasch analysis with Winsteps to support the use and interpretation of my test?
Answer: There are many types of validity described in the literature, but they summarize to a few main topics:
1. Content validity: does the test measure what it is intended to measure? Content validity is determined by content experts. If a panel of experts is used, then Winsteps can be used to analyze their content-relevance ratings of each item - a more sophisticated approach than Lawshe's (1975) "A Quantitative Approach to Content Validity".
2. Construct validity: does the hierarchy of item difficulties accord with the construct theory underlying the items? For instance, are the "division" items harder than the "addition" items in general?
3. Predictive validity: does the test produce measures which correspond to what we know about the persons? Do children in higher grades have higher measures (thetas)?
Investigation of these validities is performed directly by inspection of the results of the analysis (Rasch or Classical or ...), or indirectly through correlations of the Rasch measures (or raw scores, etc.) with other numbers which are thought to be good indicators of what we want.
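The indirect route mentioned above can be sketched in a few lines of code. This is a hypothetical illustration, not Winsteps output: the person measures and the external criterion (here, invented teacher ratings) are made-up numbers standing in for values you would export from a real analysis.

```python
# Hypothetical sketch: correlating Rasch person measures (as might be
# exported from a Winsteps person output file) with an external
# indicator such as teacher ratings. All numbers are invented.

def pearson_r(x, y):
    """Plain-Python Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

measures = [-1.2, -0.4, 0.1, 0.8, 1.5, 2.0]  # invented Rasch measures (logits)
ratings = [2, 3, 3, 4, 5, 5]                 # invented teacher ratings

r = pearson_r(measures, ratings)
```

A strong positive correlation would be one piece of indirect evidence that the measures behave as the external indicator leads us to expect.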
Question: That is, exactly what type of validity questions should I ask, and how can I answer them using Rasch analysis?
Answer: 1. Construct validity: we need a "construct theory" (i.e., some idea about our latent variable) - we need to state explicitly, before we do our analysis, what will be a more-difficult item, and what will be a less-difficult item.
Certainly we can all do that with arithmetic items: 2+2=? is easy. 567856+97765=? is hard.
If the Table 1 item map agrees with your statement, then the test has "construct validity". It is measuring what you intended to measure.
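The item-map comparison above can be made explicit as a rank correlation between the hypothesized difficulty order and the estimated difficulties. This is a hypothetical sketch: the item names and logit difficulties are invented, standing in for values from a real item output file.

```python
# Hypothetical sketch: does the empirical item-difficulty order agree
# with the construct theory? Items and logit values are invented.

hypothesized_order = ["2+2", "13+25", "48-19", "7x8", "156/12"]  # easy -> hard
estimated_difficulty = {  # invented Rasch item difficulties (logits)
    "2+2": -2.1, "13+25": -0.8, "48-19": -0.2, "7x8": 0.9, "156/12": 1.7,
}

def spearman_rho(order, difficulty):
    """Spearman correlation between theoretical and empirical item ranks."""
    theory_rank = {item: i for i, item in enumerate(order)}
    empirical = sorted(order, key=lambda it: difficulty[it])
    empirical_rank = {item: i for i, item in enumerate(empirical)}
    n = len(order)
    d2 = sum((theory_rank[i] - empirical_rank[i]) ** 2 for i in order)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

rho = spearman_rho(hypothesized_order, estimated_difficulty)
```

A rho near 1 says the empirical hierarchy matches the theory; a low or negative rho signals a construct-validity problem worth investigating item by item.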
2. Predictive validity: we need to state explicitly, before we do our analysis, what the characteristics of a person with a higher measure will be, and what the characteristics of a person with a lower measure will be. And preferably code these into the person labels in our Winsteps control file.
For arithmetic, we expect older children, children in higher grades, children with better nutrition, children with fewer developmental or discipline problems, etc. to have higher measures. And the reverse for lower measures.
If the Table 1 person map agrees with your statement, then the test has "predictive validity". It is "predicting" what we expected it to predict. (In statistics, "predict" doesn't mean "predict the future"; "predict" means predict some numbers obtained by other means.)
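The grade-level expectation above can be checked mechanically: group person measures by grade and verify that the group means increase. The grades and measures below are invented for the sketch; in practice they would come from the person labels and measures of a real analysis.

```python
# Hypothetical sketch: do mean person measures rise with grade level?
# (grade, measure-in-logits) pairs are invented for illustration.
from collections import defaultdict

persons = [
    (3, -1.0), (3, -0.6), (3, -1.4),
    (4,  0.1), (4, -0.3), (4,  0.5),
    (5,  0.9), (5,  1.3), (5,  0.7),
]

by_grade = defaultdict(list)
for grade, measure in persons:
    by_grade[grade].append(measure)

means = {g: sum(ms) / len(ms) for g, ms in sorted(by_grade.items())}
grades = sorted(means)
increasing = all(means[a] < means[b] for a, b in zip(grades, grades[1:]))
```

If `increasing` is true, the person hierarchy agrees with our expectation, which is evidence of predictive validity in the sense described above.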
Question: More specifically, when using, for example, DIF analysis via Winsteps, what type of validity question am I trying to answer?
Answer: 1. Construct validity: DIF implies that the item difficulty is different for different groups. The meaning of the construct has changed! Perhaps the differences are too small to matter. Perhaps omitting the DIF item will solve the problem. Perhaps making the DIF item into two items will solve the problem.
For instance, questions about "snow" change their difficulty. In polar countries they are easy. In tropical countries they are difficult. When we discover this DIF, we would define this as two different items, and so maintain the integrity of the "weather-knowledge" construct.
2. Predictive validity: DIF implies that the predictions made for one group of persons, based on their measures, differs from the predictions made for another group. Do the differences matter? Do we need separate measurement systems? ...
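The core DIF idea in the "snow" example can be sketched directly: estimate each item's difficulty separately in the two groups and flag items whose difficulty contrast exceeds a practical threshold (0.5 logits is a common rule of thumb, though Winsteps also reports significance tests). The group difficulties below are invented.

```python
# Hypothetical sketch of DIF detection: an item shows DIF when its
# difficulty, estimated separately in two groups, differs by more than
# a practical threshold. All logit values are invented.

difficulty_polar = {"snow": -1.5, "rain": 0.2, "wind": 0.4}
difficulty_tropical = {"snow": 1.1, "rain": 0.3, "wind": 0.5}

def dif_items(group_a, group_b, threshold=0.5):
    """Items whose difficulty contrast between groups exceeds threshold."""
    return {item: group_b[item] - group_a[item]
            for item in group_a
            if abs(group_b[item] - group_a[item]) > threshold}

flagged = dif_items(difficulty_polar, difficulty_tropical)
```

Here "snow" is flagged with a large contrast, matching the example in the text: the item is much harder for the tropical group, so splitting it into two group-specific items (or dropping it) would protect the integrity of the construct.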
Question: Similarly, when using fit statistics, dimensionality, and the order of item difficulty, what type of validity questions am I attempting to answer via Winsteps?
Answer: They are the same questions every time. Construct Validity and Predictive Validity. Is there a threat to validity? Is it big enough to matter in a practical way? What is the most effective way of lessening or eliminating the threat?
Question: I used the numbers in my Winsteps output to demonstrate the validity of my instrument, but a reviewer says that is not enough.
Answer: Your validity evidence seems to relate only to statistical validity. Generally speaking, this is of lower concern than:
1. Construct/Content validity - is the instrument measuring what it is intended to measure: e.g., Are these arithmetic items? Does their difficulty order agree with the construct theory about which arithmetic items are easier (one digit addition) and which are harder (long division)? You may need a content expert to assist with this.
2. Predictive validity - do the measures make sense with our experience of people whom we perceive to have more and less of what we intend to measure? For instance, with increasing elementary-school grade-levels do person (children) measures (thetas) increase on average? We may tie this to the results of another accepted instrument = concurrent validity.
3. If we satisfy (1) and (2), we can then proceed to the type of fine-tuning that you are discussing: statistical validity.
Are there off-dimensional, ambiguous, duplicative, etc., items that should be dropped or rewritten?
Are there items that have DIF, e.g., gender DIF: an arithmetic item that references cooking or carpentry?
Then we need to decide how many levels of competence the instrument is intended to detect. This ties in with the test (= person sample) reliability. If we only need to separate high performers from low performers, then a reliability of 0.8 is enough. For high-middle-low, we need 0.9. For more levels, we need to go further toward 1.0.
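One common way to connect reliability to "number of detectable levels" is through the Rasch separation index, computed from the person measures and their standard errors (as a Rasch program reports them). This is a hedged sketch with invented measures and SEs; the formulas shown (true variance = observed variance minus mean error variance; strata = (4 x separation + 1) / 3) are standard in the Rasch literature, though a real program's estimates may differ in detail.

```python
# Hypothetical sketch: person reliability, separation, and strata from
# person measures and standard errors. All numbers are invented.

measures = [-1.5, -0.8, -0.2, 0.4, 1.0, 1.7]    # logits, invented
ses = [0.45, 0.40, 0.38, 0.38, 0.40, 0.45]      # standard errors, invented

n = len(measures)
mean = sum(measures) / n
observed_var = sum((m - mean) ** 2 for m in measures) / n
error_var = sum(se ** 2 for se in ses) / n       # mean error variance
true_var = observed_var - error_var              # variance not due to error

reliability = true_var / observed_var            # analogous to "test" reliability
separation = (true_var / error_var) ** 0.5
strata = (4 * separation + 1) / 3                # statistically distinct levels
```

With these invented numbers the reliability is about 0.85 and the strata value about 3.5, i.e., roughly three distinguishable performance levels, consistent with the high-middle-low guideline above.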
Help for Winsteps Rasch Measurement and Rasch Analysis Software: www.winsteps.com. Author: John Michael Linacre