Missing data
One of Ben Wright's requirements for valid measurement, derived from the work of L.L. Thurstone, is that "Missing data must not matter." Of course, missing data always matter in the sense that they lessen the amount of statistical information available for the construction and quality control of measures. Further, if the missing data, intentionally or unintentionally, skew the measures (e.g., incorrect answers are coded as "missing responses"), then missing data definitely do matter. But generally, missing data are missing essentially at random (by design or accident) or in some way that will have minimal impact on the estimated measures (e.g., adaptive tests).
Winsteps does not require complete data in order to make estimates. One reason that Winsteps uses JMLE is that it is very flexible as regards estimable data structures. For each parameter (person, item or Rasch-Andrich threshold) there are sufficient statistics: the marginal raw scores and counts of the non-missing observations. During Winsteps estimation, the observed marginal counts and the observed and expected marginal scores are computed from the same set of non-missing observations. Missing data are skipped over in these additions. When required, Winsteps can compute an expected value for every observation (present or missing) for which the item and person estimates are known.
The basic estimation algorithm used by Winsteps is:
Improved parameter estimate = current parameter estimate
+ (observed marginal score - expected marginal score) / (modeled variance of the expected marginal score)
The observed and expected marginal scores are obtained by summing across the non-missing data. The expected score and its variance are obtained by Rasch estimation using the current set of parameter estimates; see RSA.
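The update rule above can be sketched for the simplest case: one person measure, dichotomous items, and item difficulties held fixed. This is only an illustration of the formula, not the Winsteps implementation; the function names, the use of None for a missing response, and the example difficulties are assumptions made for the sketch.

```python
import math

def rasch_prob(person, item):
    """Dichotomous Rasch probability of scoring 1."""
    return 1.0 / (1.0 + math.exp(item - person))

def improve_person_estimate(current, responses, item_difficulties):
    """One update step for a person measure; None marks a missing response."""
    observed = expected = variance = 0.0
    for x, d in zip(responses, item_difficulties):
        if x is None:
            continue  # missing data are skipped in every sum
        p = rasch_prob(current, d)
        observed += x              # observed marginal score
        expected += p              # expected marginal score
        variance += p * (1.0 - p)  # modeled variance of the expected score
    return current + (observed - expected) / variance

difficulties = [-1.0, 0.0, 1.0, 2.0]   # hypothetical item difficulties in logits
responses = [1, None, 1, 0]            # the second item was not administered
theta = 0.0
for _ in range(20):
    theta = improve_person_estimate(theta, responses, difficulties)
print(round(theta, 2))
```

At convergence the expected score summed over the three non-missing items equals the observed score of 2. An extreme score (all 0 or all 1 on the non-missing items) has no finite estimate, so this bare loop would diverge for it.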
If data are missing, or observations are made, in such a way that measures cannot be constructed unambiguously in one frame of reference, then the message
WARNING: DATA MAY BE AMBIGUOUSLY CONNECTED INTO nnn SUBSETS
is displayed on the Iteration screen to warn of ambiguous connection.
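A rough way to see how disconnection arises is to treat persons and items as nodes that are linked whenever an observation is present, and then count the linked groups. The sketch below does only that. It is a simplified illustration and an assumption about what "connected" means; the check Winsteps performs may be stricter.

```python
def count_connected_groups(data):
    """Count groups of persons and items linked by non-missing observations.

    data: list of person rows of equal length; None marks a missing observation.
    """
    n_persons = len(data)
    n_items = len(data[0])
    # Nodes 0..n_persons-1 are persons; the remaining nodes are items.
    parent = list(range(n_persons + n_items))

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for p, row in enumerate(data):
        for i, x in enumerate(row):
            if x is not None:
                parent[find(p)] = find(n_persons + i)  # link person and item

    return len({find(node) for node in range(n_persons + n_items)})

# Persons 1-2 answered only items 1-2; persons 3-4 answered only items 3-4.
disjoint = [
    [1, 0, None, None],
    [0, 1, None, None],
    [None, None, 1, 0],
    [None, None, 0, 1],
]
print(count_connected_groups(disjoint))  # 2

# One extra person who answered items 1 and 4 links the two groups.
linked = disjoint + [[1, None, None, 0]]
print(count_connected_groups(linked))    # 1
```

In this sketch a person or item with no observations at all counts as a group of its own.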
Missing data in Tables 23, 24: Principal Components Analysis.
For raw observations, missing data are treated as missing. Pairwise deletion is used during the correlation computations.
For residuals, missing data are treated as 0, their expected values. This attenuates the contrasts, but makes them estimable.
You can try different methods for missing data by writing an IPMATRIX= of the raw data to a file, and then using your own statistical software to analyze.
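The difference between the two treatments can be illustrated with a small sketch that correlates two item columns, once with pairwise deletion and once with missing values replaced by 0. The residual values are made up for the illustration, and the function names are assumptions; this is not how Winsteps is implemented internally.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def pairwise_deletion_corr(col_a, col_b):
    """Use only the persons observed on both items (None = missing)."""
    pairs = [(a, b) for a, b in zip(col_a, col_b)
             if a is not None and b is not None]
    xs, ys = zip(*pairs)
    return pearson(xs, ys)

def zero_filled_corr(col_a, col_b):
    """Replace missing residuals by 0, their expected value."""
    def fill(col):
        return [0.0 if v is None else v for v in col]
    return pearson(fill(col_a), fill(col_b))

# Hypothetical residuals for two items across six persons.
item_a = [0.5, -0.5, 0.4, -0.4, None, None]
item_b = [0.4, -0.4, 0.5, -0.5, 0.3, -0.3]
print(round(pairwise_deletion_corr(item_a, item_b), 3))
print(round(zero_filled_corr(item_a, item_b), 3))
```

With these made-up values the zero-filled correlation is smaller in magnitude than the pairwise one, which is the attenuation described above.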
Example 1: Missing observations are scored "1"
ni=4
codes=01
name1=1
item1=1
codes = 01A
ptbis=YES
misscore=1
&end
END LABELS
0110
1001
A001 ; A is in Codes= but scored "missing"
B101 ; B is not in Codes= but is scored 1 by MISSCORE=1
---------------------------------------------------------------------
|ENTRY   DATA  SCORE |    DATA    | AVERAGE  S.E.  OUTF PTBSE|      |
|NUMBER  CODE  VALUE | COUNT    % | MEASURE  MEAN  MNSQ CORR.| ITEM | Correlation:
|--------------------+------------+--------------------------+------| Code with Score
|    1   A     ***   |      1  25*|    -.70              -.50|I0001 | A {0,0,1,-}{2,1,1,2}
|        0       0   |      1  33 |    -.01          .7  1.00|      | 0 {1,0,-,-}{2,1,1,2}
|        1       1   |      1  33 |    -.01*        1.4 -1.00|      | 1 {0,1,-,-}{2,1,1,2}
|        MISSING 1   |      1  33 |    1.26          .4   .50|      | B {0,0,-,1}{2,1,1,2}
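The correlations in the right-hand annotation can be checked by hand. In the sketch below, each code's 0/1 indicator across the four persons is correlated with the person scores {2,1,1,2}, which appear to be each person's raw score on the other three items. A "-" in the annotation is represented by None and that person is left out of the correlation. The function name is an assumption for the illustration.

```python
import math

def code_score_correlation(indicators, scores):
    """Correlate a code's 0/1 indicator with person scores, skipping None."""
    pairs = [(i, s) for i, s in zip(indicators, scores) if i is not None]
    n = len(pairs)
    mean_i = sum(i for i, _ in pairs) / n
    mean_s = sum(s for _, s in pairs) / n
    sis = sum((i - mean_i) * (s - mean_s) for i, s in pairs)
    sii = sum((i - mean_i) ** 2 for i, _ in pairs)
    sss = sum((s - mean_s) ** 2 for _, s in pairs)
    return sis / math.sqrt(sii * sss)

scores = [2, 1, 1, 2]  # person scores from the table annotation
patterns = {
    "A": [0, 0, 1, None],
    "0": [1, 0, None, None],
    "1": [0, 1, None, None],
    "B": [0, 0, None, 1],
}
for code, indicators in patterns.items():
    print(code, round(code_score_correlation(indicators, scores), 2))
# A -0.5
# 0 1.0
# 1 -1.0
# B 0.5
```

These values agree with the PTBSE CORR. column: -.50, 1.00, -1.00 and .50.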
Example 2: Missing data: two types: "skipped" and "not reached"
Much more at www.rasch.org/rmt/rmt142h.htm
Missing responses in a dataset do not all have the same meaning. For instance, in a timed multiple-choice test, missing responses between observed responses may mean "skipped": the respondent decided this question was too hard. Missing responses at the end of the test can mean "not reached", because time ran out before the respondent could respond to these items.
Solution: enter two different missing data codes in the data file, for instance, "S" for skipped and "R" for not reached. Then,
when calibrating the items, we want to ignore "not reached" responses, but score skipped responses as wrong:
CODES=01S
NEWSCORE=010 ; S is scored 0
MISSING-SCORED= -1 ; data code R is not in CODES= so it will be scored -1 = "ignore", "not administered"
IFILE= item-calibrations.txt
Then, when we measure the persons, both skipped and not-reached responses are scored as wrong:
IAFILE= item-calibrations.txt ; anchor the item difficulties at their good calibrations
CODES=01SR
NEWSCORE=0100 ; S and R are scored 0
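The effect of the two scoring schemes on a single response string can be sketched as follows. This only mimics the recoding described above for illustration; the function names and the use of None for "ignore" are assumptions, and Winsteps does this internally from the control-file settings.

```python
IGNORE = None  # stands in for a response scored -1 = "ignore", "not administered"

def score_for_calibration(response_string):
    """Item calibration: S (skipped) is wrong, R (not reached) is ignored."""
    table = {"0": 0, "1": 1, "S": 0}
    return [table.get(code, IGNORE) for code in response_string]

def score_for_measurement(response_string):
    """Person measurement: both S and R are scored as wrong."""
    table = {"0": 0, "1": 1, "S": 0, "R": 0}
    return [table.get(code, IGNORE) for code in response_string]

responses = "1S01RR"  # one skipped item, last two items not reached
print(score_for_calibration(responses))  # [1, 0, 0, 1, None, None]
print(score_for_measurement(responses))  # [1, 0, 0, 1, 0, 0]
```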
Example 3: Missing categories in a rating scale
This is a different type of missing data.
Unobserved intermediate categories:
Keep them in the rating scale: STKEEP=yes
Recount the rating scale categories without them: STKEEP=No
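What "recounting" means can be pictured with a small sketch: the observed categories are renumbered consecutively, starting from the lowest observed category, so that the unobserved intermediate category disappears. The exact renumbering shown here is an assumption made for illustration; check the STKEEP= documentation for the precise behavior.

```python
def recount_categories(observed_categories):
    """Map observed category codes to consecutive values (unobserved ones dropped)."""
    ordered = sorted(set(observed_categories))
    lowest = ordered[0]
    return {category: lowest + rank for rank, category in enumerate(ordered)}

observed = [1, 2, 2, 4, 5, 4, 1]     # category 3 was never observed
print(recount_categories(observed))  # {1: 1, 2: 2, 4: 3, 5: 4}
```

Keeping the unobserved category instead leaves the original numbering 1 to 5 unchanged.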
Unobserved extreme (top or bottom) categories:
The bottom category has not been observed for one of your items, so Winsteps has omitted it. For Winsteps to include it, we need to tell Winsteps about it. Let's say the errant item is item 6, and the rating scale categories are 1,2,3,4,5. Here are two simple methods:
1. You may be using the Partial Credit Model, PCM, in which each item is modeled to have its own rating scale. If so, put the errant item and another item(s) like it in the same item group:
Now: ISGROUPS=0
Becomes: ISGROUPS=000001100000000
So all items are PCM except the two (items 6 and 7) that share the same rating scale.
If the Likert rating scales are similar for many of the items, consider grouping similar items together. This will simplify the analysis, and make communicating your findings to your audience easier.
2. Add a dummy person to the dataset who is observed in the bottom category of item 6 and a non-bottom category of another item. You can give this person a small weight if you don't want the person to change things.
Let's put the dummy person as the first person in the data file:
Dummy person xxxxx12xxxxxxxx ; x is missing data
and in your control file:
PWEIGHT=*
1 .001 ; person 1 given a small weight
*
There are other options, but they are more complicated.
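For longer tests, typing the grouping string and the dummy data line by hand is error-prone. A small helper script can generate both. The sketch below is a hypothetical convenience, not part of Winsteps; the function names and parameters are assumptions, and it reproduces the 15-item example above.

```python
def isgroups_string(n_items, shared_items, shared_code="1", own_code="0"):
    """Build an ISGROUPS= value: shared_items share one scale, others get '0'."""
    return "".join(
        shared_code if item in shared_items else own_code
        for item in range(1, n_items + 1)
    )

def dummy_person_row(n_items, responses, missing_code="x"):
    """Build a response string; items not listed get the missing-data code."""
    return "".join(
        responses.get(item, missing_code) for item in range(1, n_items + 1)
    )

print("ISGROUPS=" + isgroups_string(15, {6, 7}))
# ISGROUPS=000001100000000
print("Dummy person " + dummy_person_row(15, {6: "1", 7: "2"}))
# Dummy person xxxxx12xxxxxxxx
```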
Help for Winsteps Rasch Measurement and Rasch Analysis Software: www.winsteps.com. Author: John Michael Linacre