Missing data

One of Ben Wright's requirements for valid measurement, derived from the work of L.L. Thurstone, is that "Missing data must not matter." Of course, missing data always matter in the sense that they lessen the amount of statistical information available for the construction and quality control of measures. Further, if the missing data, intentionally or unintentionally, skew the measures (e.g., incorrect answers are coded as "missing responses"), then missing data definitely do matter. But generally, missing data are missing essentially at random (by design or accident) or in some way that will have minimal impact on the estimated measures (e.g., in adaptive tests).

Winsteps does not require complete data in order to make estimates. One reason that Winsteps uses JMLE is that it is very flexible as regards estimable data structures. For each parameter (person, item or Rasch-Andrich threshold) there are sufficient statistics: the marginal raw scores and counts of the non-missing observations. During Winsteps estimation, the observed marginal counts and the observed and expected marginal scores are computed from the same set of non-missing observations. Missing data are skipped over in these additions. When required, Winsteps can compute an expected value for every observation (present or missing) for which the item and person estimates are known.

The basic estimation algorithm used by Winsteps is:

Improved parameter estimate = current parameter estimate + (observed marginal score - expected marginal score) / (modeled variance of the expected marginal score)

The observed and expected marginal scores are obtained by summing across the non-missing data. The expected score and its variance are obtained by Rasch estimation using the current set of parameter estimates; see RSA (Rating Scale Analysis, Wright & Masters).
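As a minimal sketch of the update above, assuming dichotomous (0/1) data and item difficulties held fixed, the following Python applies the formula to a single person measure. The function names and the use of None to mark a missing observation are illustrative assumptions, not Winsteps internals.

```python
import math

def expected_and_variance(ability, difficulty):
    """Dichotomous Rasch model: expected score p and model variance p*(1-p)."""
    p = 1.0 / (1.0 + math.exp(difficulty - ability))
    return p, p * (1.0 - p)

def update_person(ability, responses, difficulties):
    """One estimation step for a person measure.

    responses holds 0, 1 or None; None (missing) is skipped in every sum,
    so observed, expected and variance all use the same observations.
    """
    observed = expected = variance = 0.0
    for x, d in zip(responses, difficulties):
        if x is None:
            continue  # missing data are skipped over in these additions
        p, v = expected_and_variance(ability, d)
        observed += x
        expected += p
        variance += v
    return ability + (observed - expected) / variance

# Hypothetical item difficulties (logits) and one person's responses.
difficulties = [-1.0, 0.0, 1.0, 2.0]
responses = [1, None, 0, None]

ability = 1.0  # arbitrary starting value
for _ in range(20):
    ability = update_person(ability, responses, difficulties)
print(round(ability, 4))
```

Because the missing responses contribute to neither the observed nor the expected score, the estimate depends only on the two items this person actually answered.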

If data are missing, or observations are made, in such a way that measures cannot be constructed unambiguously in one frame of reference, then the message

WARNING: DATA MAY BE AMBIGUOUSLY CONNECTED INTO nnn SUBSETS

is displayed on the Iteration screen to warn of ambiguous connection.
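The idea of connection can be sketched with a simple linkage check, assuming that persons and items are linked whenever a non-missing observation joins them. This is only a rough illustration: the function name is hypothetical, and Winsteps's own subset detection may apply further criteria beyond the presence of an observation.

```python
def count_subsets(data):
    """Count groups of persons and items linked by non-missing observations.

    data is a list of person rows; None marks a missing observation.
    A person or item with no observations at all counts as its own group.
    """
    n_persons = len(data)
    n_items = len(data[0])
    # Nodes 0..n_persons-1 are persons; the rest are items.
    parent = list(range(n_persons + n_items))

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for p, row in enumerate(data):
        for i, x in enumerate(row):
            if x is not None:
                parent[find(p)] = find(n_persons + i)  # link person to item

    return len({find(n) for n in range(n_persons + n_items)})

# Two groups of persons who answered entirely different items.
disjoint = [
    [1, 0, None, None],
    [0, 1, None, None],
    [None, None, 1, 0],
    [None, None, 0, 1],
]
# The same design, but the second person also answered the third item.
linked = [
    [1, 0, None, None],
    [0, 1, 1, None],
    [None, None, 1, 0],
    [None, None, 0, 1],
]
print(count_subsets(disjoint), count_subsets(linked))  # prints: 2 1
```

A single shared observation is enough to join the two groups in this sketch, which is why linking items or linking persons are the usual remedy for disconnected designs.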

Missing data in Tables 23, 24: Principal Components Analysis.

For raw observations, missing data are treated as missing. Pairwise deletion is used during the correlation computations.

For residuals, missing data are treated as 0, their expected values. This attenuates the contrasts, but makes them estimable.

You can try different methods for missing data by writing an IPMATRIX= of the raw data to a file, and then analyzing that file with your own statistical software.
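The difference between the two treatments can be illustrated with a small Python sketch. The residual values are made up for illustration, and the helper names are hypothetical. It compares pairwise deletion with replacing missing residuals by 0 when correlating two items.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def pairwise_deletion(xs, ys):
    """Correlate using only the persons observed on both items."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    return pearson([x for x, _ in pairs], [y for _, y in pairs])

def zero_filled(xs, ys):
    """Correlate after replacing each missing residual by 0."""
    def fill(values):
        return [0.0 if v is None else v for v in values]
    return pearson(fill(xs), fill(ys))

# Made-up residuals for two items across six persons; None = missing.
item_a = [1, -1, 1, -1, None, None]
item_b = [1, -1, None, None, 1, -1]

print(pairwise_deletion(item_a, item_b))  # prints: 1.0
print(zero_filled(item_a, item_b))        # prints: 0.5
```

In this constructed case the zero-filled correlation is half the pairwise one, which illustrates the attenuation. The zero-filled version, however, is always computed over the full set of persons.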

Example 1: Missing observations are scored "1"

ni=4
codes=01
name1=1
item1=1
codes = 01A
ptbis=YES
misscore=1
&end
END LABELS
0110
1001
A001 ; A is in Codes= but scored "missing"
B101 ; B is not in Codes= but is scored 1 by MISSCORE=1

---------------------------------------------------------------------
|--------------------+------------+--------------------------+------| Code with Score
| 1 A *** | 1 25*| -.70 -.50 |I0001 | A {0,0,1,-}{2,1,1,2}
| 0 0 | 1 33 | -.01 .7 1.00 | | 0 {1,0,-,-}{2,1,1,2}
| 1 1 | 1 33 | -.01* 1.4 -1.00 | | 1 {0,1,-,-}{2,1,1,2}
| MISSING 1 | 1 33 | 1.26 .4 .50 | | B {0,0,-,1}{2,1,1,2}

Example 2: Missing data: two types: "skipped" and "not reached"

Much more at www.rasch.org/rmt/rmt142h.htm

Missing responses in a dataset do not all have the same meaning. For instance, in a timed multiple-choice test, missing responses between observed responses may mean "skipped": the respondent decided this question was too hard. Missing responses at the end of the test can mean "not reached", because time ran out before the respondent could respond to these items.

Solution: enter two different missing data codes in the data file, for instance, "S" for skipped and "R" for not reached. Then proceed in two steps.

When calibrating the items, we want to ignore "not reached" responses, but score skipped responses as wrong:

CODES=01S
NEWSCORE=010 ; S is scored 0
MISSING-SCORED= -1 ; data code R is not in CODES= so it will be scored -1 = "ignore", "not administered"
IFILE= item-calibrations.txt

When measuring the persons, both skipped and not reached responses are scored as wrong:

IAFILE= item-calibrations.txt ; anchor the item difficulties at their good calibrations
CODES=01SR
NEWSCORE=0100 ; S and R are scored 0
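If the raw data file uses a single missing marker, the two codes must be assigned before the analysis. A minimal Python sketch of one way to do this follows, assuming blanks mark missing responses and that every missing response after the last observed response counts as "not reached". The function name and these rules are illustrative assumptions, not part of Winsteps.

```python
def code_missing(responses, missing=" "):
    """Recode missing responses: 'S' if skipped, 'R' if not reached.

    A missing response before the last observed response is treated as
    skipped; missing responses after it are treated as not reached.
    """
    last_observed = max(
        (i for i, c in enumerate(responses) if c != missing), default=-1
    )
    return "".join(
        c if c != missing else ("S" if i < last_observed else "R")
        for i, c in enumerate(responses)
    )

print(code_missing("10 1 0  "))  # prints: 10S1S0RR
```

A response string with no observed responses at all is coded entirely as "R" under these rules. Whether that is appropriate is a decision for the analyst.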

Example 3: Missing categories in a rating scale

This is a different type of missing data.

Unobserved intermediate categories:

Keep them in the rating scale: STKEEP=yes

Recount the rating scale categories without them: STKEEP=No

Unobserved extreme (top or bottom) categories:

The bottom category has not been observed for one of your items, so Winsteps has omitted it. For Winsteps to include it, we need to tell Winsteps about it. Let's say the errant item is item 6, and the rating scale categories are 1,2,3,4,5. Here are two simple methods:

1. You may be using the Partial Credit Model, PCM, in which each item is modeled to have its own rating scale. If so, put the errant item and one or more similar items in the same item group:

Now: ISGROUPS=0

Becomes: ISGROUPS=000001100000000

So all items are PCM except the two (items 6 and 7) that share the same rating scale

If the Likert rating scales are similar for many of the items, consider grouping similar items together. This will simplify the analysis, and make communicating your findings to your audience easier.

2. Add a dummy person to the dataset who is observed in the bottom category of item 6 and a non-bottom category of another item. You can give this person a small weight if you don't want the person to change things.

Let's put the dummy person as the first person in the data file:

Dummy person xxxxx12xxxxxxxx ; x is missing data

and in your control file

PWEIGHT=*

1 .001 ; person 1 given a small weight

There are other options, but they are more complicated.

Help for Winsteps Rasch Measurement and Rasch Analysis Software: www.winsteps.com. Author: John Michael Linacre

