
# Dimensionality: when is a test multidimensional?

For more discussion, see the Help topics on dimensionality and on contrasts.

Beware of:

1. Accidents in the data generating spurious dimensions.

2. Content strands within a bigger content area ("addition" and "subtraction" within "arithmetic") generating generally inconsequential dimensions.

3. Demographic groups within the person sample differentially influencing item difficulty, thereby creating a reference-group item dimension and a focal-group item dimension (e.g., native and second-language speakers on a language test).

"Variance explained" depends on the spread of the item and person measures. Please see http://www.rasch.org/rmt/rmt221j.htm - For dimensionality analysis, we are concerned about the "Variance explained by the first contrast in the residuals". If this is big, then there is a second dimension at work. Infit and Outfit statistics are too local (one item or one person at a time) to detect multidimensionality productively. They are too much influenced by accidents in the data (e.g., guessing, response sets), and generally do not detect the more subtle, but pervasive, impact of a second dimension (unless it is huge).

Question: "I cannot understand the residual contrast analysis you explained. For example, in Winsteps, it gave me the five contrasts' eigenvalues: 3.1, 2.4, 1.9, 1.6, 1.4. (I have 26 items in these data.) The result is the same as when I put the residuals into SPSS."

Reply: Unidimensionality is never perfect; it is always approximate. From the data, the Rasch model constructs parameter estimates along the unidimensional latent variable that best concur with the data. But, though the Rasch measures are always unidimensional and additive, their concurrence with the data is never perfect. Imperfection results from multidimensionality in the data and from other causes of misfit.

Multidimensionality always exists, to a greater or lesser extent. The vital question is: "Is the multidimensionality in the data big enough to merit dividing the items into separate tests, or constructing new tests, one for each dimension?"

The unexplained variance in a data set is the variance of the residuals. Each item is modeled to contribute 1 unit of information (= 1 eigenvalue) to the principal components decomposition of residuals. So the eigenvalue of the total unexplained variance is the number of items (less any items with extreme scores). So when a component (contrast) in the decomposition is of size 3.1, it has the information (residual variance) of about 3 items.
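As an illustration, here is a minimal sketch (not Winsteps' own computation) of this decomposition in Python. The data are simulated from the Rasch model, and the residuals are standardized against the true generating parameters rather than estimates, so the numbers are only indicative:

```python
# Sketch: principal-components decomposition of standardized Rasch residuals.
# Each item contributes 1 unit of residual variance, so the eigenvalues
# sum to the number of items.
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items = 500, 26

# Simulated unidimensional Rasch data (parameters are assumed, not estimated)
theta = rng.normal(0, 1, n_persons)            # person abilities
b = rng.normal(0, 1, n_items)                  # item difficulties
p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))  # Rasch expected scores
x = (rng.random((n_persons, n_items)) < p).astype(float)

# Standardized residuals: (observed - expected) / model standard deviation
z = (x - p) / np.sqrt(p * (1 - p))

# Eigenvalues of the inter-item correlation matrix of the residuals
r = np.corrcoef(z, rowvar=False)
eig = np.sort(np.linalg.eigvalsh(r))[::-1]

print(round(float(eig.sum()), 1))   # sums to 26.0: one unit per item
print(round(float(eig[0]), 2))      # first contrast; small when unidimensional
```

With strictly unidimensional data, the first contrast stays small; a markedly larger value, such as the 3.1 above, signals that a group of items shares a secondary dimension.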

In your example, the first contrast has an eigenvalue of 3.1. Its expected value is near 2.0 (see www.rasch.org/rmt/rmt191h.htm). This means that the contrast between the strongly positively loading items and the strongly negatively loading items on the first contrast in the residuals has the strength of about 3 items. Since positive and negative loading is arbitrary, you must look at the items at both the top and the bottom of the contrast plot. Are those items substantively different? Are they so different that they merit the construction of two separate tests?

It may be that two or three off-dimension items have been included in your 26 item instrument and should be dropped. But this is unusual for a carefully developed instrument. It is more likely that you have a "fuzzy" or "broad" dimension, like mathematics. Mathematics includes arithmetic, algebra, geometry and word problems. Sometimes we want a "geometry test". But, for most purposes, we want a "math test".

If in doubt,

1. Split your 26 items into clusters (subtests), based on positive and negative loadings on the first residual contrast. Winsteps does this in Table 23.1 with "cluster" numbers.

2. Measure everyone on each of the clusters (Table 23.6).

3. What is the correlation of the person measures for pairs of clusters? (Table 23.1)

4. Do the clusters display two versions of the same story about the persons, or are they different stories?
"The correlation coefficient corrected for attenuation between two tests x and y is the correlation between their true scores [or true measures]. If, on the basis of a sample of examinees, the corrected coefficient is near unity, the experimenter concludes that the two tests are measuring the same trait." (p. 117) in Jöreskog, K.G. (1971). Statistical analysis of sets of congeneric tests. Psychometrika, 36, 109-133.

5. Copy the person measures from Table 23.6 into Excel, and cross-plot the numbers. Which people are off-diagonal? Is that important? If only a few people are noticeably off-diagonal, or the off-diagonal deviation would not lead to any action, then you have a substantively unidimensional test. If the best-fit line on the plot departs from a unit slope, you may have a "Fahrenheit-Celsius" equating situation.
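The attenuation correction in step 4 is simple enough to verify by hand: disattenuated r = observed r divided by the square root of the product of the two reliabilities. A sketch with hypothetical numbers (the correlation and reliabilities below are assumed, not from any real analysis):

```python
# Jöreskog's correction for attenuation, with illustrative (assumed) values.
import math

r_xy = 0.71                  # observed correlation between two cluster measures
rel_x, rel_y = 0.85, 0.80    # reliabilities of the two cluster measures

# Disattenuated correlation: the estimated correlation between true measures
r_true = r_xy / math.sqrt(rel_x * rel_y)
print(round(r_true, 2))      # → 0.86
```

A disattenuated correlation near 1.0 suggests the two clusters tell the same story about the persons, i.e., one dimension; a value well below 1.0 supports treating them as separate dimensions.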

You can do a similar investigation for the second contrast of size 2.4, and third of size 1.9, but each time the motivation for doing more than dropping an off-dimension item or two becomes weaker. Since random data can have eigenvalues of size 1.4, there is little motivation to look at your 5th contrast.
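You can check that random-data baseline for yourself. A rough sketch (sample sizes are assumed; pure noise stands in for the residuals), showing the typical size of the first eigenvalue when there is no dimension at all:

```python
# Baseline check: first eigenvalue of the correlation matrix of pure noise,
# for 26 "items". This is the eigenvalue size that random data produce.
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_items, n_reps = 500, 26, 50

firsts = []
for _ in range(n_reps):
    z = rng.normal(size=(n_persons, n_items))    # pure-noise "residuals"
    r = np.corrcoef(z, rowvar=False)
    firsts.append(np.linalg.eigvalsh(r)[-1])     # largest eigenvalue

print(round(float(np.mean(firsts)), 1))   # roughly 1.5 at this sample size
```

The baseline grows as the person sample shrinks, so eigenvalues in the 1.4 range carry no evidence of a dimension.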

Question: Why are my adaptive-test observations always reported to be unidimensional?

Reply: Unobserved data are modeled at their Rasch-predicted values. In an adaptive test these overwhelm the observed data, so the data are reported as unidimensional using standard criteria.

Solution: use the Winsteps "Simulate data" function (Complete Data - No) to obtain a baseline for the unidimensional eigenvalues for your data. These will be much lower than the standard criterion of 2.

"Variance explained" is a newly developing area in Rasch methodology. We learn something new about it every month or so. Perhaps you will contribute to this. So there are no rules, only tentative guidelines based on the current state of theory and empirical experience.

1. Originally, Winsteps implemented 3 algorithms for computing variance-explained. Most people used the default algorithm (based on standardized residuals). User experience indicated that one of the other two algorithms was much more accurate in apportioning explained and unexplained variance. So, in the current version of Winsteps, this other algorithm (based on raw residuals) has become the algorithm for this part of the computation. The three algorithms are still implemented for the decomposition of the unexplained variance into contrasts (raw residuals, standardized residuals and logit residuals), and the default remains the standardized residuals for that part of the computation.

2. www.rasch.org/rmt/rmt221j.htm shows the expected decomposition of raw variance into explained variance and unexplained variance under different conditions.

Since the rules are only guidelines, please always verify their applicability in your particular situation. A meaningful way of doing this is to compute the person measures for each of what might be the biggest two dimensions in the data, and then to cross-plot those measures. Are the differences between the measures big enough, and pervasive enough, to be classified as "two dimensions" (and perhaps reported separately), or are they merely a minor perturbation in the data? For instance, in arithmetic, word problems, abstract problems and concrete problems have different response profiles (and so noticeable contrasts), but they are rarely treated as different "dimensions".

Help for Winsteps Rasch Measurement Software: www.winsteps.com. Author: John Michael Linacre
