Subsets and connection ambiguities in the data


Connectivity (or subsetting) is a concern in any data analysis involving missing data. In general,

nested data are not connected.

fully-crossed data (also called "complete data") are connected.

partially-crossed data may or may not be connected.

 

Winsteps examines the response strings for all the persons. It verifies that every non-extreme response string is linked into one network of success and failure on the items. Similarly, it verifies that the strings of responses to the items are linked into one network of success and failure by the persons.

 

If person response string A has a success on item 1 and a failure on item 2, and response string B has a failure on item 1 and a success on item 2, then A and B are connected. This examination is repeated for all pairs of response strings and all pairs of items. Gradually all the persons are connected with all the other persons, and all the items are connected with all the other items. But if some persons or some items cannot be connected in this way, then Winsteps reports a "connectivity" problem, and reports which subsets of items and persons are connected.
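The pairwise rule just described can be sketched in Python. This is only an illustration of the rule as stated here, not Winsteps' own code. The function name `connected` and the use of a blank for a response that was not observed are assumptions made for the example.

```python
def connected(a, b, missing=" "):
    """Return True if string `a` is ahead of `b` on one shared item
    and behind `b` on another shared item."""
    a_ahead = False
    b_ahead = False
    for x, y in zip(a, b):
        if x == missing or y == missing:
            continue  # not observed for both strings: no comparison possible
        if x > y:
            a_ahead = True
        elif y > x:
            b_ahead = True
    return a_ahead and b_ahead


# Strings A and B from the text: success/failure reversed on items 1 and 2.
print(connected("10", "01"))      # True

# One string never fails where the other succeeds: no link from this pair.
print(connected("11", "10"))      # False

# Two persons who took different items share nothing to compare.
print(connected("10  ", "  01"))  # False
```

A pair that returns False may still end up in the same subset if both strings are linked through other strings.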

 

Example:

Dataset 1. The Russian students take the Russian items. This is connected. All the data are in one subset.

Dataset 2. The American students take the American items. This is connected. All the data are in one subset.

Dataset 3. Datasets 1 and 2 are put into one analysis. This is not connected. The data form two subsets: the Russian one and the American one. The raw scores or Rasch measures of the Russian students cannot be compared to those of the American students. For instance, if the Russian students score higher than the American students, are the Russian students more able or are the Russian items easier? The data cannot tell us which is true.

 

Winsteps attempts to estimate an individual measure for each person and item within one frame of reference. Usually this succeeds, but there are exceptions. The data may not be "well-conditioned" (Fischer, G.H., & Molenaar, I.W. (eds.) (1995). Rasch models: foundations, recent developments, and applications. New York: Springer-Verlag, pp. 41-43).

 

Extreme scores (zero or minimum possible scores, and perfect or maximum possible scores) imply measures that are beyond the current frame of reference. Winsteps uses Bayesian logic to provide measures corresponding to those scores.

 

More awkward situations are shown in this dataset. It is Examsubs.txt.

 

Title = "Example of subset reporting"

Name1 = 1

Item1 = 10

NI = 10

&End

Extreme ; item labels

Subset 1

Subset 1

Subset 2

Subset 2

Guttman 6

Subset 4

Subset 4

Subset 5

Subset 5

END LABELS

Extreme  100000      

Subset 1 101001      

Subset 1 110001      

Subset 2 111011      

Subset 2 111101      

Guttman3      011

Subset 4      001    

Subset 4      010    

Subset 5         01  

Subset 5         10

 

The Iteration Screen (Table 0) reports:

 

Checking connectivity ...

>=====================================<

WARNING: DATA ARE AMBIGUOUSLY CONNECTED INTO 6 SUBSETS. MEASURES ACROSS SUBSETS ARE NOT COMPARABLE

SUBSET: 1

ITEM: 2-3

PERSON: 2-3

SUBSET 1 OF 2 ITEMS AND 2 PERSONS

SUBSET: 2

ITEM: 4-5

PERSON: 4-5

SUBSET 2 OF 2 ITEMS AND 2 PERSONS

SUBSET: 3

PERSON: 6

GUTTMAN SUBSET 3 OF 1 PERSONS

SUBSET: 4

ITEM: 7-8

PERSON: 7-8

SUBSET 4 OF 2 ITEMS AND 2 PERSONS

SUBSET: 5

ITEM: 9-10

PERSON: 9-10

SUBSET 5 OF 2 ITEMS AND 2 PERSONS

SUBSET: 6

ITEM: 6

GUTTMAN SUBSET 6 OF 1 ITEMS

 

There are 10 items. The first item "Extreme" is answered correctly by all who responded to it. So it is estimated as extreme and dropped from further analysis. Then the first person "Extreme" responds incorrectly to all non-extreme items and is dropped.

After eliminating Item 1 and Person 1,

Subset 6: Item 6 "Guttman" has a Guttman pattern. It separates those who succeeded on it from those who failed, and nothing else in the data contradicts that separation. So there is an unknown logit distance between those who succeeded on Item 6 and those who failed on it. Consequently the difficulty of Item 6 is uncertain. The measure of Person 6 in Subset 3 is uncertain for the same reason.

The remaining subsets have measures that can be estimated within the subset, but that are at an unknown distance from the persons and items in the other subsets.
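The walk-through above can be checked with a short Python sketch that drops the extreme item and person from the Examsubs.txt responses and then groups the remaining response strings by the pairwise success-and-failure rule. It is a simplified illustration under stated assumptions, not Winsteps' actual procedure: the helper names (`is_extreme`, `drop_extremes`, `connected`, `subsets`) are hypothetical, and a blank stands for a response that was not observed.

```python
from itertools import combinations

# Item responses from the Examsubs.txt data lines (columns 10-19).
RECORDS = [
    "100000    ", "101001    ", "110001    ", "111011    ", "111101    ",
    "     011  ", "     001  ", "     010  ", "        01", "        10",
]

def is_extreme(responses):
    """True when the observed responses are all alike (or there are none)."""
    return len({r for r in responses if r != " "}) < 2

def drop_extremes(records):
    """Drop extreme items, then extreme persons, until nothing changes."""
    persons = set(range(len(records)))
    items = set(range(len(records[0])))
    changed = True
    while changed:
        changed = False
        for i in sorted(items):
            if is_extreme(records[p][i] for p in persons):
                items.discard(i)
                changed = True
        for p in sorted(persons):
            if is_extreme(records[p][i] for i in items):
                persons.discard(p)
                changed = True
    return sorted(persons), sorted(items)

def connected(a, b):
    """One string is ahead on some shared position and behind on another."""
    pairs = [(x, y) for x, y in zip(a, b) if " " not in (x, y)]
    return any(x > y for x, y in pairs) and any(y > x for x, y in pairs)

def subsets(strings):
    """Group entry numbers whose strings are linked, directly or via others."""
    parent = {k: k for k in strings}
    def find(k):
        while parent[k] != k:
            k = parent[k]
        return k
    for j, k in combinations(strings, 2):
        if connected(strings[j], strings[k]):
            parent[find(j)] = find(k)
    groups = {}
    for k in strings:
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values())

persons, items = drop_extremes(RECORDS)
# Entry numbers are 1-based, as in the Winsteps report.
person_strings = {p + 1: "".join(RECORDS[p][i] for i in items) for p in persons}
item_strings = {i + 1: "".join(RECORDS[p][i] for p in persons) for i in items}

person_groups = subsets(person_strings)
item_groups = subsets(item_strings)
print("persons:", person_groups)  # [[2, 3], [4, 5], [6], [7, 8], [9, 10]]
print("items:  ", item_groups)    # [[2, 3], [4, 5], [6], [7, 8], [9, 10]]
```

For this dataset the groups match the entry numbers in the report: Person 6 and Item 6 each stand alone, which corresponds to the two Guttman subsets. Agreement on one small example does not show that the sketch matches Winsteps in general.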

 

Under these circumstances, Winsteps reports one of an infinite number of possible solutions. Measure comparisons within subsets are correct, but measures cannot be compared across subsets: any across-subset comparison is accidental. Fit statistics and standard errors are usually correct. Reliability coefficients are accidental, and so is Table 20, the score-to-measure table.
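Why there are infinitely many solutions can be illustrated numerically. The sketch below assumes the dichotomous Rasch model, in which the probability of success depends only on the difference between the person measure and the item difficulty. The toy data (two Russian students on two Russian items, two American students on two American items) and all the measures are invented for illustration and are not Winsteps estimates.

```python
import math

def rasch_p(ability, difficulty):
    """Success probability under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def log_likelihood(data, ability, difficulty):
    """Sum of log-probabilities of the observed 0/1 responses."""
    total = 0.0
    for (person, item), x in data.items():
        p = rasch_p(ability[person], difficulty[item])
        total += math.log(p if x == 1 else 1.0 - p)
    return total

# Two disconnected subsets: no person in one subset took an item in the other.
data = {
    ("R1", "r1"): 1, ("R1", "r2"): 0, ("R2", "r1"): 0, ("R2", "r2"): 1,
    ("A1", "a1"): 1, ("A1", "a2"): 0, ("A2", "a1"): 0, ("A2", "a2"): 1,
}
ability = {"R1": 0.3, "R2": -0.2, "A1": 0.5, "A2": 0.1}
difficulty = {"r1": 0.4, "r2": -0.4, "a1": 0.2, "a2": -0.2}

# Move every American person and item up by the same arbitrary amount.
shift = 2.5
shifted_ability = {k: v + (shift if k.startswith("A") else 0.0)
                   for k, v in ability.items()}
shifted_difficulty = {k: v + (shift if k.startswith("a") else 0.0)
                      for k, v in difficulty.items()}

before = log_likelihood(data, ability, difficulty)
after = log_likelihood(data, shifted_ability, shifted_difficulty)
print(math.isclose(before, after))  # True
```

Because the shift cancels in every person-minus-item difference within the American subset, the data fit both sets of measures equally well, so the data cannot fix the distance between the subsets.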

 

The subsets are shown in the Measure Tables:

[Figure: subset1]

 

One solution is to anchor two equivalent items (or two equivalent persons) in the different subsets at the same values, or to adjust the anchor values so that the mean of each subset is the same (or meets whatever other criterion is appropriate). Alternatively, do separate analyses, or construct real or dummy data records that include 0 and 1 responses to all items.

 

Winsteps reports entry numbers for each person and each item in each subset, so that you can compare their response strings. To analyze only the items and persons in a particular subset, such as subset 5 above (items 9-10 and persons 9-10), specify the following, where the leading "+" keeps only the listed entry numbers:

IDELETE= +9-10

PDELETE= +9-10

 

 

 

 


Help for WINSTEPS® Rasch Measurement Software: www.winsteps.com.