Subsets and connection ambiguities in the data

You see: Warning: Data are ambiguously connected into 6 subsets. Measures may not be comparable across subsets.

 

Quick (but arbitrary) solution:  

Let's assume your data are scored 0 or 1. Then:

a. if subsets of items have the same average difficulty

Add two dummy persons with scored response strings:

10101010....

01010101....

 

b. if subsets of persons have the same average ability

Add two dummy items (columns) with scored response strings:

01

10

01

10

..

 

More detail: add to the data file two dummy person records so that all persons and items become directly comparable.

 

Dichotomous data:

Dummy person 1: responses: 010101010...

Dummy person 2: responses: 101010101...

 

This says: "the middle level of performance for all subsets of persons is the same."

 

Rating scale data, where "1" is the lowest category, and "5" is the highest category:

Dummy person 1: responses: 1212121212...

Dummy person 2: responses: 2121212121...

 

This says: "the bottom level of performance for all subsets of persons is the same."
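The alternating dummy response strings shown above can be generated for any test length. The following is a minimal Python sketch, not part of Winsteps; the function name `dummy_persons` and its parameters are hypothetical and chosen only for illustration.

```python
from itertools import cycle, islice


def dummy_persons(n_items, low="0", high="1"):
    """Return two complementary alternating response strings of length n_items.

    low and high are the two category codes to alternate, for example
    "0" and "1" for dichotomies or "1" and "2" for the bottom of a rating scale.
    """
    first = "".join(islice(cycle([low, high]), n_items))
    second = "".join(islice(cycle([high, low]), n_items))
    return first, second


# Dichotomous data: 0101010101 and 1010101010
print(dummy_persons(10))

# Rating scale data, linking at the two lowest categories: 1212121212 and 2121212121
print(dummy_persons(10, low="1", high="2"))
```

The two strings would then be appended to the data file as extra person records, with person labels of your choosing placed in the label columns.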

 

If you are concerned the dummy persons or items will skew your statistics, give them very small weights: PWEIGHT= or IWEIGHT=

 

Explanation: Connectivity (or subsetting) is a concern in any data analysis involving missing data. In general,

nested data are not connected.

fully-crossed data (also called "complete data") are connected.

partially-crossed data may or may not be connected.

 

Winsteps examines the response strings for all the persons. It verifies that every non-extreme response string is linked into one network of success and failure on the items. Similarly, the strings of responses to the items are linked into one network of success and failure by the persons.

 

If person response string A has a success on item 1 and a failure on item 2, and response string B has a failure on item 1 and a success on item 2, then A and B are connected. This examination is repeated for all pairs of response strings and all pairs of items. Gradually all the persons are connected with all the other persons, and all the items are connected with all the other items. But if some persons or some items cannot be connected in this way, then Winsteps reports a "connectivity" problem, and reports which subsets of items and persons are connected.
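The direct pairwise check described above can be sketched in a few lines of Python. This is an illustration for dichotomous data only, not the Winsteps routine; the function name `directly_connected` and the choice of missing-data characters are assumptions.

```python
def directly_connected(a, b, missing=" x"):
    """True if response string a is higher than b on one item and lower on another.

    a and b are scored response strings aligned item by item.
    Items that either person did not answer are skipped.
    """
    a_higher = b_higher = False
    for ra, rb in zip(a, b):
        if ra in missing or rb in missing:
            continue  # only items answered by both persons can link them
        if int(ra) > int(rb):
            a_higher = True
        elif int(rb) > int(ra):
            b_higher = True
    return a_higher and b_higher


print(directly_connected("01111", "10111"))  # persons 2 and 3 of Example 1
print(directly_connected("01111", "00101"))  # persons 2 and 4 of Example 1
print(directly_connected("0x1", "10x"))      # persons 10 and 11 of Example 1
```

For the first pair each person succeeds where the other fails, so they are directly connected. The second pair is separated by a Guttman split. The third pair share only one item, so they can be linked only indirectly through a third person.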

 

Mathematics: connectivity is part of Graph Theory. The person/item/judge/... parameters of the Rasch model are the vertices and the observations are the edges. In an undirected graph, we need every vertex to be connected directly or indirectly to every other vertex. A connection is established between two vertices when one vertex is observed to have both a higher observation and a lower observation than another vertex in the same context, or when both vertices have the same intermediate category of a rating scale in the same context.

 

Thus there are two situations for failure to connect:

1) there is no direct or indirect link between two vertices, e.g., two different datasets analyzed together with no common parameters. This is detected by the Winsteps/Facets subset routine.

 

2) the vertices are connected by observations, but the observations do not meet the requirements, e.g., all the persons respond to all the items, but half the persons score in the upper half of the rating scale on every item, and the other half of the persons score in the lower half of the rating scale on every item. This is called a "Guttman split" in the data. This is usually obvious in the reported estimates as a big gap on the Wright maps between the two halves of the person distribution.
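Both situations can be illustrated for dichotomous data with a directed graph of the items: draw an arrow from item i to item j whenever some person succeeded on i and failed on j. Items that can reach each other along arrows in both directions belong to the same subset. The Python sketch below applies this textbook well-conditioning criterion to the dichotomous items (1-10) of Examsubs.txt from Example 1. It is a sketch under these assumptions, not the Winsteps subset routine, and the name `item_subsets` is hypothetical.

```python
def item_subsets(responses):
    """Group dichotomous items into mutually reachable (strongly connected) sets.

    responses: equal-length strings; '1' = success, '0' = failure,
    any other character = missing or not administered.
    """
    n = len(responses[0])
    # beats[i] = items j such that some person succeeded on i and failed on j
    beats = [set() for _ in range(n)]
    for row in responses:
        wins = [i for i, r in enumerate(row) if r == "1"]
        losses = [j for j, r in enumerate(row) if r == "0"]
        for i in wins:
            beats[i].update(losses)

    def reachable(start):
        seen, stack = {start}, [start]
        while stack:
            for nxt in beats[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reach = [reachable(i) for i in range(n)]
    subsets, assigned = [], set()
    for i in range(n):
        if i in assigned:
            continue
        group = sorted(j for j in reach[i] if i in reach[j])
        assigned.update(group)
        subsets.append([j + 1 for j in group])  # 1-based item entry numbers
    return subsets


# Persons 1-12 of Examsubs.txt, items 1-10; "." marks items not administered
examsubs = [
    "11111.....", "01111.....", "10111.....", "00101.....", "00011.....",
    "....011...", "....011...", "....001...", "....010...",
    ".......0x1", ".......10x", ".......x10",
]
print(item_subsets(examsubs))
```

For these data the sketch groups the items as 1-2, 3-4, 5, 6-7 and 8-10, which agrees with the item subsets listed for items 1-10 in Table 14.1. Items 8-10 are joined only through the loop of successes and failures across persons 10-12.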

 

 

Example 1: Connection problems and subsets in the data are shown in this dataset. It is Examsubs.txt.

 

Title = "Example of subset reporting"

Name1 = 1

Namelength = 24 ;  include response string in person label

Item1 = 13

NI = 12

CODES = 0123 ; x is missing data

ISGROUPS = DDDDDDDDDDRR ; items 1-10 are dichotomies; items 11-12 share a rating scale

MUCON = 3 ; Subsetting can cause very slow convergence

TFILE=*

18.1

14.1

0.4

*

&End

01 Subset 1

02 Subset 1

03 Subset 2

04 Subset 2

05 Subset 7

06 Subset 4

07 Subset 4

08 Subset 5

09 Subset 5

10 Subset 5

11 Subset 6

12 Subset 6

END LABELS

01 Extreme  11111       

02 Subset 1 01111       

03 Subset 1 10111       

04 Subset 2 00101       

05 Subset 2 00011       

06 Subset 3     011

07 Subset 3     011

08 Subset 4     001     

09 Subset 4     010     

10 Subset 5        0x1   

11 Subset 5        10x   

12 Subset 5        x10

13 Subset 6           01

14 Subset 6           10

15 Subset 6           23

16 Subset 6           32 

 

The Iteration Screen reports:

                              CONVERGENCE TABLE

 -Control: \HOLDW95\examples\examsubs.txt Output: \examples\ZOU571WS.TXT

 |    PROX          ACTIVE COUNT       EXTREME 5 RANGE      MAX LOGIT CHANGE  |

 | ITERATION  PERSON    ITEM  CATS     PERSON    ITEM      MEASURES  STRUCTURE|

 >=====================================<

 |        1       15      12     8       2.00    1.06       -2.0794           |

 >=====================================<

 |        2       15      12     6       2.38    1.84        2.6539   -1.6094 |

 >=====================================<

 |        3       14      12     6       2.67    1.60        2.7231     .0000 |

 >=====================================<

 |        4       14      12     6       2.68    2.33       -2.3912     .0000 |

 >=====================================<

 |        5       14      12     6       2.97    1.77        2.3246           |

 >=====================================<

 |        6       14      12     6       2.97    2.54       -2.2191           |

 >=====================================<

 |        7       14      12     6       3.22    2.10        2.0372           |

 Probing data connection: to skip out: Ctrl+F - to bypass: subset=no

 Processing unanchored persons ...

 >=====================================<

 Consolidating 9 potential subsets pairwise ...

 >=================================

 Consolidating 9 potential subsets indirectly pairwise ...

 >=====================================<

 Consolidating 8 potential subsets pairwise ...

 >=================================

 Consolidating 7 potential subsets pairwise ...

 >=================================

 Consolidating 7 potential subsets indirectly pairwise ...

 >=====================================<

 Warning: Data are ambiguously connected into 7 subsets. Measures may not be comparable across subsets.

  Subsets details are in Table 0.4

 

Table 18.1

 

         PERSON STATISTICS:  ENTRY ORDER

 

---------  ---------------------------

|ENTRY     |                         |

|NUMBER    | PERSON                  |

|--------  +-------------------------|

|     1    | 01 Extreme  11111       | MAXIMUM MEASURE

                                                < Guttman split here >

|     2    | 02 Subset 1 01111       | SUBSET 1

|     3    | 03 Subset 1 10111       | SUBSET 1

                                                < Guttman split here >

|     4    | 04 Subset 2 00101       | SUBSET 2

|     5    | 05 Subset 2 00011       | SUBSET 2

                                                < Guttman split here >

|     6    | 06 Subset 3     011     | SUBSET 3

|     7    | 07 Subset 3     011     | SUBSET 3

                                                < Guttman split here >

|     8    | 08 Subset 4     001     | SUBSET 4

|     9    | 09 Subset 4     010     | SUBSET 4

                                                < Subset split here >

|    10    | 10 Subset 5        0x1  | SUBSET 5

|    11    | 11 Subset 5        10x  | SUBSET 5 < Indirect connection >

|    12    | 12 Subset 5        x10  | SUBSET 5

                                                < Subset split here >

|    13    | 13 Subset 6           01| SUBSET 6

|    14    | 14 Subset 6           10| SUBSET 6

                                                < undetected Guttman split here: Winsteps failed! >

|    15    | 15 Subset 6           23| SUBSET 6

|    16    | 16 Subset 6           32| SUBSET 6

|--------  +-------------------------|

 

In Tables and Notes:

Explanation:

< Guttman split here >

The persons above the split differ in performance from the persons below the split by an unknowable amount. There is no pair of items such that this subset succeeded where another subset failed on one item, and failed where the other subset succeeded on the other item. The data are not "well-conditioned" (Fischer, G.H., & Molenaar, I.W. (eds.) (1995) Rasch models: foundations, recent developments, and applications. New York: Springer-Verlag. p. 41-43).

< Subset split here >

The persons in this subset responded to different items than persons in other subsets. We don't know if these items are easier or harder than items in other subsets.

< Indirect connection >

The persons responded to different items, but they are connected by a loop of successes and failures.

< undetected Guttman split here >

Winsteps subset-detection did not report that persons 13 and 14 always score lower than persons 15 and 16, causing a Guttman split. We do not know how much better persons 15 and 16 are than persons 13 and 14. Winsteps subset-detection may fail to report subsets. Unreported subsets usually cause big jumps in the reported measures.

Data are ambiguously connected

Measures for persons in different subsets are not comparable. Winsteps always reports measures, but these are only valid within subsets. We do not know how the measures for persons in one subset compare with the measures for persons in another subset.  Reliability coefficients are accidental and so is Table 20, the score-to-measure Table.  Fit statistics and standard errors are approximately correct.

Measures may not be comparable across subsets

Please always investigate when Winsteps reports subsets, even if you think that all your measures are comparable.

MAXIMUM MEASURE, MINIMUM MEASURE, DROPPED, INESTIMABLE

Persons and items with special features are not included in subsets. Extreme scores (zero, minimum possible and perfect, maximum possible scores) imply measures that are beyond the current frame of reference. Winsteps uses Bayesian logic to provide measures corresponding to those scores.

SUBSET 1, 2, 4

These are directly connected subsets. Within each subset, one person has succeeded on one item and failed on another, and another person has done the reverse. The person performances are directly pairwise comparable within the subset. The persons in this subset have either succeeded on items in other subsets, or failed on items in other subsets, or have missing data on items in other subsets.

SUBSET 3

These two persons have the same responses, so they are in the same subset.

No one succeeded on their failed item and also failed on one of their successful items.

SUBSET 5

This is an indirectly connected subset. There is a loop of successes and failures so that the performances of all three persons are connected indirectly pairwise.

SUBSET 6

Persons 13 and 14 are directly comparable using categories 0 and 1 of the rating scale. Persons 15 and 16 are directly comparable using categories 2 and 3 of the rating scale. Winsteps has not detected that persons 13 and 14 always rate lower than persons 15 and 16, causing a Guttman split.

SUBSET 7 (Table 14.1)

No person is in the same subset as this item. There is no subset in which persons both succeeded and failed on this item.

Connecting SUBSETs

Here are approaches:

1. Collect more data that links items across subsets. Please start Winsteps analysis as soon as you start data collection. Then subset problems can be remedied before data collection ends.

2. Dummy data. Include data for imaginary people in the data file that connect the subsets.

3. Anchor persons or items. Anchor equivalent items (or equivalent persons) in the different subsets to the same values, or adjust the anchor values so that the mean of each subset is the same (or meets whatever other criterion you choose).

4. Analyze each subset of persons and items separately. In Table 0.4, Winsteps reports entry numbers for each person and each item in each subset, so that you can compare their response strings. To analyze only the items and persons in a particular subset, such as subset 4 above, specify the items and persons in the subset:

IDELETE= +6-7

PDELETE= +8-9

Memory was not allocatable to probe connectivity

If the data are complete, ignore this message. If the data are sparse, add dummy data records. They will have little influence on connected data, but will connect up data with subsets. See also Memory

 

Table 14.1

 

---------  -----------------

|ENTRY     |               |

|NUMBER    | ITEM        G |

|--------  +---------------|

|     1    | 01 Subset 1 D | SUBSET 1

|     2    | 02 Subset 1 D | SUBSET 1

                                      < Guttman split here >

|     3    | 03 Subset 2 D | SUBSET 2

|     4    | 04 Subset 2 D | SUBSET 2

                                      < Guttman split here >

|     5    | 05 Subset 7 D | SUBSET 7

                                      < Guttman split here >

|     6    | 06 Subset 4 D | SUBSET 4

|     7    | 07 Subset 4 D | SUBSET 4

                                      < Guttman split here >

|     8    | 08 Subset 5 D | SUBSET 5

|     9    | 09 Subset 5 D | SUBSET 5

|    10    | 10 Subset 5 D | SUBSET 5

                                      < Guttman split here >

|    11    | 11 Subset 6 R | SUBSET 6

|    12    | 12 Subset 6 R | SUBSET 6

|--------  +---------------|

 

Table 0.4 reports

 

SUBSET DETAILS

 

Subset 1 of 2 ITEM and 2 PERSON

 ITEM: 1-2

 PERSON: 2-3

Subset 2 of 2 ITEM and 2 PERSON

 ITEM: 3-4

 PERSON: 4-5

Subset 3 of 2 PERSON

 PERSON: 6-7

Subset 4 of 2 ITEM and 2 PERSON

 ITEM: 6-7

 PERSON: 8-9

Subset 5 of 3 ITEM and 3 PERSON

 ITEM: 8-10

 PERSON: 10-12

Subset 6 of 2 ITEM and 4 PERSON

 ITEM: 11-12

 PERSON: 13-16

Subset 7 of 1 ITEM

 ITEM: 5

 

 

 

Example 2: Analyzing two separate datasets together.

Dataset 1. The Russian students take the Russian items. This is connected. All the data are in one subset.

Dataset 2. The American students take the American items. This is connected. All the data are in one subset.

Dataset 3. Datasets 1 and 2 are put into one analysis. This is not connected. The data form two subsets: the Russian one and the American one. The raw scores or Rasch measures of the Russian students cannot be compared to those of the American students. For instance, if the Russian students score higher than the American students, are the Russian students more able or are the Russian items easier? The data cannot tell us which is true.
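The situation in Dataset 3 can be checked before analysis by treating persons and items as vertices and each observation as an edge, then counting the connected groups. Below is a minimal Python sketch using a union-find structure; the function name `observation_components` and the student and item labels are hypothetical, invented only for this illustration.

```python
def observation_components(observations):
    """Group persons and items that are linked by at least one observation.

    observations: iterable of (person, item) pairs, one per observed response.
    Returns a sorted list of groups; each group is a sorted list of
    ("P", person) and ("I", item) vertices.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for person, item in observations:
        parent[find(("P", person))] = find(("I", item))

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), []).append(node)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical labels: Russian students take Russian items, Americans take American items
combined = [
    ("R1", "RU1"), ("R1", "RU2"), ("R2", "RU1"),
    ("A1", "US1"), ("A2", "US1"), ("A2", "US2"),
]
print(len(observation_components(combined)))  # two separate groups

# One American student also answering a Russian item joins the two groups
print(len(observation_components(combined + [("A1", "RU1")])))
```

A single group is necessary but not sufficient for comparable measures: the shared observations must also include both higher and lower performances, otherwise a Guttman split can remain.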

 

Winsteps attempts to estimate an individual measure for each person and item within one frame of reference. Usually this happens. But there are exceptions.

 


 

The initial implementation used the algorithm of David L. Weeks and Donald R. Williams (Technometrics 6:3, p. 319-324, August 1964), but this fails for indirect linking.


Help for Winsteps Rasch Measurement and Rasch Analysis Software: www.winsteps.com. Author: John Michael Linacre
