Subsets and connection ambiguities in the data

You see: Warning: Data are ambiguously connected into 6 subsets. Measures may not be comparable across subsets.

 

Quick (but arbitrary) solution:  add to the data file two dummy person records so that all persons and items become directly comparable.

 

Dichotomous data:

Dummy person 1: responses: 010101010...

Dummy person 2: responses: 101010101...

 

Rating scale data, where "1" is the lowest category, and "5" is the highest category:

Dummy person 1: responses: 151515151...

Dummy person 2: responses: 515151515...
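
As a minimal sketch (not part of Winsteps), alternating dummy strings like these could be generated in Python before pasting them into the data file. The function name dummy_persons and its parameters are hypothetical, and the sketch assumes that every item shares the same lowest and highest category.

```python
def dummy_persons(n_items, low="0", high="1"):
    """Return two alternating response strings of length n_items.

    Assumption: all items use the same lowest (low) and highest (high)
    category codes, each a single character.
    """
    # First dummy person starts with the lowest category: low, high, low, ...
    first = "".join(low if i % 2 == 0 else high for i in range(n_items))
    # Second dummy person is the mirror image: high, low, high, ...
    second = "".join(high if i % 2 == 0 else low for i in range(n_items))
    return first, second


# Dichotomous items, 9 items long
print(dummy_persons(9))            # ('010101010', '101010101')
# Rating scale with categories 1 to 5
print(dummy_persons(9, "1", "5"))  # ('151515151', '515151515')
```

If items differ in their category ranges, each item would need its own lowest and highest code, which this sketch does not handle.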

 

Explanation: Connectivity (or subsetting) is a concern in any data analysis involving missing data. In general,

nested data are not connected.

fully-crossed data (also called "complete data") are connected.

partially-crossed data may or may not be connected.

 

Winsteps examines the response strings for all the persons. It verifies that every non-extreme response string is linked into one network of success and failure on the items. Similarly, the strings of responses to the items are linked into one network of success and failure by the persons.

 

If person response string A has a success on item 1 and a failure on item 2, and response string B has a failure on item 1 and a success on item 2, then A and B are connected. This examination is repeated for all pairs of response strings and all pairs of items. Gradually all the persons are connected with all the other persons, and all the items are connected with all the other items. But if some persons or some items cannot be connected in this way, then Winsteps reports a "connectivity" problem, and reports which subsets of items and persons are connected.
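
The direct pairwise rule described above can be sketched in Python. This is only an illustration under stated assumptions, not Winsteps' actual procedure: the names directly_connected and direct_subsets are hypothetical, only dichotomous responses ("0" and "1") are handled, any other character is treated as missing, and indirect loops and identical response strings are not consolidated. The response strings are those of persons 2-5 and 8-9 on items 1-7 of Example 1 in this topic.

```python
from itertools import combinations


def directly_connected(a, b):
    """True if a succeeds where b fails on one item, and the reverse on another."""
    a_wins = b_wins = False
    for x, y in zip(a, b):
        if x in "01" and y in "01":  # both responses observed
            if x > y:
                a_wins = True
            elif y > x:
                b_wins = True
    return a_wins and b_wins


def direct_subsets(strings):
    """Group persons whose strings are linked by direct pairwise connections."""
    parent = {k: k for k in strings}

    def find(k):
        # Follow parent links to the group representative, compressing the path
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for p, q in combinations(strings, 2):
        if directly_connected(strings[p], strings[q]):
            parent[find(p)] = find(q)

    groups = {}
    for k in strings:
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values())


# Entry number -> responses to items 1-7 (blank = not administered)
persons = {
    2: "01111  ", 3: "10111  ",
    4: "00101  ", 5: "00011  ",
    8: "    001", 9: "    010",
}
print(direct_subsets(persons))  # [[2, 3], [4, 5], [8, 9]]
```

The three groups agree with the person subsets 1, 2 and 4 that Winsteps reports for these persons in Example 1.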

 

Example 1: The dataset Examsubs.txt illustrates connection problems and subsets in the data.

 

Title = "Example of subset reporting"

Name1 = 1

Namelength = 24 ;  include response string in person label

Item1 = 13

NI = 12

CODES = 0123 ; x is missing data

ISGROUPS = DDDDDDDDDDRR ; items 1-10 are dichotomies; items 11-12 share a rating scale

MUCON = 3 ; Subsetting can cause very slow convergence

TFILE=*

18.1

14.1

0.4

*

&End

01 Subset 1

02 Subset 1

03 Subset 2

04 Subset 2

05 Subset 7

06 Subset 4

07 Subset 4

08 Subset 5

09 Subset 5

10 Subset 5

11 Subset 6

12 Subset 6

END LABELS

01 Extreme  11111       

02 Subset 1 01111       

03 Subset 1 10111       

04 Subset 2 00101       

05 Subset 2 00011       

06 Subset 3     011

07 Subset 3     011

08 Subset 4     001     

09 Subset 4     010     

10 Subset 5        0x1   

11 Subset 5        10x   

12 Subset 5        x10

13 Subset 6           01

14 Subset 6           10

15 Subset 6           23

16 Subset 6           32 

 

The Iteration Screen reports:

                              CONVERGENCE TABLE

 -Control: \HOLDW95\examples\examsubs.txt Output: \examples\ZOU571WS.TXT

 |    PROX          ACTIVE COUNT       EXTREME 5 RANGE      MAX LOGIT CHANGE  |

 | ITERATION  PERSON    ITEM  CATS     PERSON    ITEM      MEASURES  STRUCTURE|

 >=====================================<

 |        1       15      12     8       2.00    1.06       -2.0794           |

 >=====================================<

 |        2       15      12     6       2.38    1.84        2.6539   -1.6094 |

 >=====================================<

 |        3       14      12     6       2.67    1.60        2.7231     .0000 |

 >=====================================<

 |        4       14      12     6       2.68    2.33       -2.3912     .0000 |

 >=====================================<

 |        5       14      12     6       2.97    1.77        2.3246           |

 >=====================================<

 |        6       14      12     6       2.97    2.54       -2.2191           |

 >=====================================<

 |        7       14      12     6       3.22    2.10        2.0372           |

 Probing data connection: to skip out: Ctrl+F - to bypass: subset=no

 Processing unanchored persons ...

 >=====================================<

 Consolidating 9 potential subsets pairwise ...

 >=================================

 Consolidating 9 potential subsets indirectly pairwise ...

 >=====================================<

 Consolidating 8 potential subsets pairwise ...

 >=================================

 Consolidating 7 potential subsets pairwise ...

 >=================================

 Consolidating 7 potential subsets indirectly pairwise ...

 >=====================================<

 Warning: Data are ambiguously connected into 7 subsets. Measures may not be comparable across subsets.

  Subsets details are in Table 0.4

 

Table 18.1

 

         PERSON STATISTICS:  ENTRY ORDER

 

---------  ---------------------------

|ENTRY     |                         |

|NUMBER    | PERSON                  |

|--------  +-------------------------|

|     1    | 01 Extreme  11111       | MAXIMUM MEASURE

                                                < Guttman split here >

|     2    | 02 Subset 1 01111       | SUBSET 1

|     3    | 03 Subset 1 10111       | SUBSET 1

                                                < Guttman split here >

|     4    | 04 Subset 2 00101       | SUBSET 2

|     5    | 05 Subset 2 00011       | SUBSET 2

                                                < Guttman split here >

|     6    | 06 Subset 3     011     | SUBSET 3

|     7    | 07 Subset 3     011     | SUBSET 3

                                                < Guttman split here >

|     8    | 08 Subset 4     001     | SUBSET 4

|     9    | 09 Subset 4     010     | SUBSET 4

                                                < Subset split here >

|    10    | 10 Subset 5        0x1  | SUBSET 5

|    11    | 11 Subset 5        10x  | SUBSET 5 < Indirect connection >

|    12    | 12 Subset 5        x10  | SUBSET 5

                                                < Subset split here>

|    13    | 13 Subset 6           01| SUBSET 6

|    14    | 14 Subset 6           10| SUBSET 6

                                                < undetected Guttman split here: Winsteps failed! >

|    15    | 15 Subset 6           23| SUBSET 6

|    16    | 16 Subset 6           32| SUBSET 6

|--------  +-------------------------|
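
Because a Guttman split such as the one between persons 13-14 and persons 15-16 may go unreported, it can be worth checking suspect groups by hand. The following Python sketch is an assumption-laden illustration, not a Winsteps feature: the function name never_outscores is hypothetical, responses are single digits, and any non-digit character is treated as missing.

```python
def never_outscores(low, high):
    """True if no person in low scores above any person in high on a shared item.

    When this holds, there is no item on which the lower group "succeeds"
    against the upper group, so the distance between the groups is unknowable.
    """
    for l in low:
        for h in high:
            for x, y in zip(l, h):
                # Compare only items that both persons responded to
                if x.isdigit() and y.isdigit() and int(x) > int(y):
                    return False
    return True


# Persons 13-14 versus persons 15-16 on items 11-12 of the example
print(never_outscores(["01", "10"], ["23", "32"]))  # True: Guttman split
# Persons 13 and 14 compared with each other
print(never_outscores(["01"], ["10"]))              # False: they are connected
```

A result of True for two groups of persons suggests that their measures are not comparable, even when both groups appear in the same reported subset.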

 

In Tables and Notes:

Explanation:

< Guttman split here >

The persons above the split performed an unknowable amount differently from the persons below the split. No pair of items exists for which this subset succeeded on one item where another subset failed, and failed on the other item where that subset succeeded. The data are not "well-conditioned" (Fischer G.H., Molenaar, I.W. (eds.) (1995) Rasch models: foundations, recent developments, and applications. New York: Springer-Verlag. p. 41-43).

< Subset split here >

The persons in this subset responded to different items than persons in other subsets. We don't know if these items are easier or harder than items in other subsets.

< Indirect connection >

The persons responded to different items, but they are connected by a loop of successes and failures.

< undetected Guttman split here >

Winsteps subset-detection did not report that persons 13 and 14 always score lower than persons 15 and 16, causing a Guttman split. We do not know how much better persons 15 and 16 are than persons 13 and 14. Winsteps subset-detection may fail to report subsets. Unreported subsets usually cause big jumps in the reported measures.

Data are ambiguously connected

Measures for persons in different subsets are not comparable. Winsteps always reports measures, but these are only valid within subsets. We do not know how the measures for persons in one subset compare with the measures for persons in another subset.  Reliability coefficients are accidental and so is Table 20, the score-to-measure Table.  Fit statistics and standard errors are approximately correct.

Measures may not be comparable across subsets

Please always investigate when Winsteps reports subsets, even if you think that all your measures are comparable.

MAXIMUM MEASURE, MINIMUM MEASURE, DROPPED, INESTIMABLE

Persons and items with special features are not included in subsets. Extreme scores (zero, minimum possible and perfect, maximum possible scores) imply measures that are beyond the current frame of reference. Winsteps uses Bayesian logic to provide measures corresponding to those scores.

SUBSET 1, 2, 4

These are directly connected subsets. Within each subset, one person has succeeded on an item on which another person failed, and vice versa on a different item. The person performances are directly pairwise comparable within the subset. The persons in each of these subsets have either succeeded on items in other subsets, or failed on items in other subsets, or have missing data on items in other subsets.

SUBSET 3

These two persons have the same responses, so they are in the same subset.

No one both succeeded on their failed item and failed on one of their successful items.

SUBSET 5

This is an indirectly connected subset. There is a loop of successes and failures so that the performances of all three persons are connected indirectly pairwise.

SUBSET 6

Persons 13 and 14 are directly comparable using categories 0 and 1 of the rating scale. Persons 15 and 16 are directly comparable using categories 2 and 3 of the rating scale. Winsteps has not detected that persons 13 and 14 always rate lower than persons 15 and 16, causing a Guttman split.

SUBSET 7 (Table 14.1)

No person is in the same subset as this item. There is no subset in which persons both succeeded and failed on this item.

Connecting SUBSETs

Here are approaches:

1. Collect more data that links items across subsets. Please start Winsteps analysis as soon as you start data collection. Then subset problems can be remedied before data collection ends.

2. Dummy data. Include data for imaginary people in the data file that connects the subsets.

3. Anchor persons or items. Anchor equivalent items (or equivalent persons) in the different subsets to the same values, or adjust the anchor values so that the mean of each subset is the same (or meets some other criterion you choose).

4. Analyze each subset of persons and items separately. In Table 0.4, Winsteps reports entry numbers for each person and each item in each subset, so that you can compare their response strings. To analyze only the items and persons in a particular subset, such as subset 4 above, specify the items and persons in the subset:

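IDELETE= +6-7 ; items 6-7 are the items of subset 4 in Table 0.4

PDELETE= +8-9 ; persons 8-9 are the persons of subset 4 in Table 0.4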

 

Table 14.1

 

---------  -----------------

|ENTRY     |               |

|NUMBER    | ITEM        G |

|--------  +---------------|

|     1    | 01 Subset 1 D | SUBSET 1

|     2    | 02 Subset 1 D | SUBSET 1

                                      < Guttman split here >

|     3    | 03 Subset 2 D | SUBSET 2

|     4    | 04 Subset 2 D | SUBSET 2

                                      < Guttman split here >

|     5    | 05 Subset 7 D | SUBSET 7

                                      < Guttman split here >

|     6    | 06 Subset 4 D | SUBSET 4

|     7    | 07 Subset 4 D | SUBSET 4

                                      < Guttman split here >

|     8    | 08 Subset 5 D | SUBSET 5

|     9    | 09 Subset 5 D | SUBSET 5

|    10    | 10 Subset 5 D | SUBSET 5

                                      < Guttman split here >

|    11    | 11 Subset 6 R | SUBSET 6

|    12    | 12 Subset 6 R | SUBSET 6

|--------  +---------------|

 

Table 0.4 reports

 

SUBSET DETAILS

 

Subset 1 of 2 ITEM and 2 PERSON

 ITEM: 1-2

 PERSON: 2-3

Subset 2 of 2 ITEM and 2 PERSON

 ITEM: 3-4

 PERSON: 4-5

Subset 3 of 2 PERSON

 PERSON: 6-7

Subset 4 of 2 ITEM and 2 PERSON

 ITEM: 6-7

 PERSON: 8-9

Subset 5 of 3 ITEM and 3 PERSON

 ITEM: 8-10

 PERSON: 10-12

Subset 6 of 2 ITEM and 4 PERSON

 ITEM: 11-12

 PERSON: 13-16

Subset 7 of 1 ITEM

 ITEM: 5

 

Example 2: Analyzing two separate datasets together.

Dataset 1. The Russian students take the Russian items. This is connected. All the data are in one subset.

Dataset 2. The American students take the American items. This is connected. All the data are in one subset.

Dataset 3. Datasets 1 and 2 are put into one analysis. This is not connected. The data form two subsets: the Russian one and the American one. The raw scores or Rasch measures of the Russian students cannot be compared to those of the American students. For instance, if the Russian students score higher than the American students, are the Russian students more able or are the Russian items easier? The data cannot tell us which is true.
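
The situation in Dataset 3 can be illustrated with a small Python sketch that groups persons and items by who responded to what. It is a hedged illustration only: the function name response_blocks and all person and item labels are hypothetical. It checks a necessary condition (shared items), not the success-and-failure linkage that Winsteps examines.

```python
def response_blocks(answered):
    """Group persons and items into blocks linked by shared administration.

    answered maps each person label to the set of items that person took.
    Returns a list of (persons, items) pairs; more than one block means the
    design itself is disconnected.
    """
    blocks = []
    for person, items in answered.items():
        persons, pooled = {person}, set(items)
        untouched = []
        for block_persons, block_items in blocks:
            if block_items & set(items):
                # This person took an item from the block: merge them
                persons |= block_persons
                pooled |= block_items
            else:
                untouched.append((block_persons, block_items))
        blocks = untouched + [(persons, pooled)]
    return blocks


# Hypothetical labels: R = Russian, A = American
answered = {
    "R1": {"r1", "r2"}, "R2": {"r2", "r3"},
    "A1": {"a1", "a2"}, "A2": {"a2", "a3"},
}
print(len(response_blocks(answered)))  # 2: Russian block and American block

# One student who takes an item from each test links the two blocks
linked = dict(answered, L={"r3", "a1"})
print(len(response_blocks(linked)))    # 1
```

Even when this check finds a single block, the responses must still provide both successes and failures across the link for the measures to be comparable.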

 

Winsteps attempts to estimate an individual measure for each person and item within one frame of reference. Usually this succeeds, but subsets such as those described above are exceptions.


Help for Winsteps Rasch Measurement Software: www.winsteps.com. Author: John Michael Linacre

For more information, contact info@winsteps.com or use the Contact Form
 
