
Table 8.1 Rating (or partial credit) scale statistics

For each modeled scale code in Models= and Rating (or partial credit) scale= with observations found in Data=, a table is produced. The heading describes to which model the scale applies. Only columns applicable to the type of scale are output.

+----------------------------------------------------------------------------------------------------------------------------------+

|           DATA                 |  QUALITY CONTROL  |RASCH-ANDRICH|  EXPECTATION  |  MOST  |  RASCH-  | Cat| Obsd-Expd|Response   |

|      Category Counts       Cum.| Avge  Exp.  OUTFIT| Thresholds  |  Measure at   |PROBABLE| THURSTONE|PEAK|Diagnostic|Category   |

|Score Total      Used    %    % | Meas  Meas   MnSq |Measure  S.E.|Category  -0.5 |  from  |Thresholds|Prob| Residual |  Name     |

|--------------------------------+-------------------+-------------+---------------+--------+----------+----+----------+-----------|

|  0     452       378   20%  20%|  -.87  -1.03  1.2 |             |( -2.04)       |   low  |   low    |100%|      -.9 | dislike   |

|  1     620       620   34%  54%|   .13    .33   .7 |  -.85    .07|    .00   -1.17|   -.85 |  -1.00   | 54%|          | don't know|

|  2     864       852   46% 100%|  2.23   2.15  1.5 |   .85    .06|(  2.05)   1.18|    .85 |    .99   |100%|          | like      |

+---------------------------------------------------------------------(Mean)---------(Modal)--(Median)-----------------------------+

DATA

Information relating to the data

Score

Cardinal value assigned to each category, i.e., its rating.

If two Scores are shown, the second is the score after structural zeroes have been removed.

Category Counts

Total = Number of observations of this category in the analysis

Used = Number of observations that participated in the estimation (excludes extreme scores)

%

Percent of the Used responses which are in this category.

The observed probability is the "Category Counts %" divided by 100.

The probability of paired agreement by chance is the sum of the squared category probabilities, sum (probability of each category**2), across all the categories. In the Table above,

Probabilities = .20, .34, .46  (they sum to 1.0)

Probability**2 = Probability * Probability = .04, .12, .21

Sum (Probability**2) = .04 + .12 + .21 = .37 = Agreement by chance
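The chance-agreement arithmetic can be sketched in a few lines of Python. This is an illustration only, not part of Facets; the function name is hypothetical, and the counts are the Used counts from the Table above. Because it works from the unrounded counts, the result can differ in the second decimal place from hand arithmetic on the rounded percentages.

```python
# Used counts per category, copied from the example table.
used_counts = [378, 620, 852]

def chance_agreement(counts):
    """Sum of squared category proportions: the probability that two
    independent ratings fall in the same category by chance."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)

print(round(chance_agreement(used_counts), 2))
```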

Cum. %

Percent of the Used responses in or below this category.
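As a rough check on the "%" and "Cum. %" columns, the percentages can be recomputed from the Used counts. The sketch below assumes simple rounding to whole percents; the rounding rule Facets applies internally is not stated in this topic.

```python
from itertools import accumulate

used_counts = [378, 620, 852]  # Used column of the example table

def percent_columns(counts):
    """Return (percent, cumulative percent) lists, rounded to whole percents."""
    total = sum(counts)
    pct = [round(100 * c / total) for c in counts]
    cum = [round(100 * c / total) for c in accumulate(counts)]
    return pct, cum

pct, cum = percent_columns(used_counts)
print(pct, cum)
```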

QUALITY CONTROL

Information relating to the validity of the categorization.

Avge Meas

The average of the measures that are modeled to generate the observations in this category. If Average Measure does not increase with each higher category, then the category average measure is flagged with a "*", and doubt is cast on the idea that higher categories correspond to "more" of the variable.
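The ordering check on the average measures can be sketched as below. The comparison rule (flag a category whose average measure is not higher than that of the previous observed category) is an assumption made for illustration; the exact rule Facets uses may differ in detail.

```python
def flag_disordered(avg_measures):
    """Return "*" for each category whose average measure fails to
    increase over the previous observed category, else "".
    Use None for categories with no observations."""
    flags = []
    previous = None
    for m in avg_measures:
        if m is None:
            flags.append("")
            continue
        flags.append("*" if previous is not None and m <= previous else "")
        previous = m
    return flags

print(flag_disordered([-0.87, 0.13, 2.23]))   # values from the example table
print(flag_disordered([-0.87, 0.50, 0.13]))   # hypothetical disordered values
```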

Exp. Meas

The expected value of the average measure if these data fit the Rasch model.

OUTFIT MnSq

The unweighted mean-square for observations in this category.

Mean-squares have expectation of 1.0. Values much larger than 1.0 indicate unexpected observations in this category. Extreme categories have greater opportunity for large mean-squares than central categories.

The INFIT MnSq is not reported because it approximates the OUTFIT MnSq when the data are stratified by category.
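For orientation, the two mean-squares can be written out using their conventional Rasch definitions: OUTFIT is the unweighted mean of the squared standardized residuals, and INFIT is the information-weighted version. The sketch below assumes the model expectation and model variance of each observation are already available; the function names and the data are hypothetical.

```python
def outfit_mean_square(observed, expected, variance):
    """Unweighted mean of squared standardized residuals."""
    z2 = [(x - e) ** 2 / v for x, e, v in zip(observed, expected, variance)]
    return sum(z2) / len(z2)

def infit_mean_square(observed, expected, variance):
    """Information-weighted mean-square: squared residuals over summed variance."""
    return sum((x - e) ** 2 for x, e in zip(observed, expected)) / sum(variance)

# Hypothetical observations in one category, invented for illustration.
observed = [2, 2, 2, 2]
expected = [1.6, 1.2, 1.8, 0.9]
variance = [0.35, 0.50, 0.20, 0.45]
print(round(outfit_mean_square(observed, expected, variance), 2))
print(round(infit_mean_square(observed, expected, variance), 2))
```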

RASCH-ANDRICH THRESHOLDS

Step calibrations, rating scale structure

Measure

value of the Rasch-Andrich threshold: the location on the latent variable (relative to the center of the rating scale) where adjacent categories are equally probable. This is the Rasch model parameter. Use this for anchoring the rating scale, or for estimation starting-values.

S.E.

standard error of the Rasch-Andrich threshold (step calibration).

For dichotomous items, no S.E. is reported because there is no estimable threshold parameter.
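The statement that adjacent categories are equally probable at a Rasch-Andrich threshold can be checked numerically. The sketch below uses the standard rating-scale form of the Rasch model, with measures expressed relative to the center of the rating scale; the function name is hypothetical, and the thresholds are the rounded values from the example table.

```python
import math

def category_probabilities(measure, thresholds):
    """Rasch rating-scale probabilities for categories 0..m at `measure`."""
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + measure - tau)
    top = max(logits)                       # subtract the max for stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

thresholds = [-0.85, 0.85]
for k, tau in enumerate(thresholds, start=1):
    p = category_probabilities(tau, thresholds)
    print(f"at {tau:+.2f}: P({k - 1}) = {p[k - 1]:.3f}, P({k}) = {p[k]:.3f}")
```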

EXPECTATION Measure at

gives the details of the logit-to-expected-score ogive. This is the expected (mean) value of the observations for measures at this point on the latent variable relative to the rating scale.

at Category

logit measure for the expected score on the rating scale corresponding to the value in the category score column. Measures shown in parentheses, e.g., (-2.70), correspond to extreme responses. They are reported at expected responses 0.25 score points from the extreme response, i.e., halfway between the extreme response and the point 0.5 score points from it.

at -0.5

logit measure for the expected score corresponding to the value in the category score column less 0.5 score points. These can be thought of as the transition points into one expected score from the one below.
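The "at Category" and "at -0.5" values can be approximated by inverting the expected-score ogive. The sketch below assumes the standard rating-scale Rasch model and uses bisection; the function names are hypothetical. With the rounded thresholds from the example table, the results should land close to the tabled values, though not necessarily identical in the second decimal place.

```python
import math

def category_probabilities(measure, thresholds):
    """Rasch rating-scale probabilities for categories 0..m at `measure`."""
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + measure - tau)
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def expected_score(measure, thresholds):
    """Expected (mean) rating at `measure`."""
    p = category_probabilities(measure, thresholds)
    return sum(k * pk for k, pk in enumerate(p))

def measure_at_expected_score(target, thresholds, lo=-20.0, hi=20.0):
    """Invert the ogive by bisection. `target` must lie strictly between
    0 and the top category score."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if expected_score(mid, thresholds) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

thresholds = [-0.85, 0.85]
# 0.25 and 1.75 stand in for the extreme categories; 0.5 and 1.5 are the "-0.5" points.
for target in (0.25, 0.5, 1.0, 1.5, 1.75):
    print(target, round(measure_at_expected_score(target, thresholds), 2))
```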

MOST PROBABLE from

lowest measure at which this category is the one most probable to be observed. It continues to be the most probable (modal) category until a numerically higher category becomes most probable.

"low"

indicates the most probable category at the low end of the scale.

"no"

indicates this category is never the most probable to be observed for any measure.
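The "MOST PROBABLE from" column can be illustrated by scanning the latent variable and recording where each category first becomes modal. This is a sketch under the standard rating-scale Rasch model, using a simple grid search; the function names, grid range, and step size are assumptions, and the second set of thresholds is a hypothetical disordered example.

```python
import math

def category_probabilities(measure, thresholds):
    """Rasch rating-scale probabilities for categories 0..m at `measure`."""
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + measure - tau)
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def most_probable_from(thresholds, lo=-6.0, hi=6.0, step=0.01):
    """For each category: "low" if it is modal at the low end of the grid,
    "no" if it is never modal, else the first grid measure at which it is modal."""
    n_cat = len(thresholds) + 1
    first_index = [None] * n_cat
    n_steps = int(round((hi - lo) / step))
    for i in range(n_steps + 1):
        p = category_probabilities(lo + i * step, thresholds)
        modal = max(range(n_cat), key=p.__getitem__)
        if first_index[modal] is None:
            first_index[modal] = i
    result = []
    for i in first_index:
        if i is None:
            result.append("no")
        elif i == 0:
            result.append("low")
        else:
            result.append(round(lo + i * step, 2))
    return result

print(most_probable_from([-0.85, 0.85]))   # ordered thresholds from the example table
print(most_probable_from([0.85, -0.85]))   # hypothetical disordered thresholds
```

With ordered thresholds, each category becomes modal at approximately its own Rasch-Andrich threshold, which is why the two columns agree in the example table.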

RASCH-THURSTONE Thresholds

measure at which the probability of being rated in this category or above equals the probability of being rated in any of the categories below, i.e., both are .5. This is the 50% (median) cumulative probability threshold.
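A Rasch-Thurstone threshold can be located numerically as the measure at which the cumulative probability of "this category or above" reaches .5. The sketch below assumes the standard rating-scale Rasch model and uses bisection; the function names are hypothetical. With the rounded Rasch-Andrich thresholds from the example table, the results should fall close to the tabled values.

```python
import math

def category_probabilities(measure, thresholds):
    """Rasch rating-scale probabilities for categories 0..m at `measure`."""
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + measure - tau)
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def prob_at_or_above(k, measure, thresholds):
    """Probability of being rated in category k or higher."""
    return sum(category_probabilities(measure, thresholds)[k:])

def thurstone_threshold(k, thresholds, lo=-20.0, hi=20.0):
    """Measure at which P(rating >= k) = .5, found by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if prob_at_or_above(k, mid, thresholds) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

thresholds = [-0.85, 0.85]
for k in (1, 2):
    print(k, round(thurstone_threshold(k, thresholds), 2))
```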

Cat PEAK Prob

The largest percentage probability this category has of being observed at any measure. Extreme categories have a maximum probability of 100% at the extremes of the measurement continuum. Intermediate categories have their peak probabilities when the expected response value is numerically equal to the intermediate category's response value, the "at Category" value.
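The peak probabilities can be approximated with a grid search over the latent variable. This is a sketch under the standard rating-scale Rasch model; the function names and grid settings are assumptions. For the thresholds in the example table, the intermediate category should peak near 54%, at the measure where the expected score equals 1.

```python
import math

def category_probabilities(measure, thresholds):
    """Rasch rating-scale probabilities for categories 0..m at `measure`."""
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + measure - tau)
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def peak_probabilities(thresholds, lo=-10.0, hi=10.0, steps=2000):
    """Largest probability each category attains anywhere on the grid."""
    peaks = [0.0] * (len(thresholds) + 1)
    for i in range(steps + 1):
        measure = lo + (hi - lo) * i / steps
        for k, pk in enumerate(category_probabilities(measure, thresholds)):
            peaks[k] = max(peaks[k], pk)
    return peaks

peaks = peak_probabilities([-0.85, 0.85])
print([f"{100 * p:.0f}%" for p in peaks])
```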

Obsd-Expd Diagnostic Residual

This column is produced only when the difference between the observed count of responses and the expected count, based on the Rasch measures, is greater than 0.5 for some category. This can be due to

i) lack of convergence: set smaller values in Convergence=

ii) anchor values incompatible with the data

iii) responses do not match the specified scale structure, e.g., Poisson counts.

iv) contradictory modeling, e.g., models = ?,?,#,#,R6 can imply contradictory estimates for elements.
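The diagnostic residual can be pictured as the observed count of a category minus the sum, over all observations, of the model probability of that category. The sketch below assumes the standard rating-scale Rasch model and that a combined logit measure is available for each observation; the function names and the data are hypothetical, and Facets' own computation may differ in detail.

```python
import math

def category_probabilities(measure, thresholds):
    """Rasch rating-scale probabilities for categories 0..m at `measure`."""
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + measure - tau)
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def observed_minus_expected(observations, thresholds):
    """observations: (measure, observed_category) pairs.
    Returns observed count minus expected count for each category."""
    n_cat = len(thresholds) + 1
    observed = [0] * n_cat
    expected = [0.0] * n_cat
    for measure, category in observations:
        observed[category] += 1
        for k, pk in enumerate(category_probabilities(measure, thresholds)):
            expected[k] += pk
    return [o - e for o, e in zip(observed, expected)]

# Hypothetical observations, invented for illustration only.
observations = [(-1.5, 0), (-0.4, 1), (0.2, 1), (0.9, 2), (2.1, 2)]
residuals = observed_minus_expected(observations, [-0.85, 0.85])
print([round(r, 2) for r in residuals])
print("report column:", any(abs(r) > 0.5 for r in residuals))
```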

Response Category Name =

name of category from Rating (or partial credit) scale= specification

Optimizing Rating-Scale Categorization: When/How to Collapse Categories for Better Measurement

Classical Test Theory says "The categorization with the highest person 'test' reliability is the best". We can evaluate this by looking at the Reliability of the person facet in Facets Table 7. A similar investigation is reported at www.rasch.org/rmt/rmt101k.htm

Rasch Theory says "Each advancing category of the rating scale corresponds to one higher qualitative level of performance." We can evaluate this by looking at the "Avge Meas" (Average Measure) for the rating scale in Facets Table 8.1. The average measures should advance and be close to their "Exp. Meas" (Expected Measures). Collapse together categories whose Average Measures are disordered or very close together. Also, to avoid accidents in the data biasing results, we want no category to have fewer than 10 ratings. We also like to see that each category has reasonable fit statistics. If you intend to make inferences at the category level (as opposed to the overall score/measure level) then the Rasch-Andrich Thresholds should also advance.
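These guidelines can be turned into a simple screening routine. The sketch below is an illustration only: the function name is hypothetical, the minimum of 10 ratings comes from the text above, and "advance" is read as strictly increasing. It does not judge fit statistics or how close the average measures are to their expected values.

```python
def categorization_warnings(counts, avg_measures, thresholds, min_count=10):
    """Return a list of guideline violations for one rating scale."""
    warnings = []
    for k, count in enumerate(counts):
        if count < min_count:
            warnings.append(f"category {k}: only {count} ratings")
    for k in range(1, len(avg_measures)):
        if avg_measures[k] <= avg_measures[k - 1]:
            warnings.append(f"category {k}: average measure does not advance")
    for k in range(1, len(thresholds)):
        if thresholds[k] <= thresholds[k - 1]:
            warnings.append(f"threshold {k + 1}: Rasch-Andrich threshold does not advance")
    return warnings

# Values from the example table: no guideline is violated.
print(categorization_warnings([378, 620, 852], [-0.87, 0.13, 2.23], [-0.85, 0.85]))
# Hypothetical scale with a sparse category and disordered average measures.
print(categorization_warnings([378, 5, 852], [-0.87, 1.10, 0.90], [0.40, -0.40]))
```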

Unobserved Categories: Structural Zeroes or Incidental (Sampling) Zeroes

Structural zero: the category cannot be observed, and is omitted from the qualitative levels. (The default.)

"Category" shows the category number. "Score" shows the value used for analysis.

Model = ?,?,R3

+----------------------------------------------------------------------------------------------------------------+

|         DATA             |  QUALITY CONTROL  |RASCH-ANDRICH|  EXPECTATION  |  MOST  |.5 Cumultv| Cat| Obsd-Expd|

| Category     Counts  Cum.| Avge  Exp.  OUTFIT| THRESHOLDS  |  Measure at   |PROBABLE|Probabilty|PEAK|Diagnostic|

|   Score    Used   %    % | Meas  Meas   MnSq |Measure  S.E.|Category  -0.5 |  from  |    at    |Prob| Residual |

|--------------------------+-------------------+-------------+---------------+--------+----------+----+----------|

|  0   0      378  20%  20%|  -.87  -1.03  1.2 |             |( -2.04)       |   low  |   low    |100%|      -.9 |

|  1   1      620  34%  54%|   .13    .33   .7 |  -.85    .07|    .00   -1.17|   -.85 |  -1.00   | 54%|          |

|  2                       |                   |             |               |        |          |    |          |

|  3   2      852  46% 100%|  2.23   2.15  1.5 |   .85    .06|(  2.05)   1.18|    .85 |    .99   |100%|          |

+---------------------------------------------------------------(Mean)---------(Modal)--(Median)-----------------+

Incidental (sampling) zero: the category can be observed (but was not in this dataset). It is included in the qualitative levels. (Keep.) The "Category Score" is the value used for analysis.

Model = ?,?,R3K   <= K means "Keep unobserved intermediate categories"

+------------------------------------------------------------------------------------------------------------+

|      DATA            |  QUALITY CONTROL  |RASCH-ANDRICH|  EXPECTATION  |  MOST  |.5 Cumultv| Cat| Obsd-Expd|

| Category Counts  Cum.| Avge  Exp.  OUTFIT| THRESHOLDS  |  Measure at   |PROBABLE|Probabilty|PEAK|Diagnostic|

|Score   Used   %    % | Meas  Meas   MnSq |Measure  S.E.|Category  -0.5 |  from  |    at    |Prob| Residual |

|----------------------+-------------------+-------------+---------------+--------+----------+----+----------|

|  0      378  20%  20%|  -.68   -.74  1.2 |             |( -1.99)       |   low  |   low    |100%|      1.0 |

|  1      620  34%  54%|  -.11   -.06   .6 |  -.90    .07|   -.23   -1.09|   -.90 |   -.95   | 56%|      -.7 |

|  2        0   0%  54%|                   |             |    .63     .24|        |    .55   |  0%|          |

|  3      852  46% 100%|  1.35   1.34  1.7 |   .90    .07|(  1.50)   1.10|    .90 |    .55   |100%|          |

+-----------------------------------------------------------(Mean)---------(Modal)--(Median)-----------------+
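The difference between the two treatments of an unobserved category comes down to how category numbers are mapped to analysis scores. The sketch below mirrors the two tables above: by default the observed categories are renumbered consecutively, while "keep" retains every category between the lowest and highest observed. The function name and its arguments are hypothetical, and only the renumbering is shown, not the estimation.

```python
def category_scores(observed_categories, keep=False):
    """Map category numbers to the scores used for analysis.

    keep=False: unobserved categories are treated as structural zeroes
                and dropped, so observed categories are scored 0, 1, 2, ...
    keep=True:  unobserved intermediate categories are treated as
                incidental zeroes and keep their own scores.
    """
    observed = sorted(set(observed_categories))
    if keep:
        return {c: c for c in range(observed[0], observed[-1] + 1)}
    return {c: score for score, c in enumerate(observed)}

ratings = [0, 1, 3, 3, 1, 0, 3]           # category 2 never observed
print(category_scores(ratings))             # as with Model = ?,?,R3
print(category_scores(ratings, keep=True))  # as with Model = ?,?,R3K
```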

Equivalence of Facets Table 8 with Winsteps Table 3.2

Facets Table 8

+------------------------------------------------------------------------------------------------------------------------+

|      DATA            |  QUALITY CONTROL  |RASCH-ANDRICH|  EXPECTATION  |  MOST  |  RASCH-  | Cat| Obsd-Expd|Response   |

| Category Counts  Cum.| Avge  Exp.  OUTFIT| Thresholds  |  Measure at   |PROBABLE| THURSTONE|PEAK|Diagnostic|Category   |

|Score   Used   %    % | Meas  Meas   MnSq |Measure  S.E.|Category  -0.5 |  from  |Thresholds|Prob| Residual |  Name     |

|----------------------+-------------------+-------------+---------------+--------+----------+----+----------+-----------|

|  0      378  20%  20%|  -.87  -1.03  1.2 |             |( -2.04)       |   low  |   low    |100%|      -.9 | dislike   |

|  1      620  34%  54%|   .13    .33   .7 |  -.85    .07|    .00   -1.17|   -.85 |  -1.00   | 54%|          | don't know|

|  2      852  46% 100%|  2.23   2.15  1.5 |   .85    .06|(  2.05)   1.18|    .85 |    .99   |100%|          | like      |

+-----------------------------------------------------------(Mean)---------(Modal)--(Median)-----------------------------+

Winsteps Table 3.2

-------------------------------------------------------------------

|CATEGORY   OBSERVED|OBSVD SAMPLE|INFIT OUTFIT||STRUCTURE|CATEGORY|

|LABEL SCORE COUNT %|AVRGE EXPECT|  MNSQ  MNSQ||CALIBRATN| MEASURE|

|-------------------+------------+------------++---------+--------|

|  0   0     378  20|  -.87 -1.03|  1.08  1.19||  NONE   |( -2.07)| 0 Dislike

|  1   1     620  34|   .13   .33|   .85   .69||    -.86 |    .00 | 1 Neutral

|  2   2     852  46|  2.24  2.16|  1.00  1.47||     .86 |(  2.07)| 2 Like

-------------------------------------------------------------------

---------------------------------------------------------------------------

|CATEGORY    STRUCTURE   |  SCORE-TO-MEASURE   | 50% CUM.| COHERENCE|ESTIM|

| LABEL    MEASURE  S.E. | AT CAT. ----ZONE----|PROBABLTY| M->C C->M|DISCR|

|------------------------+---------------------+---------+----------+-----|

|   0      NONE          |( -2.07) -INF   -1.19|         |  62%  42%|     | 0 Dislike

|   1        -.86    .07 |    .00  -1.19   1.19|   -1.00 |  54%  71%|  .73| 1 Neutral

|   2         .86    .06 |(  2.07)  1.19  +INF |    1.00 |  85%  78%| 1.19| 2 Like

---------------------------------------------------------------------------

Winsteps field:                          Facets field:
---------------------------------------  ---------------------------------
CATEGORY LABEL                           Category Score
CATEGORY SCORE                           Category Score
COUNT                                    Used
%                                        %
...                                      Cum. %
OBSVD AVRGE                              Avge Meas
SAMPLE EXPECT                            Exp. Meas
INFIT MNSQ                               ...
OUTFIT MNSQ                              OUTFIT MnSq
STRUCTURE CALIBRATN                      RASCH-ANDRICH Thresholds Measure
*CATEGORY MEASURE                        EXPECTATION Measure at Category
*STRUCTURE MEASURE                       RASCH-ANDRICH Thresholds Measure
STRUCTURE S.E.                           RASCH-ANDRICH Thresholds S.E.
*SCORE-TO-MEASURE AT CAT.                EXPECTATION Measure at Category
*SCORE-TO-MEASURE --ZONE--               EXPECTATION Measure at -0.5
*50% CUM. PROBABLTY                      RASCH-THURSTONE Thresholds
COHERENCE M->C                           ...
COHERENCE C->M                           ...
ESTIM DISCR                              ...
...                                      MOST PROBABLE from
...                                      Cat PEAK Prob
OBSERVED-EXPECTED RESIDUAL DIFFERENCE    Obsd-Expd Diagnostic Residual
(text to right of table)                 Response Category Name

* = In Winsteps only, includes item difficulty for Partial Credit model

Help for Facets Rasch Measurement Software: www.winsteps.com Author: John Michael Linacre.
