# Reliability - separation - strata

(Separation) Reliability and Strata

These statistics report whether the elements of a facet are "reliably different". They are the opposite of inter-rater reliability statistics, which are intended to report "reliably the same."

The reported "Separation" Reliability is the Rasch equivalent of the KR-20 or Cronbach Alpha "test reliability" statistic, i.e., the ratio of "True variance" to "Observed variance" for the elements of the facet. It shows how reproducible the ordering of the measures is. It may or may not indicate how "good" the test is in other respects. High (near 1.0) person and item reliabilities are preferred. This "separation" reliability is somewhat the opposite of an inter-rater reliability, so low (near 0.0) judge and rater separation reliabilities are preferred.

Since the "true" variance of a sample can never be known, but only approximated, the "true" reliability can also only be approximated. All reported reliabilities, such as KR-20, Cronbach Alpha, and the Separation Reliability, are only approximations. These approximations are all attempts to compute:

"Separation" Reliability = True Variance / Observed Variance
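
As a minimal sketch of how this ratio is commonly approximated from a facet's measures and their standard errors, the Python below takes the observed variance of the measures, subtracts the mean-square measurement error to approximate the "true" variance, and divides. The function name, the example numbers, and the choice of population variance are illustrative assumptions, not Facets output or Facets' internal code.

```python
from statistics import pvariance


def separation_reliability(measures, standard_errors):
    """Approximate True Variance / Observed Variance for one facet.

    Assumption: true variance is approximated as the observed variance
    of the measures minus the mean of the squared standard errors.
    """
    observed_variance = pvariance(measures)  # population variance (assumed)
    error_variance = sum(se ** 2 for se in standard_errors) / len(standard_errors)
    # The approximated true variance cannot be negative.
    true_variance = max(observed_variance - error_variance, 0.0)
    if observed_variance == 0:
        return 0.0
    return true_variance / observed_variance


# Hypothetical measures (logits) and standard errors for five elements.
measures = [-1.5, -0.5, 0.0, 0.5, 1.5]
standard_errors = [0.5, 0.4, 0.4, 0.4, 0.5]

reliability = separation_reliability(measures, standard_errors)
print(round(reliability, 3))
```

With these invented numbers the observed variance is 1.0 and the error variance is 0.196, so the approximated reliability is about 0.80.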

Facets computes upper and lower boundary values for the region in which the true reliability lies. When SE=Model, the upper boundary, the "Model" reliability, is computed on the basis that all unexpectedness in the data is Rasch-predicted randomness.

When SE=Real, the lower boundary, the "Real" reliability, is computed on the basis that all unexpectedness in the data contradicts the Rasch model. The unknowable True reliability generally lies somewhere between these two. As contradictory sources of noise are removed from the data, the reported Model and Real reliabilities become closer, and the True Reliability approaches the Model Reliability.

The "model" reliability is based on the model standard errors, which are computed on the basis that all superfluous unexpectedness in the data is the randomness predicted by the Rasch model.

The "real" reliability is based on the hypothesis that superfluous randomness in the data contradicts the Rasch model:

Real S.E. = Model S.E. *  sqrt(Max(INFIT MnSq, 1))
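
The inflation rule above can be sketched in a few lines of Python. The function name and the example values are hypothetical, chosen only to show that an INFIT mean-square at or below 1 leaves the standard error unchanged, while a mean-square above 1 enlarges it.

```python
import math


def real_standard_error(model_se, infit_mnsq):
    """Real S.E. = Model S.E. * sqrt(max(INFIT MnSq, 1))."""
    return model_se * math.sqrt(max(infit_mnsq, 1.0))


model_se = 0.40  # hypothetical model standard error, in logits

# INFIT MnSq below 1: no inflation, the Real S.E. equals the Model S.E.
unchanged = real_standard_error(model_se, 0.80)

# INFIT MnSq above 1: the Real S.E. is inflated by sqrt(1.44) = 1.2.
inflated = real_standard_error(model_se, 1.44)

print(unchanged, round(inflated, 2))
```

Because the Real S.E. is never smaller than the Model S.E., a reliability computed from Real standard errors cannot exceed the one computed from Model standard errors.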

Conventionally, only a Person Reliability is reported and called the "test reliability". Facets reports separation reliabilities for all facets. Separation reliability is estimated on the premise that the elements are locally independent, specifically, that raters are acting as "independent experts", not as "scoring machines". When the raters act as "scoring machines", Facets overestimates reliability. It would be the same as running MCQ bubble sheets twice through an optical scanner, so doubling the number of "items" per person, and then claiming that we had increased test reliability! To assist in identifying this situation, Facets reports to what extent the raters are acting as "independent experts", an aspect of inter-rater reliability: see Table 7 Agreement Statistics.

Separation = True S.D. / Average measurement error

This estimates the number of statistically distinguishable levels of performance in a normally distributed sample with the same "true S.D." as the empirical sample, when the tails of the normal distribution are modeled as due to measurement error. See www.rasch.org/rmt/rmt94n.htm

Strata = (4*Separation + 1)/3

This estimates the number of statistically distinguishable levels of performance in a normally distributed sample with the same "true S.D." as the empirical sample, when the tails of the normal distribution are modeled as extreme "true" levels of performance. See www.rasch.org/rmt/rmt163f.htm

So, if the sample separation is 2, then the strata are (4*2+1)/3 = 3.

Separation = 2: The test is able to statistically distinguish between high and low performers.

Strata = 3: The test is able to statistically distinguish between very high, middle and very low performers.
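
The separation and strata arithmetic can be reproduced with a short Python sketch. The function names and the input values (a true S.D. of 1.0 and an average measurement error of 0.5) are hypothetical. The last function uses the standard identity Reliability = Separation^2 / (1 + Separation^2), which holds under the assumption that the "average measurement error" is the root-mean-square standard error, so that observed variance equals true variance plus error variance.

```python
def separation(true_sd, average_error):
    """Separation = True S.D. / Average measurement error."""
    return true_sd / average_error


def strata(sep):
    """Strata = (4 * Separation + 1) / 3."""
    return (4 * sep + 1) / 3


def reliability_from_separation(sep):
    """Assumes the average error is the root-mean-square standard error."""
    return sep ** 2 / (1 + sep ** 2)


# Hypothetical inputs, in logits.
sep = separation(true_sd=1.0, average_error=0.5)
levels = strata(sep)
rel = reliability_from_separation(sep)

print(sep, levels, rel)
```

With these inputs the separation is 2 and the strata are 3, matching the worked example, and the corresponding reliability is 0.8.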

Strata vs. Separation: the choice depends on the nature of the measure distribution.

Statistically:

If the distribution is hypothesized to be normal, then use separation.

If the distribution is hypothesized to be heavy-tailed, then use strata.

Substantively:

If very high and very low scores are probably due to accidental circumstances, then use separation.

If very high and very low scores are probably due to very high and very low abilities, then use strata.

If in doubt, assume that outliers are accidental, and use separation.

Help for Facets Rasch Measurement Software: www.winsteps.com Author: John Michael Linacre.
