Reliability - separation - strata
(Separation) Reliability and Strata
These statistics report whether the elements of a facet are "reliably different". They are the opposite of inter-rater reliability statistics, which are intended to report "reliably the same."
The reported "Separation" Reliability is the Rasch equivalent of the KR-20 or Cronbach Alpha "test reliability" statistic, i.e., the ratio of "True variance" to "Observed variance" for the elements of the facet. It shows how reproducible the ordering of the measures is. This may or may not indicate how "good" the test is in other respects. High (near 1.0) person and item reliabilities are preferred. Because "separation" reliability is somewhat the opposite of an inter-rater reliability, low (near 0.0) judge and rater separation reliabilities are preferred.
Since the "true" variance of a sample can never be known, but only approximated, the "true" reliability can also only be approximated. All reported reliabilities, such as KR-20, Cronbach Alpha and the Separation Reliability, are only approximations. These approximations are all attempts to compute:
"Separation" Reliability = True Variance / Observed Variance
Facets computes upper and lower boundary values for the region in which the true reliability lies. When SE=Model, the upper boundary, the "Model" reliability, is computed on the basis that all unexpectedness in the data is Rasch-predicted randomness.
When SE=Real, the lower boundary, the "Real" reliability, is computed on the basis that all unexpectedness in the data contradicts the Rasch model. The unknowable True reliability generally lies somewhere between these two. As contradictory sources of noise are removed from the data, the reported Model and Real reliabilities become closer, and the True Reliability approaches the Model Reliability.
The "model" reliability is based on the model standard errors, which are computed on the basis that all superfluous unexpectedness in the data is the randomness predicted by the Rasch model.
The "real" reliability is based on the hypothesis that superfluous randomness in the data contradicts the Rasch model:
Real S.E. = Model S.E. * sqrt(Max(INFIT MnSq, 1))
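The Model and Real boundaries can be illustrated with a minimal sketch. It assumes that the "True variance" is approximated as the observed variance of the measures minus the mean squared standard error, and that the observed variance is the population variance; the function names and the data values are hypothetical and are not taken from Facets output.

```python
import math
from statistics import pvariance


def real_se(model_se, infit_mnsq):
    """Real S.E. = Model S.E. * sqrt(Max(INFIT MnSq, 1))."""
    return model_se * math.sqrt(max(infit_mnsq, 1.0))


def separation_reliability(measures, std_errors):
    """Approximate True Variance / Observed Variance for one facet."""
    # Assumption: observed variance is the population variance of the measures.
    observed_var = pvariance(measures)
    # Assumption: error variance is the mean of the squared standard errors.
    error_var = sum(se ** 2 for se in std_errors) / len(std_errors)
    # Error variance can exceed observed variance; floor the result at zero.
    true_var = max(observed_var - error_var, 0.0)
    return true_var / observed_var if observed_var > 0 else 0.0


# Made-up element measures (logits), model S.E.s and INFIT mean-squares.
measures = [-1.5, -0.5, 0.0, 0.5, 1.5]
model_ses = [0.5, 0.5, 0.5, 0.5, 0.5]
infit_mnsqs = [0.8, 1.0, 1.44, 1.0, 0.9]

# Only the element with INFIT MnSq above 1 has its S.E. inflated.
real_ses = [real_se(se, fit) for se, fit in zip(model_ses, infit_mnsqs)]

model_reliability = separation_reliability(measures, model_ses)
real_reliability = separation_reliability(measures, real_ses)

print(f"Model reliability (upper boundary): {model_reliability:.3f}")
print(f"Real reliability (lower boundary):  {real_reliability:.3f}")
```

With these made-up values only the misfitting element (INFIT MnSq = 1.44) has a larger Real S.E., so the Real reliability falls slightly below the Model reliability.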
Conventionally, only a Person Reliability is reported and called the "test reliability". Facets reports separation reliabilities for all facets. Separation reliability is estimated on the premise that the elements are locally independent, specifically, that raters are acting as "independent experts", not as "scoring machines". When the raters do act as "scoring machines", Facets overestimates reliability. It would be the same as running MCQ bubble sheets twice through an optical scanner, thereby doubling the number of "items" per person, and then claiming that we had increased test reliability! To assist in identifying this situation, Facets reports to what extent the raters are acting as "independent experts", an aspect of inter-rater reliability; see Table 7 Agreement Statistics.
Separation = True S.D. / Average measurement error
This estimates the number of statistically distinguishable levels of performance in a normally distributed sample with the same "true S.D." as the empirical sample, when the tails of the normal distribution are modeled as due to measurement error. www.rasch.org/rmt/rmt94n.htm
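As a rough illustration, separation can be computed from a set of measures and their standard errors. This sketch assumes that "average measurement error" means the root-mean-square standard error and that the true variance is the observed (population) variance minus the mean squared error; the function names and data are hypothetical. Under those assumptions, separation G and reliability R are linked by R = G^2 / (1 + G^2).

```python
import math
from statistics import pvariance


def separation(measures, std_errors):
    """Separation = True S.D. / root-mean-square standard error."""
    observed_var = pvariance(measures)  # assumption: population variance
    error_var = sum(se ** 2 for se in std_errors) / len(std_errors)
    true_sd = math.sqrt(max(observed_var - error_var, 0.0))
    rmse = math.sqrt(error_var)
    return true_sd / rmse


def reliability_from_separation(g):
    """Reliability implied by a separation index g."""
    return g * g / (1.0 + g * g)


# Made-up measures (logits) and standard errors.
measures = [-1.5, -0.5, 0.0, 0.5, 1.5]
std_errors = [0.5, 0.5, 0.5, 0.5, 0.5]

g = separation(measures, std_errors)
print(f"Separation:  {g:.3f}")
print(f"Reliability: {reliability_from_separation(g):.3f}")
```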
Strata = (4*Separation + 1)/3
This estimates the number of statistically distinguishable levels of performance in a normally distributed sample with the same "true S.D." as the empirical sample, when the tails of the normal distribution are modeled as extreme "true" levels of performance. www.rasch.org/rmt/rmt163f.htm
So, if the sample separation is 2, then the strata are (4*2+1)/3 = 3.
Separation = 2: The test is able to statistically distinguish between high and low performers.
Strata = 3: The test is able to statistically distinguish between very high, middle and very low performers.
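The conversion from separation to strata is simple enough to check directly. The following is a small sketch of the formula Strata = (4*Separation + 1)/3; the function name is an illustrative choice.

```python
def strata(separation):
    """Strata = (4 * Separation + 1) / 3."""
    return (4.0 * separation + 1.0) / 3.0


# Tabulate strata for a few illustrative separation values.
for sep in (1.0, 2.0, 3.0):
    print(f"Separation = {sep:.1f} -> Strata = {strata(sep):.2f}")
```

For a separation of 2 this reproduces the 3 strata of the worked example.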
Strata vs. Separation: this depends on the nature of the measure distribution.
If it is hypothesized to be normal, then separation.
If it is hypothesized to be heavy-tailed, then strata.
If very high and very low scores are probably due to accidental circumstances, then separation.
If very high and very low scores are probably due to very high and very low abilities, then strata.
If in doubt, assume that outliers are accidental, and use separation.
Help for Facets Rasch Measurement Software: www.winsteps.com Author: John Michael Linacre.