Rater misbehavior
This is Help for 32-bit Facets 3.87. Help for 64-bit Facets 4 is available separately.
The fit statistics in Facets help us to detect many types of misbehavior. For instance, central tendency usually makes raters too predictable, so that their infit and outfit mean-square statistics are noticeably less than 1.0. Also, if you model each rater to have a unique rating scale, by using # instead of ? for the rater facet in the Models= specification, then you will see in Table 8 that the rater has an unusually high number of ratings in the central categories.
Is there too much rater bias identified? Are there some persons with idiosyncratic profiles who should be eliminated before the rater bias analysis is taken too seriously?
You also need to identify how big the bias has to be before it makes a substantive difference. Perhaps "Obs-Exp Average Difference" needs to be at least 0.5 score-points.
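A minimal sketch of this screening idea, with hypothetical data: Facets reports "Obs-Exp Average Difference" from the Rasch model itself; here the expected rating is only approximated by the mean of the other raters' ratings, so this is a rough illustration, not Facets' computation.

```python
# Hypothetical ratings: ratings[rater][person]. Facets derives expected
# ratings from the Rasch model; this sketch approximates each expectation
# by the mean of the OTHER raters' ratings of the same person.
ratings = {
    "R1": {"P1": 4, "P2": 3, "P3": 4},
    "R2": {"P1": 4, "P2": 3, "P3": 4},
    "R3": {"P1": 4, "P2": 3, "P3": 4},
    "R4": {"P1": 3, "P2": 2, "P3": 3},  # one score-point more severe
}

def obs_exp_difference(ratings, rater):
    """Average (observed - expected) score-point difference for one rater."""
    diffs = []
    for person, observed in ratings[rater].items():
        others = [r[person] for name, r in ratings.items() if name != rater]
        expected = sum(others) / len(others)
        diffs.append(observed - expected)
    return sum(diffs) / len(diffs)

# Flag raters whose average difference reaches the 0.5 score-point criterion.
flagged = {r: round(obs_exp_difference(ratings, r), 2)
           for r in ratings
           if abs(obs_exp_difference(ratings, r)) >= 0.5}
```

With these data, only R4 crosses the 0.5 score-point threshold; whether 0.5 is the right substantive criterion depends on your rating scale and stakes.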
Then you have to decide what type of rater agreement you want.
Do you want the raters to agree exactly with each other on the ratings awarded? Monitor the "rater agreement %".
Do you want the raters to agree about which performances are better and which are worse? Examine the correlations between raters' ratings.
Do you want the raters to have the same leniency/severity? Examine "1 - Separation Reliability" or the "Fixed Chi-squared".
Do you want the raters to behave like independent experts? Examine the Rasch fit statistics.
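As a sketch of how these notions of agreement differ, here are simplified versions of the first three checks on two hypothetical raters. Facets reports its own versions of these statistics (agreement % in Table 7, reliability and chi-squared in the facet summary); this only illustrates the distinctions.

```python
# Two hypothetical raters rating the same six performances.
rater_a = [3, 4, 2, 5, 3, 4]
rater_b = [3, 4, 3, 5, 2, 4]

# 1. Exact agreement %: how often the identical rating is awarded.
exact_agreement = 100 * sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

# 2. Correlation: do the raters rank the performances the same way,
#    even if the ratings themselves differ?
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

correlation = pearson(rater_a, rater_b)

# 3. Same leniency/severity: compare overall mean ratings. Facets tests
#    this formally via "1 - Separation Reliability" or the "Fixed Chi-squared".
mean_difference = sum(rater_a) / len(rater_a) - sum(rater_b) / len(rater_b)
```

Here the raters agree exactly on only 4 of 6 ratings, yet correlate strongly and have identical mean severity: the three criteria can disagree, which is why you must decide which kind of agreement you actually want.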
Numerous types of rater misbehavior are identified in the literature. Here are some approaches to identifying them. Please notify us if you discover useful ways to identify misbehavior.
A suggested procedure:
(a) Model all raters to share a common understanding of the rating scale:
Models = ?,?,?,R9 ; the model for your facets and rating scale
Inter-rater = 2 ; 2 or whatever is the number of your rater facet
In the rater facet report (Table 7):
How much difference in rater severity/leniency is reported? Are there outliers?
Are rater fit statistics homogeneous?
Does inter-rater agreement indicate "scoring machines" or "independent experts"?
In the rating scale report (Table 8):
Is overall usage of the categories as expected?
(b) Model each rater to have a personal understanding of the rating scale:
Models = ?,#,?,R9 ; # marks the rater facet
Inter-rater = 2 ; 2 or whatever is the number of your rater facet
In the rating scale report (Table 8):
For each rater: is overall usage of the categories as expected?
Are there specific problems, e.g., high- or low-frequency categories, unobserved categories, or disordered average category measures?
(c) Look for rater-item interactions, and rater-demographic interactions:
Models =
?,?B,?,?B,R9 ; Facet 4 is a demographic facet (e.g., gender): rater-demographic interaction (bias)
?,?B,?B,?,R9 ; Facet 3 is the items: rater-item interaction (bias)
*
In the bias/interaction report (Table 14):
Are any raters showing large and statistically significant interactions?
Known rater misbehaviors:
1. Leniency/Severity/Generosity.
This is usually parameterized directly in the "Rater" facet, and measures are automatically adjusted for it.
2. Extremism/Central Tendency.
Tending to award ratings in the extreme, or in the central, categories.
This can be identified by modeling each rater to have a separate rating scale (or partial credit). Those with very low central probabilities exhibit extremism. Those with very high central probabilities exhibit central tendency.
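A rough screen for this pattern, with hypothetical data: compare the share of each rater's ratings that fall in the central categories. Facets identifies this properly through the per-rater category probabilities of the # model; the thresholds below (0.8 and 0.2) are illustrative assumptions, not Facets criteria.

```python
from collections import Counter

def central_share(ratings, low=1, high=9):
    """Share of ratings falling in the middle third of a low..high scale."""
    span = high - low + 1
    mid_lo = low + span // 3   # category 4 on a 1-9 scale
    mid_hi = high - span // 3  # category 6 on a 1-9 scale
    counts = Counter(ratings)
    middle = sum(n for cat, n in counts.items() if mid_lo <= cat <= mid_hi)
    return middle / len(ratings)

def classify(ratings, hi_share=0.8, lo_share=0.2):
    """Flag raters whose central-category usage is extreme (thresholds assumed)."""
    share = central_share(ratings)
    if share > hi_share:
        return "central tendency?"
    if share < lo_share:
        return "extremism?"
    return "no flag"
```

For example, a rater awarding only 4s, 5s, and 6s on a 9-category scale is a central-tendency suspect, while one awarding only 1s and 9s is an extremism suspect.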
3. Halo/"Carry Over" Effects.
One attribute biases ratings with respect to other attributes. This requires that we know the order in which ratings are assigned for each person. If we know this, then we can measure all the persons and raters using only the rating of the first item rated. Then anchor everything, including anchoring the other items at the difficulty of the first item. In this anchored analysis of all the data, the raters with the lowest mean-squares are the ones most likely to have a halo effect.
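A first-pass halo screen when the rating order is unknown, with hypothetical data: a halo-prone rater gives nearly the same rating across all items for each person, so the within-person spread of that rater's ratings is unusually small. The anchored Facets analysis described above is the principled method; this within-person-range check is only a quick substitute.

```python
def mean_within_person_range(ratings):
    """ratings: {person: [rating on item 1, item 2, ...]} for one rater.
    Returns the average (max - min) rating spread across that rater's persons."""
    ranges = [max(items) - min(items) for items in ratings.values()]
    return sum(ranges) / len(ranges)

# Hypothetical raters, three persons rated on three items each.
halo_rater = {"P1": [4, 4, 4], "P2": [2, 2, 3], "P3": [5, 5, 5]}
normal_rater = {"P1": [2, 4, 5], "P2": [1, 3, 4], "P3": [3, 5, 6]}
```

Here the halo suspect's average within-person spread is far below the other rater's; a spread near zero for every person, regardless of item, suggests one attribute is carrying over to the others.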
4. Response Sets.
The ratings are not related to the ability of the subjects.
Anchor all persons at the same ability, usually 0. Raters who best fit this situation are most likely to be exhibiting response sets.
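A rough proxy for that anchored analysis, with hypothetical data: a rater whose ratings are essentially uncorrelated with the consensus ordering of the persons is a response-set suspect. Facets does this properly by anchoring all persons at the same ability and inspecting fit; the 0.2 correlation threshold below is an illustrative assumption.

```python
# Hypothetical data: each rater's ratings of the same five persons, in order.
raters = {
    "R1": [2, 3, 4, 5, 6],
    "R2": [3, 3, 4, 6, 6],
    "R3": [4, 4, 4, 4, 4],  # same rating regardless of person: response set?
}

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0  # constant ratings -> 0

def response_set_suspects(raters, threshold=0.2):
    """Flag raters whose ratings do not track the other raters' consensus."""
    suspects = []
    for name, own in raters.items():
        others = [r for n, r in raters.items() if n != name]
        consensus = [sum(col) / len(col) for col in zip(*others)]
        if abs(pearson(own, consensus)) < threshold:
            suspects.append(name)
    return suspects
```

R3's ratings carry no information about the persons, so R3 is flagged while R1 and R2, whose ratings track the consensus, are not.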
5. Playing it safe.
The rater attempts to give a rating near the other raters, rather than independently.
Specify the Inter-rater= facet and monitor the "agreement" percentage. The rater also tends to overfit.
6. Instability.
Rater leniency changes from situation to situation.
Include the "situation" as a dummy facet (e.g., rating session), and investigate rater-situation interactions using "B" in the Models= statements.
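A quick descriptive look at the same question, with hypothetical sessions: compare one rater's mean rating per situation against that rater's overall mean. Facets tests this formally through the rater-situation bias terms ("B" in Models=); this sketch only shows what such drift looks like in the raw ratings.

```python
# Hypothetical: one rater's ratings in two scoring sessions.
sessions = {
    "morning":   [4, 5, 4, 5],
    "afternoon": [2, 3, 2, 3],  # same rater, noticeably harsher later on
}

def session_drift(sessions):
    """Per-session mean rating minus the rater's overall mean rating."""
    all_ratings = [r for s in sessions.values() for r in s]
    overall = sum(all_ratings) / len(all_ratings)
    return {name: round(sum(r) / len(r) - overall, 2)
            for name, r in sessions.items()}
```

A full score-point of drift between sessions, as here, would usually be worth a formal rater-session bias analysis.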
Help for Facets Rasch Measurement and Rasch Analysis Software: www.winsteps.com Author: John Michael Linacre.