Four-facet rating scale with bias analysis: Essays
H.I. Braun conducted a careful grading study of English essays written for the Advanced Placement Program (Understanding score reliability: Experiments in calibrating essay readers. Journal of Educational Statistics, Spring 1988, 13/1, 1-18). The four facets are examinee, essay topic, reader, and grading session. This analysis reports bias interactions between essay topic and reader. It also reports interactions between reader and grading session. Grading session is entered as a "dummy" facet, with all its elements anchored at 0, so it is used only for the interaction analysis.
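As a sketch of what the four-facet model statement expresses, the standard many-facet rating scale form can be written as below. The symbols are labels chosen here for illustration and do not appear in the specification file: B_n is the measure of examinee n, D_i the difficulty of essay topic i, C_j the severity of reader j, S_s the session element (anchored at 0, so it drops out), and F_k the threshold between rating categories k-1 and k.

```latex
\log\!\left(\frac{P_{nijsk}}{P_{nijs(k-1)}}\right) = B_n - D_i - C_j - S_s - F_k,
\qquad S_s = 0 \text{ for all sessions},
\qquad k = 2, \dots, 9
```

Only the examinee facet enters with a positive sign, matching positive = 1. Because every session element is anchored at 0, the session facet does not change the measures and serves only to define the reader by session bias terms.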
Facets specifications and data (in file Essays.txt):
; this is file essays.txt
title = AP English Essays (College Board/ETS)
convergence = 0.1 ; size of largest remaining marginal score residual at convergence
unexpected = 3.0 ; size of smallest standardized residual to report
arrange = M ; arrange output tables in Measure ascending order
facets = 4 ; there are 4 facets in this analysis
noncenter = 1 ; examinee facet floats
positive = 1 ; for examinees, greater score = greater measure
Inter-rater = 3 ; facet 3 is the rater facet
usort = 2,3,1 ; sort residuals by 2=Essay, 3=Reader, 1=Examinee
Model=
?,?B,?B,?,R9 ; observations are ratings in range 1-9.
; look for interaction/bias between essay topic and reader
?,?,?B,?B,R9 ; look for reader x grading session interaction
*
Labels=
1,examinee
1-32 ; 32 otherwise anonymous examinees
*
2,Essay
1,A ; 3 essays
2,B
3,C
*
3,Reader
1-12 ; 12 otherwise anonymous readers
*
4,Session,A ; this is a dummy facet, used only for investigating interactions
11,day 1 time 1 ,0 ; 8 sessions - all anchored at 0
12,day 1 time 2 ,0
21,day 2 time 1 ,0
22,day 2 time 2 ,0
31,day 3 time 1 ,0
32,day 3 time 2 ,0
41,day 4 time 1 ,0
42,day 4 time 2 ,0
*
data =
05,1,1,11,4 ; first rating: examinee 5, essay 1, reader 1, session 11, rating of 4
| ; more data
09,1,1,11,3 ; last rating
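The data layout described in the comments above (examinee, essay, reader, session, rating, followed by an optional ";" comment) can be checked before running an analysis. The following Python sketch is not part of Facets; the names Rating and parse_rating are hypothetical, and it assumes every data line carries exactly five comma-separated integer fields.

```python
from typing import NamedTuple, Optional


class Rating(NamedTuple):
    """One observation: element numbers for facets 1-4, then the rating."""
    examinee: int
    essay: int
    reader: int
    session: int
    rating: int


def parse_rating(line: str) -> Optional[Rating]:
    """Parse 'examinee,essay,reader,session,rating ; comment'.

    Returns None for blank or comment-only lines.
    """
    body = line.split(";", 1)[0].strip()  # drop the trailing ';' comment
    if not body:
        return None
    fields = [int(f) for f in body.split(",")]  # int("05") -> 5
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    record = Rating(*fields)
    if not 1 <= record.rating <= 9:  # the model declares R9: ratings 1-9
        raise ValueError(f"rating out of range 1-9: {line!r}")
    return record


# The two data lines shown in the specification above.
sample = [
    "05,1,1,11,4 ; first rating",
    "09,1,1,11,3 ; last rating",
]
ratings = [r for r in map(parse_rating, sample) if r is not None]
print(ratings[0])
```

A check like this catches miscounted fields or out-of-range ratings early; the element numbers themselves would still need to match the Labels= lists.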
Help for Facets Rasch Measurement Software: www.winsteps.com.