Four-facet rating scale with bias analysis: Essays
A careful grading of English essays written for the Advanced Placement Program was conducted by H.I. Braun (Understanding score reliability: Experiments in calibrating essay readers. Journal of Educational Statistics, Spring 1988, 13/1, 1-18). The four facets are examinee, essay topic, reader, and grading session. This analysis reports bias interactions between essay topic and reader. It also reports interactions between reader and grading session. Grading session is entered as a "dummy" facet, with all of its elements anchored at 0, so that it is used only for investigating interactions.
Facets specifications and data (in file Essays.txt):
; this is file essays.txt
title = AP English Essays (College Board/ETS)
convergence = 0.1 ; size of largest remaining marginal score residual at convergence
unexpected = 3.0 ; size of smallest standardized residual to report
arrange = M ; arrange output tables in Measure ascending order
facets = 4 ; there are 4 facets in this analysis
noncenter = 1 ; examinee facet floats
positive = 1 ; for examinees, greater score = greater measure
Inter-rater = 3 ; facet 3 is the rater facet
usort = 2,3,1 ; sort residuals by 2=Essay, 3=Reader, 1=Examinee
Model=
?,?B,?B,?,R9 ; observations are ratings in range 1-9.
             ; look for interaction/bias between reader and essay type
?,?,?B,?B,R9 ; look for rater x grading session interaction
*
Labels=
1,examinee
1-32 ; 32 otherwise anonymous examinees
*
2,Essay
1,A ; 3 essays
2,B
3,C
*
3,Reader
1-12 ; 12 otherwise anonymous readers
*
4,Session,A ; this is a dummy facet, used only for investigating interactions
11,day 1 time 1 ,0 ; 8 sessions - all anchored at 0
12,day 1 time 2 ,0
21,day 2 time 1 ,0
22,day 2 time 2 ,0
31,day 3 time 1 ,0
32,day 3 time 2 ,0
41,day 4 time 1 ,0
42,day 4 time 2 ,0
*
data =
05,1,1,11,4 ; first rating: examinee 5, essay 1, reader 1, session 11, rating of 4
| ; more data
09,1,1,11,3 ; last rating
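Each data line in the specification above holds five comma-separated values: examinee, essay, reader, session and rating, optionally followed by a ";" comment. As a rough illustration of that layout, the Python sketch below reads one such line outside of Facets. It is not part of Facets. The names Rating, parse_rating and session_day_time are hypothetical helpers introduced only for this example, and the day/time decoding assumes the two-digit session codes shown in the Labels= list (11 = day 1 time 1).

```python
from typing import NamedTuple, Tuple


class Rating(NamedTuple):
    """One observation: the four facet element numbers plus the rating."""
    examinee: int
    essay: int
    reader: int
    session: int
    rating: int


def parse_rating(line: str) -> Rating:
    """Parse a data line such as '05,1,1,11,4 ; first rating'."""
    # Everything after the first ';' is a comment and is ignored.
    body = line.split(";", 1)[0].strip()
    fields = [int(field) for field in body.split(",")]
    if len(fields) != 5:
        raise ValueError(f"expected 5 comma-separated values, got {len(fields)}")
    obs = Rating(*fields)
    # The model statement ends in R9, and the comment gives the range as 1-9.
    if not 1 <= obs.rating <= 9:
        raise ValueError(f"rating {obs.rating} is outside the range 1-9")
    return obs


def session_day_time(session: int) -> Tuple[int, int]:
    """Split a session code into (day, time), assuming codes like 11 or 42."""
    return divmod(session, 10)
```

For example, parse_rating("05,1,1,11,4") gives examinee 5, essay 1, reader 1, session 11 and a rating of 4, and session_day_time(11) gives day 1, time 1.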
Help for Facets Rasch Measurement Software: www.winsteps.com.