Four-facet rating scale with bias analysis: Essays


A carefully conducted grading of English essays written for the Advanced Placement Program was reported by H.I. Braun (Understanding score reliability: Experiments in calibrating essay readers. Journal of Educational Statistics, Spring 1988, 13/1, 1-18). The four facets are examinee, essay topic, reader, and grading session. This analysis reports bias interactions between essay topic and reader. It also reports reader-by-grading-session interactions. Grading session is entered as a "dummy" facet, with all of its elements anchored at 0, so it is used only for investigating interactions.
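The page does not write out the measurement model, so the following is only a sketch of the usual many-facet rating scale form that a specification of this kind corresponds to. The symbols (B, D, C, S, F and the subscripts) are chosen here for illustration and do not come from the source. The signs assume that only the examinee facet is positively oriented, and the session term is 0 for every element because that facet is anchored at 0.

```latex
% Assumed notation (not from the source):
%   B_n = ability of examinee n            D_i = difficulty of essay topic i
%   C_j = severity of reader j             S_m = measure of session m (anchored: S_m = 0)
%   F_k = threshold between rating categories k-1 and k
\ln\!\left(\frac{P_{nijmk}}{P_{nijm(k-1)}}\right)
  = B_n - D_i - C_j - S_m - F_k ,
  \qquad k = 2, \dots, 9
```

Under this reading, a bias/interaction analysis asks whether the ratings for a particular pairing, such as one reader on one essay topic, depart systematically from what these main-effect terms predict.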

 

Facets specifications and data (in file Essays.txt):

 

; this is file Essays.txt

title = AP English Essays (College Board/ETS)

convergence = 0.1 ; size of largest remaining marginal score residual at convergence

unexpected = 3.0 ; size of smallest standardized residual to report

arrange = M ; arrange output tables in Measure ascending order

facets = 4 ; there are 4 facets in this analysis

noncenter = 1 ; examinee facet floats

positive = 1 ; for examinees, greater score = greater measure

Inter-rater = 3 ; facet 3 is the rater facet

usort = 2,3,1 ; sort residuals by 2=Essay, 3=Reader, 1=Examinee

Model=

?,?B,?B,?,R9 ; observations are ratings in range 1-9.

 ; look for interaction/bias between essay topic and reader

?,?,?B,?B,R9 ; look for reader x grading session interaction

*

Labels=

1,examinee

1-32 ; 32 otherwise anonymous examinees

*

2,Essay

1,A ; 3 essays

2,B

3,C

*

3,Reader

1-12 ; 12 otherwise anonymous readers

*

4,Session,A ; this is a dummy facet, used only for investigating interactions

11,day 1 time 1 ,0 ; 8 sessions - all anchored at 0

12,day 1 time 2 ,0

21,day 2 time 1 ,0

22,day 2 time 2 ,0

31,day 3 time 1 ,0

32,day 3 time 2 ,0

41,day 4 time 1 ,0

42,day 4 time 2 ,0

*

data =

05,1,1,11,4 ; first rating: examinee 5, essay 1, reader 1, session 11, rating of 4

| ; more data

09,1,1,11,3 ; last rating
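Each data line above lists the four facet element numbers followed by the rating, with an optional comment after the semicolon. As a rough illustration of that layout, here is a minimal Python sketch that reads such lines outside of Facets. The function name parse_rating_line and the dictionary keys are hypothetical, and the sketch assumes every data line has exactly five comma-separated integer fields. It is not how Facets itself reads the file.

```python
def parse_rating_line(line):
    """Parse one data line into a dict; return None for anything that is not a rating."""
    # Text after ';' is a comment in the specification file, so discard it.
    body = line.split(";", 1)[0].strip()
    if not body:
        return None
    fields = [f.strip() for f in body.split(",")]
    # Assumption: examinee, essay, reader, session, rating, all written as integers.
    if len(fields) != 5 or not all(f.isdigit() for f in fields):
        return None  # e.g. the "|" line that stands for omitted data
    examinee, essay, reader, session, rating = (int(f) for f in fields)
    return {
        "examinee": examinee,
        "essay": essay,
        "reader": reader,
        "session": session,
        "rating": rating,
    }


# Only the lines shown on this page; the omitted data are not reconstructed.
lines = [
    "05,1,1,11,4 ; first rating: examinee 5, essay 1, reader 1, session 11, rating of 4",
    "| ; more data",
    "09,1,1,11,3 ; last rating",
]
ratings = [r for r in (parse_rating_line(l) for l in lines) if r is not None]
for r in ratings:
    print(r)
```

Note that the leading zero in "05" is dropped on conversion, so examinee 05 becomes examinee 5, matching the comment on the first data line.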


Help for Facets Rasch Measurement Software: www.winsteps.com.