Table 14 is the pairwise bias report
This Table compares the local (biased?) relative measure of one element with the local relative measure of another element.
If Table 14 is long, it can only be output from the Output Tables menu. "Request Table 14 from the Output Tables menu."
Zscore=, Xtreme= and Arrange= control this Table. This Table presents the bias/interaction information in a pairwise format. It has the same information as Table 13. Table 14 contains several subtables, each containing the same information, but from different perspectives. Depending on how the Tables are conceptualized, they quantify item bias or differential item functioning (DIF), differential person functioning (DPF), differential rater functioning (DRF), differential task functioning (DTF), etc. Item bias and DIF are the same thing. The widespread use of "item bias" dates to the 1960s, "DIF" to the 1980s.
Here are two subtables. Brahe, a judge, has rated Edward relatively low, and Cavendish, another judge, has rated Edward relatively high:
Table 14.3.1.1 Bias/Interaction Pairwise Report
---------------------------------------------------------------------------------------------------------------
| Target | Target Obs-Exp Context | Target Obs-Exp Context | Target Joint |
| N Junior | Measr S.E. Average N Senior sc | Measr S.E. Average N Senior sc |Contrast S.E. t d.f. Prob. |
---------------------------------------------------------------------------------------------------------------
| 4 David | .25 .29 1.54 2 Brahe | -1.05 .35 -.46 3 Cavendish | 1.30 .45 2.86 8 .0211 |
David, the target examinee, is rated 1.54 score points high (.25 logits) by Brahe, and .46 score points low (-1.05 logits) by Cavendish. So David's perceived difference in performance (.25 - -1.05) is 1.30 logits, which has a p=.02 probability of happening by chance.
| 5 Edward | -.84 .36 -2.60 2 Brahe | 1.11 .36 2.00 3 Cavendish | -1.96 .51 -3.85 8 .0049 |
Edward, the target examinee, is rated 2.60 score points low (-.84 logits) by Brahe, and 2.00 score points high (1.11 logits) by Cavendish. So Edward's perceived difference in performance (-.84 - 1.11) is -1.96 logits (the table value, computed from unrounded measures), which has a p<.01 probability of happening by chance.
Target = Element for which the bias or interaction is to be compared in two contexts
N = Element number
Facet name heading with element name beneath
In one Context:
Target Measr = local measure of element in this context (includes bias) = overall measure for target (Table 7 or Table 13) + bias (Table 13)
Target S.E. = precision of local ability estimate
Obs-Exp Average = the average difference between the observed and the expected (no bias) ratings for the Target element in this context.
Context
N = Element number
Facet name heading with element name beneath
In the other Context:
Target Measr = local measure of element in this context (includes bias) = overall measure for target (Table 7 or Table 13) + bias (Table 13)
Target S.E. = precision of local ability estimate
Obs-Exp Average = the average difference between the observed and the expected (no bias) ratings for the Target element in this context.
Context
N = Element number
Facet name heading with element name beneath
Target Contrast = difference between the Target Measures in the two Contexts
Joint S.E. = standard error of the difference
t = Student's t-statistic of Contrast / S.E.
d.f. = degrees of freedom of t-statistic (approximate)
Prob. = probability of t-statistic assuming t, d.f. are exact.
Here, the judges are the Contexts. They compare their perceptions of the Targets, the examinees, i.e., the bias is interpreted as Differential Person Functioning, DPF. The first line reads: Brahe (a judge, the Context) perceives David (an examinee, the Target) to have a local ability measure of .25 logits (= -.46 David's overall measure from Table 7 + .71 David's local bias size from Table 13) with a precision of .29 logits, corresponding to David performing 1.54 score points per observation better than expected. Cavendish (a judge, the Context) perceives David (an examinee, the Target) to have an ability measure of -1.05 logits, performing .46 score points per observation worse than expected. The ability difference is 1.30 logits. Statistically, this difference has a t of 2.86 with 8 d.f., i.e., p=.02 for a two-sided t-test.
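The arithmetic behind the Contrast, Joint S.E., t and Prob. columns can be sketched in a few lines of Python. This is an illustrative sketch, not the Facets implementation. It assumes the Joint S.E. is the square root of the sum of the two squared S.E.s, which reproduces the .45 and .51 shown above, and it takes the d.f. directly from the Table rather than computing it. The function names are hypothetical, and the p-value comes from a simple numerical integration of the Student's t density.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the area between -|t| and |t| (Simpson's rule)."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3  # area from 0 to |t|
    return 1 - 2 * half_area

def pairwise_contrast(measure1, se1, measure2, se2, df):
    """Contrast, joint S.E., t and two-sided p for one Target in two Contexts."""
    contrast = measure1 - measure2
    joint_se = math.sqrt(se1 ** 2 + se2 ** 2)  # assumed formula, see lead-in
    t = contrast / joint_se
    return contrast, joint_se, t, two_sided_p(t, df)

# David as rated by Brahe (.25, S.E. .29) and by Cavendish (-1.05, S.E. .35), d.f. 8
contrast, joint_se, t, p = pairwise_contrast(0.25, 0.29, -1.05, 0.35, df=8)
print(f"contrast={contrast:.2f} joint_se={joint_se:.2f} t={t:.2f} p={p:.3f}")
```

Because the inputs are the rounded values printed in the Table, results for other rows may differ from the reported values in the last decimal place.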
Table 14.3.1.2 Bias/Interaction Pairwise Report
------------------------------------------------------------------------------------------------------------
| Target | Target Obs-Exp Context | Target Obs-Exp Context | Target Joint |
| N Senior sc | Measr S.E. Average N Junior | Measr S.E. Average N Junior |Contrast S.E. t d.f. Prob. |
------------------------------------------------------------------------------------------------------------
| 2 Brahe | -.48 .29 1.54 4 David | 1.50 .36 -.66 5 Edward | -1.98 .46 -4.28 8 .0027 |
| 3 Cavendish | .49 .35 -1.13 4 David | -.78 .36 3.27 5 Edward | 1.27 .50 2.55 8 .0341 |
Here, the examinees compare their perceptions of the judges, i.e., the bias is interpreted as Differential Rater Functioning, DRF. The first line reads: David (an examinee, the Context) perceives Brahe (a judge, the Target) to have a severity of -.48 logits (= .24, Brahe's overall severity in Table 7, - .71, Brahe's local leniency bias size from Table 13; these rounded values give -.47, reported as -.48 due to rounding). Edward (an examinee, the Context) perceives Brahe (a judge, the Target) to have a severity measure of 1.50 logits. The difference is -1.98 logits. Statistically, this difference has a t of -4.28 with 8 d.f., i.e., p<.01 for a two-sided t-test.
Table 13 Bias/Interaction Calibration Report
-----------------------------------------------------------------------------------------------------------
| Obsvd Exp. Obsvd Obs-Exp| Bias Model |Infit Outfit| |
| Score Score Count Average| Size S.E. t | MnSq MnSq | Sq N Senior sc measr N Junior measr |
-----------------------------------------------------------------------------------------------------------
| 25 17.3 5 1.54| .71 .29 2.43 | .3 .3 | 11 2 Brahe .24 4 David -.46 |
| 15 20.6 5 -1.13| -.58 .35 -1.69 | .5 .5 | 12 3 Cavendish -.09 4 David -.46 |
Interpretation:
Observations by Brahe for David are 1.54 score points higher than expected = .71 logits more able.
Observations by Cavendish for David are 1.13 score points lower than expected = .58 logits less able (bias size -.58).
Overall pairwise ability swing = .71 - -.58 = 1.29 logits.
Table 14.3.1.1 Bias/Interaction Pairwise Report
---------------------------------------------------------------------------------------------------------------
| Target | Target Obs-Exp Context | Target Obs-Exp Context | Target Joint |
| N Junior | Measr S.E. Average N Senior sc | Measr S.E. Average N Senior sc |Contrast S.E. t d.f. Prob. |
---------------------------------------------------------------------------------------------------------------
| 4 David | .25 .29 1.54 2 Brahe | -1.05 .35 -.46 3 Cavendish | 1.30 .45 2.86 8 .0211 |
Interpretation:
For David, the observations by Brahe are 1.54 score points higher than expected. This corresponds to Brahe (context) perceiving David (target) to have an ability of "David + Bias" = -.46 + .71 = .25.
For David, the observations by Cavendish are .46 score points lower than expected. This corresponds to Cavendish (context) perceiving David (target) to have an ability of "David + Bias" = -.46 + -.58 = -1.04, reported as -1.05 due to rounding.
Overall pairwise ability swing = .25 - -1.05 = 1.30 logits
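The link between Table 13 and Table 14 can be checked with a short Python sketch. It is only an illustration using the rounded values printed above, with hypothetical function names. It assumes the sign convention described in this section: the bias is added to an examinee's overall ability, and the same bias is subtracted from a judge's overall severity when the judge is the Target.

```python
def local_ability(overall_ability, bias):
    """Examinee as Target: local measure = overall ability + bias."""
    return overall_ability + bias

def local_severity(overall_severity, bias):
    """Judge as Target: local severity = overall severity - bias (leniency)."""
    return overall_severity - bias

david = -0.46  # David's overall measure (Table 7 or Table 13)

by_brahe = local_ability(david, 0.71)       # .25, as in Table 14.3.1.1
by_cavendish = local_ability(david, -0.58)  # -1.04, reported as -1.05
swing = by_brahe - by_cavendish             # 1.29, reported as 1.30

brahe_with_david = local_severity(0.24, 0.71)  # -.47, reported as -.48

print(f"{by_brahe:.2f} {by_cavendish:.2f} {swing:.2f} {brahe_with_david:.2f}")
```

The small differences from the reported values (-1.04 against -1.05, 1.29 against 1.30, -.47 against -.48) arise because the Tables are computed from unrounded estimates.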
Example: When "higher score = higher measure" and Bias = Ability
Target means "apply the bias to the measure of this element"
Context means "the bias observed when the target element interacts with the context element"
Obsvd  Exp.   Bias     Model
Score  Score  Measure  S.E.   Context  measr  Items  measr
186    177     .19     .14    red      -.61   tulip  -.42
234    243    -.15     .13    green     .61   tulip  -.42

Bias                                    | target = tulip
Measure  Context  measr  Items  measr   | bias   overall          Context
 .19     red      -.61   tulip  -.42    |  .19 + -.42 = -.23      relative to red
-.15     green     .61   tulip  -.42    | -.15 + -.42 = -.57      relative to green
Effect of imprecision in element estimates
This computation treats the element measures as point estimates (i.e., exact). You can inflate the reported standard errors to allow for the imprecision in those measures. Formula 29 of Wright and Panchapakesan (1969), www.rasch.org/memo46.htm, applies. You will see there that, for dichotomies, the most by which imprecision in the baseline measures can inflate the variance is 25%. So, if you multiply the S.E.s reported in this Table by sqrt(1.25) = 1.12 (and divide the t by 1.12), then you will be as conservative as possible in computing the bias significance.
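As a minimal sketch of this conservative adjustment, the following Python applies the 25% maximum variance inflation to the Joint S.E. and t reported for David in Table 14.3.1.1. The function name and the default inflation argument are illustrative assumptions, not Facets settings.

```python
import math

def conservative(se, t, variance_inflation=1.25):
    """Inflate an S.E. and deflate its t by sqrt(variance_inflation)."""
    factor = math.sqrt(variance_inflation)  # sqrt(1.25), about 1.12
    return se * factor, t / factor

# Joint S.E. = .45 and t = 2.86 for David in Table 14.3.1.1
se_adj, t_adj = conservative(0.45, 2.86)
print(f"adjusted S.E. = {se_adj:.2f}, adjusted t = {t_adj:.2f}")
```

The adjusted t would then be referred to the t-distribution with the same d.f. to obtain the conservative probability.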
Help for Facets Rasch Measurement Software: www.winsteps.com.