Mantel and Mantel-Haenszel DIF statistics


Differential item functioning (DIF) can be investigated using log-odds estimators: Mantel-Haenszel (1959) for dichotomies or Mantel (1963) for polytomies. The sample is divided into different classification groups (also called reference groups and focal groups), which are shown in Table 30 and specified with DIF=. The sample is then sliced into strata by ability measure (equivalent to raw score for complete data).

 

M-H and the t-tests in Winsteps should produce the same results, because they are based on the same logit-linear theory. In practice, however, M-H will be more accurate if the data are complete and there are large numbers of subjects at every score level, which allows so-called "thin" matching. Under other circumstances, M-H may not be estimable, or must use grouped-score "thick" matching, in which case the t-test method will probably be more accurate. Similar conclusions can also be inferred from http://www.eric.ed.gov/ERICWebPortal/custom/portlets/recordDetails/detailmini.jsp?_nfpb=true&_&ERICExtSearch_SearchValue_0=ED334230&ERICExtSearch_SearchType_0=no&accno=ED334230

 

MHSLICE= controls the width of each slice, thin or thick. It specifies the width (in logits) of the slice of the latent variable to be included in each cross-tab. The lower end of the lowest slice is always the lowest observed person measure.

 

MHSLICE = 0 bypasses Mantel-Haenszel or Mantel computation.

 

MHSLICE = .1 logits and smaller. The latent variable is stratified into thin slices. This corresponds to slicing by raw scores with complete data.

 

MHSLICE = 1 logit and larger. The latent variable is stratified into thick slices.
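
The slicing rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the person measures are hypothetical, and slices are treated as half-open intervals starting at the lowest observed measure. How Winsteps assigns a measure that falls exactly on a slice boundary is not stated here.

```python
import math

def slice_index(measure, lowest, mhslice):
    """Return the 0-based slice number for a person measure (in logits)."""
    if mhslice <= 0:
        raise ValueError("MHSLICE=0 bypasses the computation; no slices are formed")
    return int(math.floor((measure - lowest) / mhslice))

def stratify(measures, mhslice):
    """Group person indices by slice; the lowest slice starts at the lowest measure."""
    lowest = min(measures)
    strata = {}
    for person, measure in enumerate(measures):
        strata.setdefault(slice_index(measure, lowest, mhslice), []).append(person)
    return strata

# Hypothetical person measures in logits (not taken from any real analysis)
measures = [-1.20, -1.15, 0.03, 0.47, 2.11]
thin = stratify(measures, 0.1)    # thin slices: most persons end up alone in a slice
thick = stratify(measures, 1.0)   # thick slices: persons are pooled into fewer strata
```

With these five measures the thin slicing produces four occupied slices and the thick slicing three, which is the trade-off between sensitivity and robustness.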

 

For each slice, a cross-tabulation is constructed for each pair of person classifications against each scored response level. An odds-ratio is computed from the cross-tab. Zero and infinite ratios are ignored. A homogeneity chi-square is also computed when possible.
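
For a dichotomous item, the cross-tab for one slice reduces to four counts. The sketch below shows the odds-ratio for a single slice and the rule that zero and infinite ratios are ignored. The function name and argument names are hypothetical, and the homogeneity chi-square is not shown.

```python
def slice_odds_ratio(a_right, a_wrong, b_right, b_wrong):
    """Odds ratio (group A relative to group B) for one slice, or None if unusable.

    a_right, a_wrong: counts of group A persons scored 1 and 0 in this slice
    b_right, b_wrong: counts of group B persons scored 1 and 0 in this slice
    """
    numerator = a_right * b_wrong
    denominator = a_wrong * b_right
    if numerator == 0 or denominator == 0:
        return None  # zero or infinite ratio: this slice is ignored
    return numerator / denominator

# A slice with all four cells occupied yields a finite, non-zero ratio
usable = slice_odds_ratio(16, 11, 5, 20)
# A slice where nobody in group A failed gives an infinite ratio, so it is skipped
skipped = slice_odds_ratio(3, 0, 2, 4)
```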

 

Thin slices are more sensitive to small changes in item difficulty across person classifications, but more persons are ignored in inestimable cross-tabs. Thick slices are more robust because fewer persons are ignored. Use the Specification pull-down menu to set different values of MHSLICE= and then produce the corresponding Table 30.

 

In principle, when the data fit the Rasch model, the Mantel and Mantel-Haenszel estimators should concur with the Rasch DIF contrast measures. The Rasch DIF contrast weights each person equally. Mantel weights each cross-tabulation equally. Thus when the DIF estimates disagree, it indicates that the DIF in the data is non-uniform with ability level.

 

Computation:

Person classification groups are A, B, ... They are compared pairwise. Starting from the lowest person measure, each slice is MHSLICE= logits wide. There are K slices up through the highest person measure. For the target item, in the kth slice and comparing classification groups A and B, with categories renumbered from 0 to simplify the computation,

 

For slice k of the target item   Counts   Summed scores    Summed squared-scores
                                          on this item     on this item
------------------------------   ------   --------------   ---------------------
Classification group A           ACk      ASk              AQk
Classification group B           BCk      BSk              BQk
Summed: Both groups              ABCk     ABSk             ABQk

 

Then the Mantel or Mantel-Haenszel DIF chi-square for the target item is:

For dichotomous items, the Mantel-Haenszel logit DIF size estimate is summed across estimable slices:

For polytomous items using adjacent, transitional, sequential odds, the Mantel logit DIF size estimate becomes:

where ACjk is the count of responses by classification group A in category j of slice k, and α is the odds-ratio.
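
In the notation of the slice table, the three quantities named above can be written out as follows. This is a sketch based on the standard published forms of the Mantel (1963) and Mantel-Haenszel (1959) statistics, not a transcription of the software's internal formulas. The first two expressions, evaluated without a continuity correction, reproduce the chi-square and log-odds quoted in the worked example on this page. The third, for polytomous items with top category m, is an assumption: it is one adjacent-category form consistent with the definition of ACjk, chosen because it reduces to the dichotomous estimator when m = 1.

```latex
% Mantel / Mantel-Haenszel chi-square (1 d.f.), summed over slices k = 1..K
\chi^2 = \frac{\left( \sum_{k=1}^{K} AS_k - \sum_{k=1}^{K} \dfrac{AC_k \, ABS_k}{ABC_k} \right)^2}
              {\sum_{k=1}^{K} \dfrac{AC_k \, BC_k \left( ABC_k \, ABQ_k - ABS_k^2 \right)}
                                    {ABC_k^2 \left( ABC_k - 1 \right)}}

% Mantel-Haenszel log-odds DIF size for a dichotomous item
\ln \alpha_{MH} = \ln \left( \frac{\sum_{k} AS_k \, (BC_k - BS_k) / ABC_k}
                                  {\sum_{k} BS_k \, (AC_k - AS_k) / ABC_k} \right)

% Assumed adjacent-category form for a polytomous item (categories j = 0..m)
\ln \alpha_{M} = \ln \left( \frac{\sum_{k} \sum_{j=1}^{m} AC_{jk} \, BC_{(j-1)k} / ABC_k}
                                 {\sum_{k} \sum_{j=1}^{m} AC_{(j-1)k} \, BC_{jk} / ABC_k} \right)
```

For a dichotomous item ABQk = ABSk, so the variance term in the denominator of the chi-square becomes ACk · BCk · ABSk · (ABCk − ABSk) / (ABCk² (ABCk − 1)), the familiar hypergeometric variance.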

 

Mantel, N. (1963) Chi-square tests with one degree of freedom: extensions of the Mantel-Haenszel procedure. J Amer Stat Assoc 58, 690-700.

Mantel, N. and Haenszel, W. (1959) Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 22, 719-748.

 

ETS DIF Category         DIF Contrast (Logits)       DIF Statistical Significance
----------------------   -------------------------   ----------------------------------
C = moderate to large    |DIF| >= 1.5/2.35 = 0.64    p( |DIF| <= 1/2.35 = 0.43 ) < .05
B = slight to moderate   |DIF| >= 1/2.35 = 0.43      p( |DIF| = 0 ) < .05
A = negligible

C-, B- = DIF against focal group
C+, B+ = DIF against reference group

ETS (Educational Testing Service) uses Delta units. 1 logit = 2.35 Delta units.

δ = (4/1.7) ln(α) ≈ 2.35 ln(α), where α is the odds-ratio.
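
The unit conversion and the size thresholds in the ETS table can be expressed as a short Python sketch. The function names are hypothetical. The two p-values are supplied by the caller, because how they are computed is not described here, and the B criterion is read as "DIF significantly different from zero", which is an interpretation. The sign suffix (+ or -) is left out because it depends on which group is treated as focal.

```python
LOGIT_TO_DELTA = 4 / 1.7   # about 2.35 Delta units per logit

def logits_to_delta(dif_logits):
    """Convert a DIF size in logits to ETS Delta units."""
    return LOGIT_TO_DELTA * dif_logits

def ets_category(dif_logits, p_zero, p_within_043):
    """Classify DIF as 'A', 'B' or 'C' using the thresholds in the ETS table.

    p_zero:       probability for the hypothesis that |DIF| = 0
    p_within_043: probability for the hypothesis that |DIF| <= 0.43 logits
    """
    size = abs(dif_logits)
    if size >= 1.5 / 2.35 and p_within_043 < 0.05:
        return "C"
    if size >= 1 / 2.35 and p_zero < 0.05:
        return "B"
    return "A"

# Hypothetical inputs: a 0.70 logit DIF that is significantly non-zero,
# but not significantly larger than 0.43 logits, is classified as B.
example = ets_category(0.70, p_zero=0.001, p_within_043=0.20)
```

A DIF size has to pass both the size threshold and the significance threshold to reach a category. Large but poorly estimated DIF therefore falls back to a lower category.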

Zwick, R., Thayer, D.T., Lewis, C. (1999) An Empirical Bayes Approach to Mantel-Haenszel DIF Analysis. Journal of Educational Measurement, 36, 1, 1-28

 

Example:

+-----------------------------------------------------------------------------------------------------+
| PERSON   DIF   DIF   PERSON   DIF   DIF      DIF    JOINT                MantelHanzl ITEM           |
| CLASS  MEASURE S.E.  CLASS  MEASURE S.E.  CONTRAST  S.E.   t  d.f. Prob. Prob.  Size Number  Name   |
|-----------------------------------------------------------------------------------------------------|
| A        1.47   .28  P        2.75   .34     -1.28   .44 -2.94 104 .0041 .0040 -1.20      1 Response|
+-----------------------------------------------------------------------------------------------------+

Size of Mantel-Haenszel slice = .100 logits

 

title="MH computation"
; d.f.=1 chi=8.3052 p=0.0040
; log-odds = 1.198
codes=01
clfile=*
1 Better
0 Same
*
item1=1
name1=1
NI=1
pweight=$s9w2 ; weighting substitutes for entering multiple records
PAFILE=$S6W1 ; anchoring forces stratification (stratum 1 = F, stratum 2 = M)
DIF = $4W1 ; cross-tab by person classification in column 4, A or P
&end
Response
;234567890
END LABELS
1 FA 1  16
0 FA 1  11
1 FP 1   5
0 FP 1  20
1 MA 2  12
0 MA 2  16
1 MP 2   7
0 MP 2  19
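
The values quoted in the comment lines of this control file (chi=8.3052, log-odds = 1.198, p=0.0040) can be checked outside Winsteps. The Python sketch below applies the standard Mantel-Haenszel formulas, without a continuity correction, to the eight weighted data lines. The two strata are the anchored groups F and M and the two classification groups are A and P. The names `counts` and `mantel_haenszel` are illustrative only, and the code is a check of the arithmetic, not a description of how Winsteps is implemented.

```python
import math

# (classification group, stratum) -> (count scored 1, count scored 0),
# read from the weighted data lines of the control file
counts = {
    ("A", "F"): (16, 11), ("P", "F"): (5, 20),
    ("A", "M"): (12, 16), ("P", "M"): (7, 19),
}

def mantel_haenszel(counts, group_a="A", group_b="P"):
    """Return (chi-square, log-odds, p-value) summed over all strata."""
    strata = sorted({stratum for (_, stratum) in counts})
    observed = expected = variance = num = den = 0.0
    for s in strata:
        a1, a0 = counts[(group_a, s)]
        b1, b0 = counts[(group_b, s)]
        n_a, n_b = a1 + a0, b1 + b0
        n, n1, n0 = n_a + n_b, a1 + b1, a0 + b0
        observed += a1                                   # successes in group A
        expected += n_a * n1 / n                         # expected under no DIF
        variance += n_a * n_b * n1 * n0 / (n * n * (n - 1))
        num += a1 * b0 / n
        den += b1 * a0 / n
    chi_square = (observed - expected) ** 2 / variance   # no continuity correction
    log_odds = math.log(num / den)
    p_value = math.erfc(math.sqrt(chi_square / 2))       # chi-square with 1 d.f.
    return chi_square, log_odds, p_value

chi_square, log_odds, p_value = mantel_haenszel(counts)
print(f"chi={chi_square:.4f} log-odds={log_odds:.3f} p={p_value:.4f}")
```

This prints chi=8.3052 log-odds=1.198 p=0.0040, in agreement with the comment lines and with the Mantel-Haenszel probability in the output table. The table reports the size as -1.20, the same magnitude with the opposite sign, presumably because it is expressed in the direction of item difficulty rather than odds of success.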


Help for WINSTEPS® Rasch Measurement Software: www.winsteps.com.