Correlations: point-biserial, point-measure, residual

In Rasch analysis, we use item correlations as an immediate check that the response-level scoring makes sense. If the observed correlation is negative, something may have gone wrong (a miskeyed multiple-choice item, a reverse-worded survey item, etc.).

Beyond that sign check, correlations are difficult to interpret, so we switch over to using mean-squares. The "expected correlation" indicates when conventional rules, such as "eliminate items with point-biserials less than 0.2", are misleading.

Item correlations are difficult to interpret because they are influenced by:

1. predictability of the data

2. targeting of the item on the person sample

3. distribution of the person sample

In Rasch analysis, we are chiefly concerned about the predictability of the data when assessing item quality, so we examine the predictability directly using the mean-square statistics, rather than indirectly through the correlations.

 


 

All correlations are computed as Pearson product-moment correlation coefficients. If you wish to compute other correlations, the required data are in XFILE=, IPMATRIX=, IFILE= or PFILE=. The biserial correlation can be computed from the point-biserial.
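As an illustration of the last point, here is a minimal Python sketch of the standard textbook conversion from a point-biserial to a biserial correlation. It assumes a dichotomous item and a normally distributed latent variable underlying the dichotomy. The function name and arguments are hypothetical, and this is not a description of how any Winsteps output is computed.

```python
from math import sqrt
from statistics import NormalDist

def biserial_from_point_biserial(r_pb: float, p: float) -> float:
    """Convert a point-biserial correlation to a biserial correlation.

    r_pb: point-biserial correlation for a dichotomous (0/1) item.
    p:    proportion of persons scored 1 on the item (0 < p < 1).
    Assumption: a normal latent variable underlies the 0/1 split.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    q = 1.0 - p
    std_normal = NormalDist()        # mean 0, standard deviation 1
    z = std_normal.inv_cdf(p)        # cut point splitting the curve into p and q
    y = std_normal.pdf(z)            # height of the normal curve at the cut point
    return r_pb * sqrt(p * q) / y

# Hypothetical values: point-biserial of 0.40 on an item with 50% success.
print(round(biserial_from_point_biserial(0.40, 0.50), 3))
```

The biserial is always larger in absolute value than the point-biserial from which it is computed, and the difference grows as p moves away from 0.5.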

 

In Table 14.1 and other measure Tables:

when PTBISERIAL=Yes

PTBSE is the point-biserial correlation between the responses to this item by each person and the total marginal score by each person (omitting the response to this item). This is the "corrected point-biserial". Henrysson, S. (1963). Correction for item-total correlations in item analysis. Psychometrika, 28, 211-218.

 

when PTBISERIAL=All

PTBSA is the point-biserial correlation between the responses to this item by each person and the total marginal score by each person (including the response to this item). This is the conventional point-biserial.
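The difference between the two definitions can be seen in a small Python sketch. It assumes complete, dichotomously scored data. The data matrix and the function names are hypothetical, and the code only illustrates the definitions above; it does not reproduce Winsteps' handling of missing or polytomous responses.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def point_biserials(responses, item):
    """Return (corrected, conventional) point-biserials for one item.

    responses[n][i] is the scored (0/1) response of person n to item i.
    """
    x = [row[item] for row in responses]
    total = [sum(row) for row in responses]       # score including this item
    rest = [t - xi for t, xi in zip(total, x)]    # score omitting this item
    return pearson(x, rest), pearson(x, total)

# Hypothetical scored data: 6 persons (rows) by 4 items (columns).
responses = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]

corrected, conventional = point_biserials(responses, item=0)
print(f"corrected (PTBSE-style):    {corrected:.2f}")     # 0.25
print(f"conventional (PTBSA-style): {conventional:.2f}")  # 0.61
```

Including the item's own response in the total score inflates the correlation, which is why the corrected value is the smaller of the two here.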

 

In Table 14.3 and other distractor Tables:

when PTBISERIAL=Yes or PTBISERIAL=All

PTBSD is the distractor point-biserial correlation between the indicated response to this item (scored 1 and other responses scored 0) by each person and the total marginal score by each person.
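A minimal Python sketch of this definition follows. The option codes, the total scores and the function names are hypothetical, and the total scores are simply supplied as input, so the sketch takes no position on how they are obtained.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def distractor_point_biserials(choices, totals):
    """Correlate a 0/1 indicator for each observed option with the total scores.

    choices[n] is the option chosen by person n on one item.
    totals[n]  is the total marginal score of person n.
    Assumes at least two different options were observed.
    """
    result = {}
    for option in sorted(set(choices)):
        # Score the indicated option 1 and every other response 0.
        indicator = [1 if c == option else 0 for c in choices]
        result[option] = pearson(indicator, totals)
    return result

# Hypothetical item with options A, B, C, where B is the keyed answer.
choices = ["A", "B", "B", "C", "B", "A"]
totals = [1, 3, 4, 1, 4, 2]

ptbsd = distractor_point_biserials(choices, totals)
for option, r in ptbsd.items():
    print(option, round(r, 2))
```

In this made-up example the keyed option B correlates positively with the total score, and the distractors A and C correlate negatively, which is the pattern we hope to see.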

 

There is a closer match between Table 14.1 and Table 14.3 when PTBISERIAL=All.

 

PTBIS=Y or E (indicated by PTBSE): The formula for the point-biserial correlation rpbis for item i, with items i=1,L and persons n=1,N, is the correlation between the observation for each person on item i and the total score for each person on all the items excluding item i (and similarly for the point-biserial for each person):
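Written in LaTeX, this verbal definition corresponds to the following expression. The symbol S'ni, the score of person n omitting item i, and the bar notation for means across persons are notation assumed here for illustration.

```latex
r_{pbis,i} =
\frac{\sum_{n=1}^{N} (X_{ni} - \bar{X}_{i})\,(S'_{ni} - \bar{S}'_{i})}
     {\sqrt{\sum_{n=1}^{N} (X_{ni} - \bar{X}_{i})^{2} \;
            \sum_{n=1}^{N} (S'_{ni} - \bar{S}'_{i})^{2}}},
\qquad
S'_{ni} = \sum_{j=1,\; j \neq i}^{L} X_{nj}
```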

 

 

PTBIS=All (indicated by PTBSA):
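For the conventional point-biserial, the total score includes the response to item i. In LaTeX, with Sn as assumed notation for the full total score of person n:

```latex
r_{pbis,i} =
\frac{\sum_{n=1}^{N} (X_{ni} - \bar{X}_{i})\,(S_{n} - \bar{S})}
     {\sqrt{\sum_{n=1}^{N} (X_{ni} - \bar{X}_{i})^{2} \;
            \sum_{n=1}^{N} (S_{n} - \bar{S})^{2}}},
\qquad
S_{n} = \sum_{j=1}^{L} X_{nj}
```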

 

Other correlations: The general form of the Pearson product-moment correlation between variables X and Y is:
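In LaTeX, the standard textbook form can be written as below. The bars denote the means of the two variables, which is notation assumed here.

```latex
r_{XY} =
\frac{\sum_{n=1}^{N} (X_{n} - \bar{X})\,(Y_{n} - \bar{Y})}
     {\sqrt{\sum_{n=1}^{N} (X_{n} - \bar{X})^{2} \;
            \sum_{n=1}^{N} (Y_{n} - \bar{Y})^{2}}}
```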

 

 

where X1,..,XN are the values for one variable, and Y1,..,YN are the values for the other variable. The range is -1 to +1.

 

Under classical (raw-score) test theory conventions, point-biserial correlations should be 0.3, 0.4 or better. Under Rasch conditions, point-biserial (or point-measure) correlations should be positive, so that the item-level scoring accords with the latent variable. The size of a positive correlation is of less importance than the fit of the responses to the Rasch model, indicated by the mean-square fit statistics.

 


 

The Expected Value of a Correlation

 

Interpreting an observed value is made easier if we can compare it with its expected value. Is it much higher than expected or much lower than expected?

 

The general formula for a Pearson correlation coefficient is shown above. Suppose that Xn is Xni, the observation of person n on item i, and Yn is Bn, the ability of person n. Then the point-measure correlation is:
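Substituting these variables into the general Pearson formula gives the following LaTeX expression. The subscript "pm" and the bars for means across persons are notation assumed here.

```latex
r_{pm,i} =
\frac{\sum_{n=1}^{N} (X_{ni} - \bar{X}_{i})\,(B_{n} - \bar{B})}
     {\sqrt{\sum_{n=1}^{N} (X_{ni} - \bar{X}_{i})^{2} \;
            \sum_{n=1}^{N} (B_{n} - \bar{B})^{2}}}
```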

 

 

According to the Rasch model, the expected value of Xni is Eni and its model variance around the expectation is Wni, where Sum(Eni) = Sum(Xni) for n=1,N. Because Wni is a variance, the typical size of the residual is its square root, so an estimate of the expected value of the point-measure correlation is given by the Rasch model proposition that Xni = Eni ±√Wni. Other variance terms are much smaller.

 

 

±√Wni is a random residual. Its cross-product with any other variable is modeled to be zero. Thus
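Written out in LaTeX, one reconstruction consistent with the definitions above replaces each observation by its expectation in the cross-product and adds the model variance to the squared deviations of the observations. The notation is assumed here; with the rounded values of the worked example this expression gives about 0.48.

```latex
E(r_{pm,i}) \approx
\frac{\sum_{n=1}^{N} (E_{ni} - \bar{X}_{i})\,(B_{n} - \bar{B})}
     {\sqrt{\sum_{n=1}^{N} \left[ (E_{ni} - \bar{X}_{i})^{2} + W_{ni} \right] \;
            \sum_{n=1}^{N} (B_{n} - \bar{B})^{2}}}
```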

 

and similarly for the point-biserial correlations. Here is an example:

 

Person    Observation    Person     Expectation of    Variance
Number                   Measure    Observation
------    -----------    -------    --------------    --------
   1           1          -1.33          0.16           0.13
   2           0          -0.03          0.41           0.24
   3           0           1.33          0.73           0.20
   4           1           1.33          0.73           0.20

Observed Correlation: -0.30

Expected Correlation:  0.48
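The example can be checked with a short Python sketch. It assumes that the expected correlation is obtained by replacing each observation with its expectation in the cross-product and adding the model variance to the sum of squares for the observations, as described above. The function names are hypothetical, and the inputs are the rounded values from the table, so small differences from the reported values are to be expected.

```python
from math import sqrt

# Values from the example: one item, four persons.
X = [1, 0, 0, 1]                    # observations
B = [-1.33, -0.03, 1.33, 1.33]      # person measures
E = [0.16, 0.41, 0.73, 0.73]        # expectations of the observations
W = [0.13, 0.24, 0.20, 0.20]        # model variances of the observations

def observed_point_measure(x, b):
    """Pearson correlation between observations and person measures."""
    n = len(x)
    mean_x, mean_b = sum(x) / n, sum(b) / n
    sxb = sum((xi - mean_x) * (bi - mean_b) for xi, bi in zip(x, b))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sxb / sqrt(sxx * sbb)

def expected_point_measure(x, b, e, w):
    """Approximate expected correlation using expectations and model variances."""
    n = len(x)
    mean_x, mean_b = sum(x) / n, sum(b) / n   # Sum(Eni) = Sum(Xni) in the model
    seb = sum((ei - mean_x) * (bi - mean_b) for ei, bi in zip(e, b))
    # Expected sum of squares for the observations: modeled part plus residual part.
    sxx_model = sum((ei - mean_x) ** 2 + wi for ei, wi in zip(e, w))
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return seb / sqrt(sxx_model * sbb)

observed = observed_point_measure(X, B)
expected = expected_point_measure(X, B, E, W)
print(f"observed: {observed:.2f}")   # -0.29 from these rounded inputs; the table reports -0.30
print(f"expected: {expected:.2f}")   # 0.48
```

The observed correlation is negative even though a clearly positive correlation is expected, which flags this item for inspection.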

 

 

Disattenuated correlation coefficients

 

"Attenuated" means "reduced". "Disattenuation" means "remove the attenuation".

The observed correlation between two variables is attenuated (reduced toward zero) because the variables are measured with error. So, when we remove the measurement error (by a statistical operation), the resulting correlation is disattenuated. Disattenuated correlations are always further from zero than the observed correlations from which they are computed.

Algebraically:

{A} and {B} are the "true" values of two variables. Their true (disattenuated) correlation is

Disattenuated ("true") correlation = r(A,B)

But the observed values of the variables are measured with error {A±a}, {B±b}, so the observed correlation is

observed correlation = r(A,B) * sqrt(var(A)*var(B)) / sqrt((var(A)+a²)*(var(B)+b²)), where a² and b² are the error variances.
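The link to reliability can be made explicit. Assuming that the errors are uncorrelated with the true values and with each other, and taking reliability as the ratio of true variance to observed variance (the classical definition, assumed here), the attenuation factor is the square root of the product of the two reliabilities:

```latex
R_A = \frac{\mathrm{var}(A)}{\mathrm{var}(A) + a^{2}}, \qquad
R_B = \frac{\mathrm{var}(B)}{\mathrm{var}(B) + b^{2}}
\quad\Longrightarrow\quad
r_{\mathrm{observed}} = r(A,B)\,\sqrt{R_A \, R_B}
```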

"Disattenuation" reverses this process.

 

If the reliability of {A} is RA, and the reliability of {B} is RB, then the disattenuated correlation between {A} and {B} is:

disattenuated r(A,B) = observed r(A,B) / sqrt(RA*RB).
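A minimal Python sketch of this correction follows, dividing the observed correlation by the square root of the product of the two reliabilities (the standard correction for attenuation). The function name and the numerical values are hypothetical.

```python
from math import sqrt

def disattenuate(r_observed: float, reliability_a: float, reliability_b: float) -> float:
    """Correct an observed correlation for measurement error in both variables."""
    for rel in (reliability_a, reliability_b):
        if not 0.0 < rel <= 1.0:
            raise ValueError("reliabilities must be greater than 0 and at most 1")
    return r_observed / sqrt(reliability_a * reliability_b)

# Hypothetical values: observed r = 0.60, reliabilities 0.80 and 0.90.
print(round(disattenuate(0.60, 0.80, 0.90), 2))   # 0.71
```

If the reliabilities are underestimated, the result can fall outside the range -1 to +1. For instance, disattenuate(0.90, 0.80, 0.90) is about 1.06, which is usually read as a sign that the two variables are, within measurement error, measuring the same thing.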

 

 


Help for WINSTEPS® Rasch Measurement Software: www.winsteps.com.