
# Table 13 Bias report


This Table compares the local (possibly biased) measure of an element with its measure from the main analysis. The Zscore=, Bias=, Xtreme=, Arrange= and Juxtapose= specifications control this Table.

+------------------------------------------------------------------------------------------------------------------------+

|Observd  Expctd  Observd  Obs-Exp|  Bias  Model                    |Infit Outfit|    Senior scientists  Junior Scientis |

|  Score   Score    Count  Average|  Size   S.E.     t   d.f. Prob. | MnSq  MnSq | Sq N Senior sc  measr N Junior  measr |

|---------------------------------+---------------------------------+------------+---------------------------------------|

|    4.2     7.91     1.5    -2.47|  -1.00   .61  -1.63     1 .3496 |   .6    .6 | 14 2 Brahe        .21 5 Edward    .34 |

|---------------------------------+---------------------------------+------------+---------------------------------------|

|    7.3     7.25     1.5      .01|    .01   .56    .03             |   .6    .6 | Mean (Count: 21)                      |

|    2.1     1.25      .0      .98|    .42   .03    .72             |   .4    .4 | S.D. (Population)                     |

|    2.1     1.28      .0     1.00|    .43   .03    .73             |   .4    .4 | S.D. (Sample)                         |

+------------------------------------------------------------------------------------------------------------------------+

Fixed (all = 0) chi-square: 10.8  d.f.: 21  significance (probability): .97

--------------------------------------------------------------------------------------------------------------------------

Observd Score = raw score of the estimable responses involving these elements simultaneously, as observed in the data file.

Expctd Score = expected score based on the measures from the main analysis.

Observd Count = number of estimable responses involving these elements simultaneously.

Obs-Exp Average = (observed score - expected score) / observed count: the bias in terms of the response metric. For rater behavior, look at the "Obs-Exp Average". If it is positive, the rater is more lenient than expected in this situation. If it is negative, the rater is more severe than expected.
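As a sketch in Python, the first data line of the Table above reproduces this arithmetic (the column values are copied from the report, not computed from raw data):

```python
# Obs-Exp Average = (observed score - expected score) / observed count,
# the bias expressed in the metric of the responses.
observed_score = 4.2   # "Observd Score" from the first data line
expected_score = 7.91  # "Expctd Score"
observed_count = 1.5   # "Observd Count"

obs_exp_average = (observed_score - expected_score) / observed_count
# round(obs_exp_average, 2) is -2.47, matching the Table
```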

Bias Size = size of the bias measure in log-odds units (logits), relative to the overall measures. Only biases larger than Zscore=, or statistically significant, are listed. For clarification, compare the ranking of the Obs-Exp Average with that of the Bias Size. In this Table, larger observed scores correspond to larger Bias Sizes, i.e., higher ability, greater leniency, greater easiness. The sign of the reported bias is controlled by Bias=.

For (measure+bias), add the Bias Size to the element's "measr", or subtract the Bias Size from "measr": add for persons that are locally more able; subtract for items that are locally more difficult. The sign of the "Obs-Exp Average" indicates the direction.
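A minimal sketch of the (measure+bias) arithmetic, using the Brahe line of the Table above (which rule applies depends on whether the element is a person or an item):

```python
measr = 0.21       # Brahe's measure from the main analysis ("measr")
bias_size = -1.00  # "Bias Size" for this interaction, in logits

# Person locally more (or less) able: add the bias to the measure.
local_measure_if_person = measr + bias_size  # -0.79 logits

# Item locally more (or less) difficult: subtract the bias.
local_measure_if_item = measr - bias_size    # 1.21 logits
```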

Model S.E. = standard error of the bias estimate.

t = Student's t-statistic testing the hypothesis "There is no bias apart from measurement error". The degrees of freedom of the t-statistic are approximately "Observd Count" - 2. With many observations, the t-statistic approximates a normal distribution with mean = 0 and S.D. = 1, i.e., a z-score. The t-statistic tests the statistical significance of the size of the bias. The mean-square fit statistics do not report on whether there is bias; they report how much misfit remains in the data after the bias is removed. With the inclusion of bias terms, the model is over-parameterized, so the data are expected to overfit the model. The purpose of the mean-square fit statistics is to help you determine whether the misfit in the data is explained by the bias or is due to other causes.
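A sketch of this significance test in Python, using the first data line of the Table above. With 1 d.f. the t-distribution is the Cauchy distribution, so the two-sided probability has a closed form via atan (an illustration, not Facets' internal routine):

```python
import math

bias_size = -1.00  # first data line of the Table
model_se = 0.61
# d.f. = 1, approximately "Observd Count" - 2 as reported

t = bias_size / model_se  # about -1.64 (the Table truncates to -1.63)

# Two-sided tail probability of a t-distribution with 1 d.f. (Cauchy):
p_two_sided = 2 * (0.5 - math.atan(abs(t)) / math.pi)
# p_two_sided is about .35, in line with the reported Prob. = .3496
```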

Infit MnSq and Outfit MnSq = does the bias explain all the misfit, or is there also another source of misfit? Values are expected to be less than 1.0 because the bias explains some of the overall misfit. These statistics do not report the fit of the bias terms themselves: in effect, we are deliberately over-parameterizing the statistical model, so we expect the mean-squares to be less than 1.0 (by an unknown amount). The reported mean-squares indicate how much misfit remains after the interactions are estimated. They do not have the usual statistical properties of mean-squares (chi-squares), so their statistical significance (Zstd) is unknown.

For each facet entering into the bias calculation:

Sq = a sequence number used to reference the bias term - useful for referring to a specific line in this Table.

N = element number within the facet

Senior Sc = name of the facet; its elements are listed below it

measr = Measure of element from main analysis.

In the summary statistics,

Count = the number of modelled bias terms found in the data.

S.D. (Population) = the standard deviation if this sample is the whole population

S.D. (Sample) = the standard deviation if this sample is a random sample from the whole population
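The Population/Sample distinction is the usual n versus n - 1 divisor. In Python (with hypothetical bias sizes, not the terms behind this Table):

```python
import statistics

bias_sizes = [0.01, -0.40, 0.55, -0.16]  # hypothetical bias estimates

sd_population = statistics.pstdev(bias_sizes)  # divides by n
sd_sample = statistics.stdev(bias_sizes)       # divides by n - 1

# The sample S.D. is always larger than the population S.D.
```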

Fixed (all=0) chi-square = A test of the "fixed effect" hypothesis: "Can this set of interactions be regarded as sharing the same measure of approximately 0.0 after allowing for measurement error?" The chi-square value and degrees of freedom (d.f.) are shown. The significance is the probability that this "fixed" hypothesis is the case. This is not a test of "Can these interactions be disregarded?" Individual interactions may be large and significant. For instance, one bad tire in a production run of 1000 tires may not indicate a "statistically significant" problem in the production process, but I still don't want it on my car!
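The reported significance can be reproduced from the chi-square value and its degrees of freedom. A self-contained Python sketch, computing the upper tail of the chi-square distribution via the power series for the regularized incomplete gamma function (an illustration, not Facets' internal code):

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution,
    via the power series for the regularized lower incomplete gamma."""
    a, t = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-14 * total:
        n += 1
        term *= t / (a + n)
        total += term
    p_lower = total * math.exp(a * math.log(t) - t - math.lgamma(a))
    return 1.0 - p_lower

# Fixed (all = 0) test from the Table: chi-square 10.8 with 21 d.f.
significance = chi2_sf(10.8, 21)  # about .97, as reported
```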

The chi-squares and t-tests evaluate hypothesis tests.

Think of driving along the road.

A.) Is this road surface generally OK?

Answer: (in another dataset): Fixed (all = 0) chi-square: 20  d.f.: 21  significance (probability): .5

So we can't reject the hypothesis that the road surface is generally OK.

B.) Are there any pot-holes we should avoid (even if the road is generally OK)?

Answer: the t-test Prob. column. In the Table above, Brahe-David has p=.02: this is a "pot-hole"!

Example using earlier Facets output:

The observed score is 40. The expected score is 33.6. There are 9 observations. So, on average, Group 3 is being rated (40 - 33.6) / 9 = 0.71 score-points higher than expected by Judge J4. This corresponds to a change in judge severity of -.70 logits, i.e., less severe. The standard error is .34 logits, so the z-score (a t-test with infinite d.f.) is -2.04. The rater is significantly less severe (more lenient) at the .05 level (two-sided). To report this as a change in group ability of +.71 logits, set Bias=Positive.
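This example can be checked with a short Python sketch. The normal-approximation p-value is illustrative; note that -.70/.34 computed from the rounded values is about -2.06, while the -2.04 quoted above presumably comes from unrounded internal values:

```python
import math

observed, expected, n_obs = 40, 33.6, 9

# Average surplus per observation: 6.4 / 9 ~= 0.71 score-points
obs_exp_average = (observed - expected) / n_obs

bias, se = -0.70, 0.34  # bias size and standard error, in logits
z = bias / se           # about -2.06 from these rounded values

# Two-sided p-value from the normal approximation (t with infinite d.f.)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))
# p_two_sided is about .04: significant at the .05 level, as stated
```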

Help for Facets Rasch Measurement Software: www.winsteps.com Author: John Michael Linacre.

