Estimation methods: JMLE, PROX, WMLE, CMLE, PMLE, AMLE

The Joint Maximum Likelihood Estimation (JMLE) equations in Winsteps are similar to www.rasch.org/rmt/rmt122q.htm for dichotomies, and www.rasch.org/rmt/rmt102t.htm for polytomies, enhanced to allow estimation of both person abilities and item difficulties simultaneously.

 

The dichotomous estimation equations are implemented in the Excel spreadsheet at www.rasch.org/moulton.htm and the polytomous estimation equations are implemented in the Excel spreadsheet at www.rasch.org/poly.xls

 


 

Comparison of Estimation Methods: Linacre (2022) compares the item estimates produced by R Statistics packages (including eRm, TAM, ltm, pairwise, sirt), Winsteps and Facets using three datasets. In this comparison, the estimation methods include CMLE, JMLE, MMLE and PMLE. Each set of item estimates has its own logit scaling due to estimation method, convergence criteria, niceties of implementation and local constraints. After standardizing the logit scales, the estimates for CMLE, JMLE and MMLE coincide. Consequently, for any application in which logit scales are transformed into user-friendly scales, all these estimation methods are equivalent for the items (and probably the persons). After standardizing, PMLE estimates were generally close to the other estimates but noticeably different for some items. Upon further inspection, it was seen that the uneven use of the observations by PMLE can produce effective item p-values that differ from the raw-score p-values used by all the other estimation methods. This can bias the PMLE item estimates relative to all the other estimation methods.

 

Linacre J.M. (2022). R Statistics: survey and review of packages for the estimation of Rasch models. International Journal of Medical Education, 13, 171-175


 

Winsteps implements two methods of estimating Rasch parameters from ordered qualitative observations: JMLE and PROX. Estimates of the Rasch measures are obtained by iterating through the data. Initially all unanchored parameter estimates (measures) are set to zero. Then the PROX method is employed to obtain rough estimates. Each iteration through the data improves the PROX estimates until they are usefully good. Then those PROX estimates are the initial estimates for JMLE which fine-tunes them, again by iterating through the data, in order to obtain the final JMLE estimates. The iterative process ceases when the convergence criteria are met. These are set by MJMLE=, CONVERGE=, LCONV= and RCONV=. Depending on the data design, this process can take hundreds of iterations (Convergence: Statistics or Substance?). When only rough estimates are needed, force convergence by pressing Ctrl+F or by selecting "Finish iterating" on the File pull-down menu.

 

Extreme scores: (perfect, maximum possible scores, and zero, minimum possible scores) are dropped from the main estimation procedure. Their measures are estimated separately using EXTRSC=.

 

Missing data: most Rasch estimation methods do not require that missing data be imputed, or that there be case-wise or list-wise omission of data records with missing data. For datasets that accord with the Rasch model, missing data lower the precision of the measures and lessen the sensitivity of the fit statistics, but do not bias the measure estimates.

 

Likelihood: Using the current parameter estimates (Rasch measures), the probability of observing each data point is computed, assuming the data fit the model. The probabilities of all the data points are multiplied together to obtain the likelihood of the entire data set. The parameter estimates are then improved (in accordance with the estimation method) and a new likelihood for the data is obtained. The values of the parameters for which the likelihood of the data has its maximum are the "maximum likelihood estimates" (Ronald A. Fisher, 1922).
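As a concrete illustration of this computation for dichotomous data (not Winsteps code; the function names and data layout here are illustrative assumptions), the log of the likelihood is usually computed instead of the raw product, since multiplying many small probabilities underflows:

```python
import math

def rasch_p(theta, b):
    """Probability of success under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(b - theta))

def log_likelihood(thetas, bs, responses):
    """Sum of log-probabilities of all observed responses.

    responses[n][i] is 1 (success) or 0 (failure) for person n on item i.
    """
    ll = 0.0
    for n, theta in enumerate(thetas):
        for i, b in enumerate(bs):
            p = rasch_p(theta, b)
            ll += math.log(p if responses[n][i] == 1 else 1.0 - p)
    return ll
```

The maximum likelihood estimates are the parameter values at which this function reaches its maximum.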

 


 

JMLE "Joint Maximum Likelihood Estimation" is also called UCON, "Unconditional maximum likelihood estimation". It was devised by Wright & Panchapakesan, www.rasch.org/memo46.htm. In this formulation, the estimate of the Rasch parameter (for which the observed data are most likely, assuming those data fit the Rasch model) occurs when the observed raw score for the parameter matches the expected raw score. "Joint" means that the estimates for the persons (rows) and items (columns) and rating scale structures (if any) of the data matrix are obtained simultaneously. The iterative estimation process is described at Iteration.

 

Advantages - these are implementation dependent, and are implemented in Winsteps:

(1) independence from specific person and item distributional forms.

(2) flexibility with missing data

(3) the ability to analyze test lengths and sample sizes of any size

(4) symmetrical analysis of person and item parameters so that transposing rows and columns does not change the estimates

(5) flexibility with person, item and rating scale structure anchor values

(6) flexibility to include different variants of the Rasch model in the same analysis (dichotomous, rating scale, partial credit, etc.)

(7) unobserved intermediate categories of rating scales can be maintained in the estimation with exact probabilities.

(8) all non-extreme scores are estimable (after elimination of extreme scores and rarely-observed Guttman subsets)

(9) all persons with the same total raw scores on the same items have the same measures; all items with the same raw scores across the same persons have the same measures.

 

Disadvantages:

(11) measures for extreme (zero, perfect) scores for persons or items require post-hoc estimation.

(12) estimates are statistically inconsistent. Even with infinite data, the estimates would be slightly statistically incorrect. This is seen as estimation bias in finite samples.

(13) estimation bias, particularly with small samples or short tests, inflates the logit distance between estimates. The estimation bias is minuscule for large datasets, and almost always less than the standard error of the estimates. The measure-order of the estimates is correct. Estimation bias is easy to correct when required with STBIAS=.

(14) chi-squares reported for fit tests (particularly global fit tests) may be somewhat inflated, exaggerating misfit to the Rasch model to a very small degree.

 

Comment on (8): An on-going debate is whether measures should be adjusted up or down based on the misfit in response patterns. With conventional test scoring and Rasch JMLE, a lucky guess counts as a correct answer exactly like any other correct answer. Unexpected responses can be identified by fit statistics. With the three-parameter-logistic item-response-theory (3-PL IRT) model, the score value of an unexpected correct answer is diminished whether it is a lucky guess or due to special knowledge. In Winsteps, responses to off-target items (the locations of lucky guesses and careless mistakes) can be trimmed with CUTLO= and CUTHI=, or be diminished using TARGET=Yes.

 

Comment on (13): JMLE exhibits some estimation bias in small data sets, but this rarely exceeds the precision (model standard error of measurement, SEM) of the measures. Estimation bias is only of concern when exact probabilistic inferences are to be made from short tests or small samples. Estimation bias can be exactly corrected for paired-comparison data with PAIRED=Yes. For other data, it can be approximately corrected with STBIAS=Yes, but, in practice, this is not necessary (and sometimes not advisable).

 

JMLE estimation details:

Newton-Raphson is an all-purpose estimation procedure and relatively easy to implement, so Winsteps once used this method. However, Newton-Raphson works best when there is a clear maximum in the likelihood function. This usually happens with complete data with dichotomies, but the story is different for long rating scales, partial credit, and incomplete data. Then we can get several local maxima, and Newton-Raphson has difficulty choosing between them. This situation became more common as Winsteps was applied to datasets that were not originally envisioned. For instance, it was analysis of DNA strings that motivated the increase in items in Winsteps to 60,000.

 

The Rasch model does have a useful feature: the underlying functions are all monotonic logistic curves. We can take advantage of this to refine the estimation process. So, in Winsteps, the estimation process for each parameter is like this:

 

step 1.

current value of the parameter estimate -> compute expected score

current value of the parameter estimate + a little bit -> compute expected score

compute the logistic ogive between the two current values and the two expected scores

from the logistic ogive, predict the parameter value that matches the observed score. This is the new current value for this parameter. Do this once.

 

step 2. Then do the same thing for the next parameter and all the other parameters.

 

step 3. return to step 1 while the biggest change in any parameter value is big, or the biggest difference between any observed and expected score is big. "Big" is defined by the convergence criteria.

 

step 4. the estimates have converged. We have the best values of the parameters. Rejoice!
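The steps above can be sketched for a single person ability (a simplified illustration of the curve-fitting idea, not Winsteps' internal code; the function names and the trial step `delta` are assumptions, and the observed score must be non-extreme):

```python
import math

def expected_score(theta, bs):
    """Model-expected raw score: sum of Rasch success probabilities."""
    return sum(1.0 / (1.0 + math.exp(b - theta)) for b in bs)

def ogive_update(theta, observed, bs, delta=0.1):
    """One curve-fitting update of a person ability estimate.

    Requires 0 < observed < len(bs) (non-extreme score).
    """
    m = len(bs)  # maximum possible score on dichotomous items
    e1 = expected_score(theta, bs)
    e2 = expected_score(theta + delta, bs)
    # Express expected and observed scores as logits of their proportions.
    l1 = math.log(e1 / (m - e1))
    l2 = math.log(e2 / (m - e2))
    lo = math.log(observed / (m - observed))
    # The fitted logistic ogive is linear on the logit scale, so read off
    # the ability at which the expected score equals the observed score.
    return theta + delta * (lo - l1) / (l2 - l1)
```

When all items happen to have the same difficulty, the fitted ogive is exact and one update lands on the maximum likelihood estimate; in general, repeated updates converge on it.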

 


PROX is the Normal Approximation Algorithm devised by Cohen (1979). This algorithm capitalizes on the similar shapes of the logistic and normal ogives. It models both the persons and the items to be normally distributed. The variant of PROX implemented in Winsteps allows missing data. The form of the estimation equations is:

 Ability of person = Mean difficulty of items encountered +

  log ( (observed score - minimum possible score on items encountered) /

   (maximum possible score on items encountered - observed score) )

  * square-root ( 1 + (variance of difficulty of items encountered) / 2.9 )

 

In Winsteps, PROX iterations cease when the variance of the items encountered does not increase substantially from one iteration to the next.
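The PROX ability equation above can be written out directly for dichotomous items (a minimal sketch; the function name and arguments are assumptions, and extreme scores must be excluded):

```python
import math

def prox_ability(observed, item_difficulties):
    """PROX ability estimate (Cohen, 1979) for dichotomous items.

    observed: non-extreme raw score on the items encountered.
    item_difficulties: logit difficulties of the items encountered.
    """
    n = len(item_difficulties)  # max score = n, min score = 0 for dichotomies
    mean_d = sum(item_difficulties) / n
    var_d = sum((d - mean_d) ** 2 for d in item_difficulties) / n
    # Expansion factor adjusts for the spread of the item difficulties.
    return mean_d + math.sqrt(1.0 + var_d / 2.9) * math.log(observed / (n - observed))
```

With items of equal difficulty the variance term vanishes, and PROX reduces to the familiar log-odds of the observed proportion plus the mean item difficulty.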

 

Advantages - these are implementation dependent, and are implemented in Winsteps:

(2)-(9) of JMLE

Computationally the fastest estimation method.

 

Disadvantages

(1) Person and item measures assumed to be normally distributed.

(11)-(14) of JMLE

 


 

AMLE is Anchored Maximum Likelihood Estimation. It is also called MLE, Maximum Likelihood Estimation. The items or persons, along with the Andrich thresholds for polytomies, are anchored at pre-set, fixed measures; then the person or item measures are estimated. It is described at https://www.rasch.org/rmt/rmt122q.htm

 


Other estimation methods in common use (but not implemented in Winsteps):

 

Gaussian least-squares finds the Rasch parameter values which minimize the overall difference between the observations and their expectations, Sum((Xni - Eni)²), where the sum is over all observations, Xni is the observation when person n encounters item i, and Eni is the expected value of the observation according to the current Rasch parameter estimates. Effectively, off-target observations are down-weighted, similar to TARGET=Yes in Winsteps.

 

Minimum chi-square finds the Rasch parameter values which minimize the overall statistical misfit of the data to the model, Sum((Xni - Eni)² / Vni), where Vni is the modeled binomial or multinomial variance of the observation around its expectation. Effectively, off-target observations are up-weighted to make them less improbable.
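The two objective functions can be contrasted in a few lines (an illustrative sketch; the names are assumptions). Because the modeled variance Vni is small for off-target observations, dividing by it up-weights them, while the plain squared difference leaves them down-weighted:

```python
def objectives(observed, expected, variances):
    """Return (least-squares, minimum chi-square) objective values.

    observed: responses Xni; expected: model expectations Eni;
    variances: model variances Vni (for a dichotomy, Vni = p * (1 - p)).
    """
    least_squares = sum((x - e) ** 2 for x, e in zip(observed, expected))
    min_chi_square = sum((x - e) ** 2 / v
                         for x, e, v in zip(observed, expected, variances))
    return least_squares, min_chi_square
```

For an off-target dichotomous observation with expectation 0.9, the variance is 0.09, so its chi-square contribution is about eleven times its least-squares contribution.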

 

Gaussian least-squares and Minimum chi-square:

Advantages - these are implementation dependent:

(1)-(8) All those of JMLE.

 

Disadvantages:

(9) persons with the same total raw scores on the same items generally have different measures; items with the same raw scores across the same persons generally have different measures.

(11)-(13) of JMLE

(14) global fit tests uncertain.

 

CMLE. Conditional maximum likelihood estimation. Item difficulties are structural parameters. Person abilities are incidental parameters, conditioned out for item difficulty estimation by means of their raw scores. The item difficulty estimates are those that maximize the likelihood of the data given the person raw scores and assuming the data fit the model. The item difficulties are then used for person ability estimation using a JMLE approach.

 

Advantages  - these are implementation dependent:

(1), (6)-(9) of JMLE

(3) the ability to analyze person sample sizes of any size

(5) flexibility with item and rating scale structure anchor values

(12) statistically-consistent item estimates

(13) minimally estimation-biased item estimates

(14) exact global fit statistics

 

Disadvantages:

(2) limited flexibility with missing data

(3) test length severely limited by mathematical precision of the computer

(4) asymmetric analysis of person and item parameters so that transposing rows and columns changes the estimates

(5) no person anchor values

(11) of JMLE

(13) estimation bias of person estimates small but uncertain

 

Here is how CMLE works. Let's assume dichotomous data, no missing data and no weighting ...

 

1) start with any set of item difficulties and all person abilities are zero logits

2) count the number of people with each observed raw score (drop extreme scores)

3) compute the probability of success and failure on each item according to the Rasch model

4) for each raw score, compute the likelihood (product of probabilities) of every way of making that raw score. For instance, if there are K items, then there are K ways of making a raw score of 1. Each has K-1 failures and 1 success.

5) the expected score in each item-score cell = sum of likelihoods of response strings with success on the item for the raw score / sum of all the likelihoods for the raw score. This computation eliminates the person ability ("conditions out" the ability).

6) after computing the expected score in every cell of the item-score matrix, sum the expected scores on each item for each raw score (multiplied by the number of persons with that raw score).

7) the summed expected scores are the current expected item raw score.

8) for each item, compare the current expected item raw score with the observed item raw score, and adjust the item difficulty accordingly (using, for instance, Newton-Raphson).

9) repeat from 3) until all expected item scores approximate observed item scores.

10) with the item difficulties, compute the person abilities using AMLE.
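The conditioning step can be illustrated by brute-force enumeration for a tiny test (practical CMLE implementations use elementary symmetric functions instead; this sketch and its names are illustrative only). Given the raw score, the person ability term exp(r*theta) cancels from every response string, leaving only the item terms:

```python
import math
from itertools import combinations

def cmle_expected_scores(bs, r):
    """P(success on each item | raw score r), dichotomous Rasch model.

    Enumerates every response string with r successes on items with
    difficulties bs. The person ability cancels from the ratio, so the
    result depends only on the item difficulties ("conditions out" ability).
    """
    k = len(bs)
    eps = [math.exp(-b) for b in bs]       # item "easiness" parameters
    total = 0.0                            # sum of likelihoods for score r
    item_sums = [0.0] * k                  # likelihood mass with item i correct
    for successes in combinations(range(k), r):
        w = 1.0
        for i in successes:
            w *= eps[i]                    # exp(r*theta) cancels; omit it
        total += w
        for i in successes:
            item_sums[i] += w
    return [s / total for s in item_sums]
```

These conditional probabilities are the cell values of the item-score matrix in step 5); summing them over persons (weighted by the count at each raw score) gives the expected item raw scores of steps 6)-7). Note that they sum to r across the items, as they must.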

 

QCMLE. Quasi-Conditional Maximum Likelihood Estimation. This estimates the CMLE values from the JMLE probability matrix. The marginal totals of the CMLE and JMLE probability matrices are the same, and the cell values are similar. QCMLE gives a useful indication of the difference between JMLE and CMLE estimates, and so of the size of the estimation bias in JMLE estimates. This is usually small to negligible.

 

EAP. Expected A Posteriori estimation derives from Bayesian statistical principles. This requires assumptions about the expected parameter distribution. An assumption is usually normality, so EAP estimates are usually more normally distributed than Winsteps estimates (which are as parameter-distribution-free as possible). EAP is not implemented in Winsteps.

 

MMLE. Marginal maximum likelihood estimation. Item difficulties are structural parameters. Person abilities are incidental parameters, integrated out for item difficulty estimation by imputing a person measure distribution. The item difficulties are then used for person ability estimation using a JMLE approach.

 

Advantages  - these are implementation dependent:

(3), (6)-(9) of JMLE

(1) independence from specific item distributional forms.

(2) flexibility with missing data extends to minimal length person response strings

(5) flexibility with item and rating scale structure anchor values

(11) extreme (zero, perfect) scores for persons are used for item estimation.

(12) statistically-consistent item estimates

(13) minimally estimation-biased item estimates

(14) exact global fit statistics

 

Disadvantages:

(1) specific person distribution required

(4) asymmetric analysis of person and item parameters so that transposing rows and columns changes the estimates

(5) no person anchor values

(11) measures for extreme (zero, perfect) scores for specific persons or items require post-hoc estimation.

(13) estimation bias of person estimates small but uncertain

 

PMLE. Pairwise maximum likelihood estimation. Person abilities are incidental parameters, conditioned out for item difficulty estimation by means of pairing equivalent person observations. The item difficulties are then used for person ability estimation using a JMLE approach.
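For dichotomous items, the pairwise conditioning can be illustrated as follows: among persons who succeed on exactly one of items i and j, the person ability cancels, so the difficulty difference is estimable from the two counts alone (a minimal sketch; the function name and data layout are assumptions):

```python
import math

def pairwise_difficulty_diff(responses, i, j):
    """Estimate b_i - b_j from persons with exactly one success on items i, j.

    responses: sequence of response vectors (1 = success, 0 = failure).
    P(success on i, failure on j | one success) = exp(-b_i) / (exp(-b_i) + exp(-b_j)),
    so the person ability drops out of the count ratio.
    """
    n_ij = sum(1 for r in responses if r[i] == 1 and r[j] == 0)  # i only
    n_ji = sum(1 for r in responses if r[j] == 1 and r[i] == 0)  # j only
    return math.log(n_ji / n_ij)
```

Persons who succeed or fail on both items contribute nothing to this pair, which is the "uneven use of the observations" noted in disadvantage (15).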

 

Advantages  - these are implementation dependent:

(1), (3), (6), (7) of JMLE

(5) flexibility with item and rating scale structure anchor values

(8) all persons with the same total raw scores on the same items have the same measure

(12) statistically-consistent item estimates

 

Disadvantages:

(11) of JMLE

(2) reduced flexibility with missing data

(4) asymmetric analysis of person and item parameters so that transposing rows and columns changes the estimates

(5) no person anchor values

(8) items with the same total raw scores across the same persons generally have different measures.

(13) estimation-bias of item and person estimates small but uncertain

(14) global fit tests uncertain.

(15) uneven use of data in estimation renders standard errors and estimates less secure

 


 

Thomas Warm's (1989) Weighted Mean Likelihood Estimation (WMLE)

 

WMLE (also called WLE) estimates, reported in IFILE= and PFILE=, are usually slightly more central than Winsteps estimates. Standard MLE estimates of any type are the maximum values of the likelihood function, and so are statistical modes. Thomas Warm showed that the likelihood function is skewed, leading to an additional source of estimation bias. The mean likelihood estimate is less biased than the maximum likelihood estimate. Warm suggests an unbiasing correction that can be applied, in principle, to any MLE method, but there are computational constraints. Even when feasible, this fine-tuning appears to be smaller than the relevant standard errors and to have little practical benefit. The WMLE procedure can over-correct for the estimation bias in measures estimated from almost-extreme scores or very few observations.

 

Cohen L. (1979). Approximate Expressions for Parameter Estimates in the Rasch Model. British Journal of Mathematical and Statistical Psychology, 32, 113-120

 

Fisher R.A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London, Series A, 222, 309-368

 

Warm T.A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427-450


Help for Winsteps Rasch Measurement and Rasch Analysis Software: www.winsteps.com. Author: John Michael Linacre
