
# Decimal, percentage and continuous data

Winsteps analyzes ordinal data, not decimal data. Typical decimal data is over-precise. Its numerical precision is greater than its substantive precision. Example: I can measure and report my weight to the nearest gram, but my "true" weight has a precision of about 500 grams.

A solution to this is to discover the precision in the data empirically.

1. Dichotomize the data for each item around the median decimal value into 0 = below median, 1 = above median.

2. Analyze those data.

3. If the analysis makes sense, then dichotomize each subset of the data again, so that it is now scored 0, 1, 2, 3.

4. Analyze those data.

5. If the analysis makes sense, then dichotomize each subset of the data again, so that it is now scored 0, 1, 2, 3, 4, 5, 6, 7.

6. Analyze those data.

7. If .... (and so on).
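The repeated splitting in the steps above can be sketched outside Winsteps, for instance in Python. This is only an illustration for one item's column of values: the function name `split_by_medians` is hypothetical, and sending values exactly at the median to the upper half is an assumption, since the steps do not say where ties go.

```python
from statistics import median

def split_by_medians(values, rounds):
    """Return an ordinal category (0 .. 2**rounds - 1) for each value.

    In each round, every current subset is split at its own median:
    values below that median form the lower new category, and values
    at or above it (an assumption for ties) form the upper one.
    """
    categories = [0] * len(values)
    for _ in range(rounds):
        new_categories = [0] * len(values)
        for cat in set(categories):
            members = [i for i, c in enumerate(categories) if c == cat]
            cut = median(values[i] for i in members)
            for i in members:
                upper = 1 if values[i] >= cut else 0
                new_categories[i] = 2 * cat + upper
        categories = new_categories
    return categories

# Hypothetical decimal observations for one item
observations = [1.2, 3.4, 2.2, 5.9, 4.1, 0.7, 6.3, 2.8]
print(split_by_medians(observations, 1))  # scored 0-1
print(split_by_medians(observations, 2))  # scored 0-3
```

Each call with one more round corresponds to one more "dichotomize each subset again" step. Whether the finer scoring is kept still depends on whether the Winsteps analysis of it makes sense.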

From a Rasch perspective, the relationship between a continuous variable (such as time to run 100 meters) and a Rasch latent variable (such as physical fitness) is always non-linear. Since we do not know the form of the non-linear transformation, we chunk the continuous variable into meaningful intervals, so that the difference between the means of the intervals is greater than the background noise. With percents, the intervals are rarely smaller than 10% wide, with special intervals for 0% and 100%. These chunked data can then be analyzed with a rating-scale or partial-credit model. We can then transform back to continuous-looking output using the item characteristic curve or the test characteristic curve.
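As an illustration of chunking percents, here is a minimal Python sketch that gives 0% and 100% their own categories and groups everything in between into bands. The function name `chunk_percent`, the default width of 10, and the choice to put each band boundary (10, 20, ...) into the higher band are all assumptions made for this example.

```python
import math

def chunk_percent(pct, width=10):
    """Map a percentage in 0-100 to an ordinal category.

    0% is category 0, 100% is the top category, and values strictly
    between fall into bands that are `width` percentage points wide.
    """
    if not 0 <= pct <= 100:
        raise ValueError("percentage must be in the range 0-100")
    if pct == 0:
        return 0
    if pct == 100:
        # one category above the highest band
        return math.ceil(100 / width) + 1
    return int(pct // width) + 1

# Hypothetical percentages
print([chunk_percent(p) for p in (0, 5, 10, 47.5, 99, 100)])
```

With the default width this gives categories 0-11, which can then be analyzed as a rating scale. Wider bands are the safer starting point if the data are noisy.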

Winsteps analyzes ordinal data expressed as integers, cardinal numbers, in the range 0-254, i.e., 255 ordered categories.

Example: The data are reported as 1.0, 1.25, 1.5, 1.75, 2.0, 2.5, ......

Winsteps only accepts integer data, so multiply all the ratings by 4.

If you want the score reports to look correct, then please use IWEIGHT=

IWEIGHT=*

1-100 0.25   ; replace 100 with the number of items you have

*

Percentage and 0-100 observations:

Observations may be presented for Rasch analysis in the form of percentages in the range 0-100. These are straightforward computationally but are often awkward in other respects.

A typical specification is:

XWIDE = 3
CODES = "  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19+
+ 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39+
+ 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59+
+ 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79+
+ 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99100"
STKEEP = Yes ; to keep intermediate unobserved categories

Since it is unlikely that all percentages will be observed, the rating (or partial credit) scale structure will be difficult to estimate. Since it is even more unlikely that there will be at least 10 observations of each percentage value, the structure will be unstable across similar datasets.

It is usually better from a measurement perspective (increased person "test" reliability, increased stability) to collapse percentages into shorter rating (or partial credit) scales, e.g., 0-10, using IREFER= and IVALUE= or NEWSCORE=.

Alternatively, model the 0-100 observations as 100 binomial trials. This imposes a structure on the rating scale so that unobserved categories are of no concern. This can be done by anchoring the Rasch-Andrich thresholds at the values: Fj = C * ln(j/(101-j)), or more generally, Fj = C * ln(j / (m-j+1)) where the range of observations is 0-m. Adjust the value of the constant C so that the average mean-square is 1.0.
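The threshold values from the formula Fj = C * ln(j / (m-j+1)) can be tabulated before anchoring. The following Python sketch does only that arithmetic. The function name `binomial_thresholds` is hypothetical, and C is left as a parameter because the text says it is adjusted by trial until the average mean-square is 1.0.

```python
import math

def binomial_thresholds(m, c=1.0):
    """Return [F_1, ..., F_m] with F_j = c * ln(j / (m - j + 1)).

    m is the top observable value (observations range over 0-m);
    c is the constant C from the text, to be tuned by the analyst.
    """
    return [c * math.log(j / (m - j + 1)) for j in range(1, m + 1)]

thresholds = binomial_thresholds(100)
for j in (1, 50, 51, 100):
    print(j, round(thresholds[j - 1], 4))
```

The values are symmetric about zero (F_j = -F_(m+1-j)), so they sum to zero, and changing C only stretches or shrinks the spacing between thresholds.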

Decimal observations:

When observations are reported in fractional or decimal form, e.g., 2.5 or 3.9, multiply them by suitable multipliers, e.g., 2 or 10, to bring them into exact integer form.

Specify STKEEP=NO, if the range of observed integer categories includes integers that cannot be observed.
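A suitable multiplier can be found mechanically. The Python sketch below reads the observations as text, so that a value such as 1.25 is treated exactly, and returns the smallest whole-number multiplier together with the integer data. The function names are hypothetical, and `math.lcm` assumes Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm

def integer_multiplier(observations):
    """Smallest positive integer that makes every observation a whole number.

    Observations are decimal strings, e.g. "1.25", parsed exactly.
    """
    return lcm(*(Fraction(text).denominator for text in observations))

def to_integers(observations):
    """Return (multiplier, integer-coded observations)."""
    k = integer_multiplier(observations)
    return k, [int(Fraction(text) * k) for text in observations]

# The ratings from the example earlier on this page
print(to_integers(["1.0", "1.25", "1.5", "1.75", "2.0", "2.5"]))
```

For these ratings the multiplier is 4, matching the example above, and its reciprocal (0.25) is the weight used in IWEIGHT= when the score reports should look like the original data. The integer data here skip 9, which is the situation STKEEP=NO addresses.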

Continuous and percentage observations:

These are of two forms:

(a) Very rarely, observations are already in the additive, continuous form of a Rasch variable. Since these are in the form of the measures produced by Winsteps, they can be compared and combined with Rasch measures using standard statistical techniques, in the same way that weight and height are analyzed.

(b) Observations are continuous or percentages, but they are not (or may not be) additive in the local Rasch context. Examples are "time to perform a task", "weight lifted with the left hand". Though time and weight are reported in additive units, e.g., seconds and grams, their implications in the specific context are unlikely to be additive. "Continuous" data are an illusion. All data are discrete at some level. A major difficulty with continuous data is determining the precision of the data for this application. This indicates how big a change in the observed data constitutes a meaningful difference. For instance, time measured to .001 seconds is statistically meaningless in the Le Mans 24-hour car race - even though it may decide the winner!

To analyze these forms of data, segment them into ranges of observably different values. Identify each segment with a category number, and analyze these categories as rating scales. It is best to start with a few, very wide segments. If these produce good fit, then narrow the segments until no more statistical improvement is evident. The general principle is: if the data analysis is successful when the data are stratified into a few levels, then it may be successful if the data are stratified into more levels. If the analysis is not successful at a few levels, then more levels will merely be more chaotic. Signs of increasing chaos are increasing misfit, category "average measures" no longer advancing, and a reduction in the sample "test" reliability.

May I suggest that you start by stratifying your data into 2 levels? (You can use Excel to do this.) Then analyze the resulting 2-category data. Is a meaningful variable constructed? If the analysis is successful (e.g., average measures per category advance with reasonable fit and sample reliability), you could try stratifying into more levels.

Example 1: My 20 items are in an Excel file: 0-0.5 for dichotomies and 0-0.5-1.0 for partial credit items. What should I do?

i) Winsteps only analyzes integers, so multiply all the items by 2 in Excel.

ii) In order for the raw scores to look right, specify in Winsteps:
IWEIGHT=*
1-20 0.5
*
Do this to verify that the data are being transformed correctly.

iii) But in the original data a dichotomous item is scored 0 - 0.5, instead of the usual 0 - 1. So, in order for the Rasch measures to have the correct standard errors and the standardized fit statistics to be valid,
omit IWEIGHT=

iv) Yes, for these data, use the partial credit model, which will also process the dichotomous items correctly:
ISGROUPS=0

Example 2: My dataset contains negative numbers such as "-1.60", as well as positive numbers such as "2.43". The range of potential responses is -100.00 to +100.00.

Winsteps expects integer data, where each advancing integer indicates one qualitatively higher level of performance (or whatever) on the latent variable. The permitted levels are 0-254, i.e., at most 255 levels. There are numerous ways in which data can be recoded. One is to use Excel. Read your data file into Excel. Its "Text to columns" feature in the "Data" menu may be useful. Then apply a transformation to the responses, for instance,

recoded response = integer ( (observed response - minimum response)*100 / (maximum response - minimum response) )

This yields integer data in the range 0-100, i.e., 101 levels. Set the Excel column width, and "Save As" the Excel file in ".prn" (formatted text) format. Or you can do the same thing in SAS or SPSS and then use the Winsteps SAS/SPSS menu.
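The same transformation can be written as a short Python sketch for anyone not using Excel, SAS or SPSS. The function name `recode` and its default arguments are hypothetical. The defaults use the potential range of -100.00 to +100.00 from this example rather than the observed minimum and maximum, which is an assumption.

```python
def recode(observed, minimum=-100.0, maximum=100.0, levels=100):
    """Rescale an observation to an integer in 0..levels.

    Implements: integer((observed - minimum) * levels / (maximum - minimum)).
    The fraction is dropped, as with Excel's INT() for non-negative values.
    """
    if not minimum <= observed <= maximum:
        raise ValueError("observation outside the stated range")
    return int((observed - minimum) * levels / (maximum - minimum))

# The two responses quoted in the question, plus the extremes
for response in (-100.0, -1.60, 2.43, 100.0):
    print(response, recode(response))
```

Because the fraction is dropped rather than rounded, only the maximum response itself reaches the top level, 100.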

Example 3: We want to construct Rasch measures from the values of the indicators to produce a 'ruler'.

There are two approaches to this problem, depending on the meaning of the values:

1. If you consider that the values of the indicators are equivalent to "item difficulties", in the Rasch sense, then it is a matter of finding out their relationship to logits. For this, one needs some ordinal observational data for the indicators. Calibrate the observational data, then cross-plot the resulting indicator measures against their reference values. The best-fit line or simple curve gives the reference value to logit conversion.

or 2. If the values are the observations (like weights and heights), then it is a matter of transforming them into ordinal values, and then performing a Rasch analysis on them. The approach is to initially score the values dichotomously high-low (1-0) and see if the analysis makes sense. If so, stratify the values into 4: 3-2-1-0. If the results still make good sense, then into 6, then into 8, then into 10. At about this point, the random noise in the data will start to overwhelm the categorization so that there will be empty categories and many "category average measures" out of sequence. So go back to the last good analysis. The model ICC will give the relationship between values and logits.

Help for Winsteps Rasch Measurement Software: www.winsteps.com. Author: John Michael Linacre
