FORMAT= reformat data

Enables you to process awkwardly formatted data. MFORMS= is usually easier to use.


FORMAT= is rarely needed when there is one data line per person.


Place the data in a separate DATA= file. The Winsteps screen file then shows the first record before and after FORMAT=. The formatted data records can also be displayed from the Edit menu, Formatted Data=.


Control instructions to pick out every other character for 25 two-character responses, then a blank, and then the person label:





This displays on the Winsteps screen:


Opening: datafile.txt

Input Data Record before FORMAT=:

         1         2         3         4         5         6         7



01xx 1x1 10002000102020202000201010202000201000200ROSSNER, MARC DANIEL

Input Data Record after FORMAT=:

1x11102012222021122021020 L

^I                      ^N^P


^I is Item1= column

^N is the last item according to NI=

^P is Name1= column


FORMAT= enables you to reformat one or more data record lines into one new line. In the new line, all the component parts of the person information are in one person-id field, and all the responses are put together into one continuous item-response string. A FORMAT= statement is required if

1) each person's responses take up several lines in your data file.

2) the length of a single line in your data file is more than 10000 characters.

3) the person-id field or the item responses are not in one continuous string of characters.

4) you want to rearrange the order of your items in your data record, to pick out sub-tests, or to move a set of connected forms into one complete matrix.

5) you only want to analyze the responses of every second, or nth, person.


FORMAT= contains up to 512 characters of reformatting instructions, contained within (..), which follow special rules. Instructions are:


nA     read in n characters starting with the current column, and then advance to the next column after them. Processing starts from column 1 of the first line, so that 5A reads in 5 characters and advances to the sixth column.


nX     skip over n columns. E.g. 5X means bypass this column and the next 4 columns.


Tc     go to column c. T20 means get the next character from column 20.
       T55 means "tab" to column 55, not "tab" past 55 columns (which is TR55).


TLc    go c columns to the left. TL20 means get the next character from the column which is 20 columns to the left of the current position.


TRc    go c columns to the right. TR20 means get the next character from the column which is 20 columns to the right of the current position.


/      go to column 1 of the next line in your data file.


n(..)  repeat the string of instructions within the () exactly n times.


,      a comma is used to separate the instructions.
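The instructions above can be illustrated with a small emulator. This is only a sketch of the rules as described here for XWIDE=1, not Winsteps' own code. The function names `expand` and `apply_format` are hypothetical, only the parenthesised instruction string is passed in (without the "FORMAT=" prefix), and columns beyond the end of a line are assumed to read as blanks.

```python
import re

# One token per instruction: n(  )  /  Tc  TLc  TRc  nA  nX  (commas are ignored)
TOKEN = re.compile(r"\d*\(|\)|/|T[LR]?\d+|\d*A|\d*X")

def expand(tokens):
    """Expand n(..) groups into a flat list of instructions."""
    out = []
    while tokens:
        tok = tokens.pop(0)
        if tok == ")":
            break
        if tok.endswith("("):
            repeat = int(tok[:-1] or 1)
            out.extend(expand(tokens) * repeat)
        else:
            out.append(tok)
    return out

def apply_format(fmt, lines):
    """Build one formatted record from the data lines of one person."""
    row, col, out = 0, 0, []          # col is the 0-based current column
    for op in expand(TOKEN.findall(fmt.upper())):
        line = lines[row] if row < len(lines) else ""
        if op == "/":                 # column 1 of the next line
            row, col = row + 1, 0
        elif op.startswith("TL"):     # c columns to the left
            col = max(col - int(op[2:]), 0)
        elif op.startswith("TR"):     # c columns to the right
            col += int(op[2:])
        elif op.startswith("T"):      # go to column c
            col = int(op[1:]) - 1
        elif op.endswith("X"):        # skip n columns
            col += int(op[:-1] or 1)
        else:                         # nA: read n characters
            for _ in range(int(op[:-1] or 1)):
                out.append(line[col:col + 1].ljust(1))
                col += 1
    return "".join(out)

# An 80-character record: 4 unused columns, 56 responses, 20-character person-id
record = "xxxx" + "ABCD" * 14 + "Zarathrustra-Xerxes".ljust(20)
print(apply_format("(4X,56A,20A)", [record]))
```

Checking a FORMAT= statement against one or two sample records in this way can help confirm that ITEM1= and NAME1= point at the intended columns of the formatted record.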


Set XWIDE=2 and you can reformat your data from original 1 or 2 column entries. Your data will all be analyzed as XWIDE=2. Then:


nA2    read in n pairs of characters starting with the current column into n 2-character fields of the formatted record. (For responses with a width of 2 columns.)


nA1    read in n 1-character columns, starting with the current column, into n 2-character fields of the formatted record.


Always use nA1 for person-id information. Use nA1 for responses entered with a width of 1-character when there are also 2-character responses to be analyzed. When responses in 1-character format are converted into 2-character field format (compatible with XWIDE=2), the 1-character response is placed in the first, left, character position of the 2-character field, and the second, right, character position of the field is left blank. For example, the 1-character code of "A" becomes the 2-character field "A ". Valid 1-character responses of "A", "B", "C", "D" must be indicated by CODES="A B C D " with a blank following each letter.
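The padding rule for 1-character responses under XWIDE=2 can be sketched as follows. The helper names `to_field` and `codes_value` are hypothetical and exist only for this illustration; the sketch simply left-justifies each code in a 2-character field, as described above.

```python
def to_field(code, xwide=2):
    """Left-justify a response code in a field of xwide characters."""
    return code.ljust(xwide)

def codes_value(codes, xwide=2):
    """Join the padded codes into the string to place after CODES=."""
    return "".join(to_field(code, xwide) for code in codes)

print(repr(to_field("A")))                        # 'A '
print(repr(codes_value(["A", "B", "C", "D"])))    # 'A B C D '
# Mixed 1-character and 2-character codes, as in Example 3
print(repr(codes_value(["A", "B", "C", "D", "01", "02", "99"])))  # 'A B C D 010299'
```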


ITEM1= must be the column number of the first item response in the formatted record created by the FORMAT= statement. NAME1= must be the column number of the first character of the person-id in the formatted record.


Example 1: Each person's data record is 80 characters long and takes up one line in your data file. The person-id is in columns 61-80. The 56 item responses are in columns 5-60. Codes are "A", "B", "C", "D". No FORMAT= is needed. Data look like:



 Without FORMAT=

   XWIDE=1 response width (the standard)

   ITEM1=5 start of item responses

   NI=56  number of items

   NAME1=61 start of name

   NAMLEN=20 length of name

   CODES=ABCD valid response codes



Reformatted record will look like:


   XWIDE=1 response width (the standard)

   FORMAT=(4X,56A,20A) skip unused characters

   ITEM1=1 start of item responses

   NI=56  number of items

   NAME1=57 start of name

   NAMLEN=20 length of name

   CODES=ABCD valid response codes


Example 2: Each data record is one line of 80 characters. The person-id is in columns 61-80. The 28 item responses are in columns 5-60, each 2 characters wide. Codes are " A", " B", " C", " D". No FORMAT= is necessary. Data look like:

xxxx C D B A C B C A A D D D D C D D C A C D C B A C C B A CZarathrustra-Xerxes

 Without FORMAT=

   XWIDE=2 response width

   ITEM1=5 start of item responses

   NI=28  number of items

   NAME1=61 start of name

   NAMLEN=20 length of name

   CODES=" A B C D" valid response codes



Columns of reformatted record:


 C D B A C B C A A D D D D C D D C A C D C B A C C B A CZarathrustra-Xerxes

   XWIDE=2 response width

   FORMAT=(4X,28A2,20A1) skip unused characters

   ITEM1=1 start of item responses in formatted record

   NI=28  number of items

   NAME1=29 start of name in "columns"

   NAMLEN=20 length of name

   CODES=" A B C D" valid response codes


Example 3: Each person's data record is 80 characters long and takes one line in your data file. Person-id is in columns 61-80. 30 1-character item responses, "A", "B", "C" or "D", are in columns 5-34, 13 2-character item responses, "01", "02" or "99", are in 35-60.


becomes on reformatting:





   XWIDE=2 analyzed response width

   FORMAT=(4X,30A1,13A2,20A1) skip unused

   ITEM1=1 start of item responses in formatted record

   NI=43  number of items

   NAME1=44 start of name

   NAMLEN=20 length of name

   CODES="A B C D 010299" valid responses

     ^ 1-character code followed by blank


Example 4: The person-id is 10 columns wide in columns 15-24. The 50 1-column item responses, "A", "B", "C", "D", are in columns 4000-4019 and then in columns 4021-4050. Data look like:


becomes on reformatting:



   NAME1=1 start of person name in formatted record

   NAMLEN=10 length of name (automatic)

   ITEM1=11 start of items in formatted record

   NI=50  50 item responses

   CODES=ABCD valid response codes


Example 5: There are five records or lines in your data file per person. There are 100 items. Items 1-20 are in columns 25-44 of first record; items 21-40 are in columns 25-44 of second record, etc. The 10 character person-id is in columns 51-60 of the last (fifth) record. Codes are "A", "B", "C", "D". Data look like:











   ITEM1=1 start of item responses

   NI=100  number of item responses

   NAME1=101 start of person name in formatted record

   NAMLEN=10 length of person name

   CODES=ABCD valid response codes
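The rearrangement described in Example 5 can be sketched in Python by slicing the stated columns directly. This illustrates the intended layout of the formatted record (100 responses followed by the 10-character person-id); it is not the FORMAT= statement itself. The function name `merge_five_lines` is hypothetical, and short lines are assumed to be padded with blanks.

```python
def merge_five_lines(lines):
    """Combine the five data lines of one person into one formatted record."""
    if len(lines) != 5:
        raise ValueError("expected exactly five lines per person")
    # Items: columns 25-44 of each of the five lines (20 responses per line)
    items = "".join(line[24:44].ljust(20) for line in lines)
    # Person-id: columns 51-60 of the fifth line
    person_id = lines[4][50:60].ljust(10)
    return items + person_id

# Made-up sample lines: 24 filler columns, 20 responses, then the rest of the line
sample = ["x" * 24 + letter * 20 + "y" * 16 for letter in "ABCD"]
sample.append("x" * 24 + "A" * 20 + "y" * 6 + "Person0001")
formatted = merge_five_lines(sample)
print(len(formatted))        # 110
print(formatted[100:])       # Person0001
```

In the merged record the responses occupy columns 1-100 and the person-id starts in column 101, which agrees with ITEM1=1, NI=100, NAME1=101 and NAMLEN=10 above.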


Example 6: There are three lines per person. In the first line from columns 31 to 50 are 10 item responses, each 2 columns wide. Person-id is in the second line in columns 5 to 17. The third line is to be skipped. Codes are "A ", "B ", "C ", "D ". Data look like:

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx A C B D A D C B A Dxxxxxxxx







 A C B D A D C B A DJoseph-Carlos



   ITEM1=1 start of item responses

   NI=10  number of items

   XWIDE=2 2 columns per response

   NAME1=11 starting "A" of person name

   NAMLEN=13 length of person name

   CODES='A B C D ' valid response codes


  If the third line is not skipped, read a redundant extra column from the last line, which is otherwise skipped. Replace the FORMAT= control variable with:

   FORMAT=(T31,10A2,/,T5,13A1,/,A1) last A1 unused


Example 7: Pseudo-random data selection

To skip every other record, use (for most situations):
FORMAT=(500A, /) ; skips every second record of two
FORMAT=(/, 500A) ; skips every first record of two


You have a file with 1,000 person records. This time you want to analyze every 10th record, beginning with the 3rd person in the file, i.e., skip two records, analyze one record, skip seven records, and so on. The data records are 500 characters long.

   XWIDE = 1

   FORMAT = (/,/,500A,/,/,/,/,/,/,/)


   XWIDE = 2

   FORMAT = (/,/,100A2,300A1,/,/,/,/,/,/,/) ; 100 2-character responses, 300 other columns
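To check which person records a skip pattern such as (/,/,500A,/,/,/,/,/,/,/) keeps, the cycle can be counted out in Python. This is a minimal sketch of the arithmetic only; the function name `selected_records` and its parameters are hypothetical.

```python
def selected_records(n_records, skip_before=2, skip_after=7):
    """Return the 1-based record numbers that are analyzed.

    Each cycle skips `skip_before` records, keeps one record,
    then skips `skip_after` records.
    """
    cycle = skip_before + 1 + skip_after
    return [i for i in range(1, n_records + 1)
            if (i - 1) % cycle == skip_before]

kept = selected_records(1000)
print(kept[:3], kept[-1], len(kept))   # [3, 13, 23] 993 100
```

With 1,000 records this keeps records 3, 13, 23, and so on up to 993, which is 100 persons in all.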


Example 8: Test A, in file EXAM10A.TXT, and Test B, in EXAM10B.TXT, are both 20-item tests. They have 5 items in common, but the distractors are not necessarily in the same order. The responses must be scored on an individual test basis. Also the validity of each test is to be examined separately. Then one combined analysis is wanted to equate the tests and obtain bankable item difficulties. For each file of original test responses, the person information is in columns 1-25, the item responses in 41-60.


The combined data file, specified in EXAM10C.TXT, is to be in RFILE= format. It contains:


Person information       30 characters (always)

Item responses           Columns 31-65 (35 bank items)


The identification of the common items is:

Test Item Number (=Location in item string)

Test A items not in Test B (Bank items 6-20): 2, 4-6, 10-20

Test B items not in Test A (Bank items 21-35): 1, 3, 7-10, 12-20


I. From Test A, make a response (RFILE=) file rearranging the items with FORMAT=.


; This file is EXAM10A.TXT


TITLE="Analysis of Test A"

RFILE=EXAM10AR.TXT ; The constructed response file for Test A



ITEM1=26  ; Items start in column 26 of reformatted record

CODES=ABCD#  ; Beware of blanks meaning wrong!

; Use your editor to convert all "wrong" blanks into another code, 

; e.g., #, so that they will be scored wrong and not ignored as missing.

KEYFRM=1  ; Key in data record format


Key 1 Record                            CCBDACABDADCBDCABBCA

BANK 1   TEST A 3 ; first item name


BANK 20  TEST A 20


Person 01 A                             BDABCDBDDACDBCACBDBA


Person 12 A                             BADCACADCDABDDDCBACA


The RFILE= file, EXAM10AR.TXT, is:


Person 01 A                   00001000010010001001

Person 02 A                   00000100001110100111


Person 12 A                   00100001100001001011


II. From Test B, make a response (RFILE=) file rearranging the items with FORMAT=. Responses unique to Test A are filled with 15 blank responses to dummy items.


; This file is EXAM10B.TXT


TITLE="Analysis of Test B"

RFILE=EXAM10BR.TXT ; The constructed response file for Test B



   ; Blanks are imported from an unused part of the data record to the right!

   ; T100 means "go beyond the end of the data record"

   ; 15A means "get 15 blank spaces"

ITEM1=26  ; Items start in column 26 of reformatted record

CODES=ABCD#  ; Beware of blanks meaning wrong!

KEYFRM=1  ; Key in data record format


Key 1 Record                            CDABCDBDABCADCBDBCAD



BANK 5   TEST B 11



BANK 20  TEST A 20



BANK 35  TEST B 20


Person 01 B                             BDABDDCDBBCCCCDAACBC


Person 12 B                             BADABBADCBADBDBBBBBB


The RFILE= file, EXAM10BR.TXT, is:


Person 01 B                   10111               010101001000100

Person 02 B                   00000               010000000001000


Person 11 B                   00010               001000000000100

Person 12 B                   00000               000101000101000


III. Analyze Test A's and Test B's RFILE='s together:


; This file is EXAM10C.TXT


TITLE="Analysis of Tests A & B (already scored)"


ITEM1=31  ; Items start in column 31 of RFILE=

CODES=01  ; Blanks mean "not in this test"

DATA=EXAM10AR.TXT+EXAM10BR.TXT ; Combine data files


; or, first, at the DOS prompt,


; then, in EXAM10C.TXT,



PFILE=EXAM10CP.TXT ; Person measures for combined tests

IFILE=EXAM10CI.TXT ; Item calibrations for combined tests

tfile=*  ; List of desired tables

3   ; Table 3.1 for summary statistics, 3.2, ...

10   ; Table 10 for item structure


PRCOMP=S  ; Principal components/contrast analysis with standardized residuals


BANK 1   TEST A 3 B 4


BANK 35  TEST B 20



Shortening FORMAT= statements

If the required FORMAT= statement exceeds 512 characters, consider using this technique:


Relocate an entire item response string, but use an IDFILE= to delete the duplicate items, i.e., replace them by blanks. E.g., for Test B, instead of

 FORMAT=(25A, T44,3A,T42,A,T51,A, T100,15A, T41,A,T43,A,T47,4A,T52,9A)



Put Test B as items 21-40 in columns 51 through 70:

 FORMAT=(25A, T44,3A,T42,A,T51,A, T100,15A, T41,20A)



Blank out (delete) the 5 duplicated items with an IDFILE= containing:




Help for Winsteps Rasch Measurement Software: Author: John Michael Linacre
